Similarity Search

Definition of Terms
Element: An atomic data element

Target: Anything, really...

Arrangement: A data structure, representing an organisation of elements

Similarity Function: A function that, given a target and an arrangement yields a value in $$]-\infty;0]$$, where 0 represents perfect similarity.

Algorithm Constraints: A set of constraints for the algorithm. Examples:
 * Worst-case time complexity
 * Worst-case space complexity

Arrangement Constraints: A set of constraints concerning the resulting arrangement. Examples:
 * An element may be used only once or more than once
 * An element may be left out
 * The arrangement must yield a similarity higher than some specified value

Similarity Search Algorithm: An algorithm that, given a multiset of elements and a target, returns an arrangement

Creating a Specific Problem Definition
Creating a specific problem definition involves:


 * 1) select types:
 * 2) a target data type
 * 3) an arrangement data structure type
 * 4) domain of legal elements
 * 5) define a similarity function
 * 6) define arrangement and algorithm constraints
 * 7) create an algorithm that, given a specific target and a multiset of elements, returns an arrangement that abides the arrangement and algorithm constraints.

String Matching
element                   : character target data type          : character string arrangement data structure : character string similarity function       : ? algorithm constraints     : ? arrangement constraints   : similarity must be above a certain value similarity search algorithm: given a multiset of characters and a string, return a string representing an arrangement of the characters that abides the constraints

100 Piece Puzzle
This problem represents solving a 100-piece picture puzzle, potentially with arbitrary pieces that has nothing to do with the target image As there is 100! = 10^158 solutions, we should use an algorithm with lower time complexity than a brute force algorithm. element                   : an image of dimensions W/10 x H/10 target data type          : an image of dimensions W x H arrangement data structure : a 10 x 10 matrix of numbers representing piece placement similarity function       : ? algorithm constraints     : ? arrangement constraints   : each piece must be used exactly once, and similarity must a above a certain value similarity search algorithm: given a set of 100 images and a string, return a 10 x 10 matrix of the numbers 1-100 representing the placement of the pieces that abides the constraints

GM Sample Matching
Create a GM Midi file that when played on a given sound card, is similar to a given sample.

Subproblems: (NaÃ¯ve solution: you could subtract the spectrogram images from each other and calculate the intensity of the resulting image?)
 * Fourier Analysis of sample, and of each GM sound
 * Sonology: what is similarity of sound?

Piano Piece Sample Matching
Based on playability constraints (interesting finger-algorithm!), create a (Lilypond) score that, when played on the piano and perhaps time-compressed, is similar to a given sample. Use a function that returns the partial tones of a given piano note for the solution.

LEGO
Given a 3D object and a multiset of LEGO blocks, make a LEGO figure similar to the object.

Solutions
The most important and time-consuming aspects of solving a specific similarity search problem are defining the similarity function and creating the algorithm.