The Similarity Search Problem Family
Definition of Terms
An atomic data element
A data structure, representing an organisation of elements
A function that, given a target and an arrangement yields a value in <math>]-\infty;0]</math>, where 0 represents perfect similarity.
A set of constraints for the algorithm. Examples:
- Worst-case time complexity
- Worst-case space complexity
A set of constraints concerning the resulting arrangement. Examples:
- An element may be used only once or more than once
- An element may be left out
- The arrangement must yield a similarity higher than some specified value
Similarity Search Algorithm:
An algorithm that, given a multiset of elements and a target, returns an arrangement
Creating a Specific Problem Definition
Creating a specific problem definition involves:
- select types:
- a target data type
- an arrangement data structure type
- domain of legal elements
- define a similarity function
- define arrangement and algorithm constraints
- create an algorithm that, given a specific target and a multiset of elements, returns an arrangement that abides the arrangement and algorithm constraints.
element : character target data type : character string arrangement data structure : character string similarity function : ? algorithm constraints : ? arrangement constraints : similarity must be above a certain value similarity search algorithm: given a multiset of characters and a string, return a string representing an arrangement of the characters that abides the constraints
100 Piece Puzzle
This problem represents solving a 100-piece picture puzzle, potentially with arbitrary pieces that has nothing to do with the target image As there is 100! = 10^158 solutions, we should use an algorithm with lower time complexity than a brute force algorithm.
element : an image of dimensions W/10 x H/10 target data type : an image of dimensions W x H arrangement data structure : a 10 x 10 matrix of numbers representing piece placement similarity function : ? algorithm constraints : ? arrangement constraints : each piece must be used exactly once, and similarity must a above a certain value similarity search algorithm: given a set of 100 images and a string, return a 10 x 10 matrix of the numbers 1-100 representing the placement of the pieces that abides the constraints
GM Sample Matching
Create a GM Midi file that when played on a given sound card, is similar to a given sample.
- Fourier Analysis of sample, and of each GM sound
- Sonology: what is similarity of sound?
(NaÃ¯ve solution: you could subtract the spectrogram images from each other and calculate the intensity of the resulting image?)
Piano Piece Sample Matching
Based on playability constraints (interesting finger-algorithm!), create a (Lilypond) score that, when played on the piano and perhaps time-compressed, is similar to a given sample. Use a function that returns the partial tones of a given piano note for the solution.
Given a 3D object and a multiset of LEGO blocks, make a LEGO figure similar to the object.
The most important and time-consuming aspects of solving a specific similarity search problem are defining the similarity function and creating the algorithm.