## Contents

## The Similarity Search Problem Family

### Definition of Terms

**Element**:

An atomic data element

**Target**:

Anything, really...

**Arrangement**:

A data structure, representing an organisation of elements

**Similarity Function**:

A function that, given a target and an arrangement yields
a value in <math>]-\infty;0]</math>, where 0 represents perfect similarity.

**Algorithm Constraints**:

A set of constraints for the algorithm. Examples:

- Worst-case time complexity
- Worst-case space complexity

**Arrangement Constraints**:

A set of constraints concerning the resulting arrangement. Examples:

- An element may be used only once or more than once
- An element may be left out
- The arrangement must yield a similarity higher than some specified value

**Similarity Search Algorithm**:

An algorithm that, given a multiset of elements and a
target, returns an arrangement

### Creating a Specific Problem Definition

Creating a specific problem definition involves:

- select
**types**:- a target
*data type* - an arrangement data structure
*type* *domain*of legal elements

- a target
- define a
**similarity function** - define arrangement and algorithm
**constraints** - create an
**algorithm**that, given a specific target and a multiset of elements, returns an arrangement that abides the arrangement and algorithm constraints.

## Example Problems

### String Matching

element : character target data type : character string arrangement data structure : character string similarity function : ? algorithm constraints : ? arrangement constraints : similarity must be above a certain value similarity search algorithm: given a multiset of characters and a string, return a string representing an arrangement of the characters that abides the constraints

### 100 Piece Puzzle

This problem represents solving a 100-piece picture puzzle, potentially with arbitrary pieces that has nothing to do with the target image As there is 100! = 10^158 solutions, we should use an algorithm with lower time complexity than a brute force algorithm.

element : an image of dimensions W/10 x H/10 target data type : an image of dimensions W x H arrangement data structure : a 10 x 10 matrix of numbers representing piece placement similarity function : ? algorithm constraints : ? arrangement constraints : each piece must be used exactly once, and similarity must a above a certain value similarity search algorithm: given a set of 100 images and a string, return a 10 x 10 matrix of the numbers 1-100 representing the placement of the pieces that abides the constraints

### GM Sample Matching

Create a GM Midi file that when played on a given sound card, is similar to a given sample.

Subproblems:

- Fourier Analysis of sample, and of each GM sound
- Sonology: what is similarity of sound?

(NaÃ¯ve solution: you could subtract the spectrogram images from each other and calculate the intensity of the resulting image?)

### Piano Piece Sample Matching

Based on playability constraints (interesting finger-algorithm!), create a (Lilypond) score that, when played on the piano and perhaps time-compressed, is similar to a given sample. Use a function that returns the partial tones of a given piano note for the solution.

### LEGO

Given a 3D object and a multiset of LEGO blocks, make a LEGO figure similar to the object.

## Solutions

The most important and time-consuming aspects of solving a specific similarity search
problem are defining the **similarity function** and creating the **algorithm**.