image matching algorithms

Tue Mar 11 02:20:53 EDT 2008

> | The various free tools differ by their chosen optimization paths and
> | their degree of specialization. My preference would be,
> |
> | 1. Doesn't really matter how long it takes to compute the N numbers per
> | image
>
> Your problem here is that there is really no such thing as 'general
> features' and correspondingly, no such thing as 'general similarity of
> features'.

Yes there are! :) Image manipulation experts defined dozens of ways of
characterizing what 'similarity' means for images and all I was asking
is whether anyone here knew of a simple one.

> The features extracted have to have a specific definition. The
> features represent a severe lossy compression of the original. What to
> keep depends on the application.

Yes, and if you know *any* simple but useful (yes, useful, in *any*
sense) definition, I'd be happy to hear it.

> Example: classify each pixel as white, black, red, green, or blue. Will
> that match your intuitive idea of what matches?

Probably not, but thanks for the idea.
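
For the record, here is a minimal sketch of that five-class idea as I
understand it. The thresholds (64 and 192), the tie-breaking rule and the
function names are my own assumptions, not anything from the suggestion
itself. It works on a plain list of (r, g, b) tuples and returns N = 5
numbers, the fraction of pixels in each class.

```python
from collections import Counter

CLASSES = ("white", "black", "red", "green", "blue")

def classify_pixel(rgb, dark=64, light=192):
    """Assign one pixel to white, black, red, green, or blue.

    The dark/light thresholds are arbitrary assumptions for illustration.
    """
    r, g, b = rgb
    if max(r, g, b) < dark:
        return "black"
    if min(r, g, b) > light:
        return "white"
    # Otherwise pick the strongest channel (ties go to red, then green).
    strongest = max(range(3), key=lambda i: rgb[i])
    return ("red", "green", "blue")[strongest]

def five_color_features(pixels):
    """Return the fraction of pixels in each class, in CLASSES order."""
    pixels = list(pixels)
    counts = Counter(classify_pixel(p) for p in pixels)
    total = len(pixels) or 1  # avoid division by zero on an empty image
    return [counts[c] / total for c in CLASSES]
```

Two images with the same overall colour mix but a completely different
layout get the same five numbers, which is probably why this will not match
my intuition.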

> To be a bit more sophisticated, use more color bins and do the binning
> separately for multiple areas, such as top, left, center, right, and bottom
> (or center, upper right, upper left, lower right, and lower left). I
> suspect Google does something like this to match, for instance, pictures
> with skin tones in the center, or pictures with blue tops (sky?) and green
> bottoms (vegetation?).

Now this sounds like a simple and good idea. I'll try this and see how
far I get.
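
A first sketch of how I read the suggestion, using the second region
layout (four quadrants plus a center box). Everything here is an
assumption on my part: the exact boxes, 4 bins per channel, the 64x64
downscale and all function names. With 4 bins per channel and 5 regions
this gives N = 5 * 4^3 = 320 numbers per image. PIL is only needed in
load_rgb(); the rest works on a flat, row-major list of (r, g, b) tuples.

```python
def load_rgb(path, size=(64, 64)):
    """Load an image with PIL and return (pixels, width, height)."""
    from PIL import Image  # imported lazily so the rest runs without PIL
    img = Image.open(path).convert("RGB").resize(size)
    data = img.tobytes()  # raw bytes: r, g, b, r, g, b, ...
    pixels = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
    return pixels, img.size[0], img.size[1]

def region_boxes(width, height):
    """Five (left, top, right, bottom) boxes: four quadrants and a center."""
    hw, hh = width // 2, height // 2
    qw, qh = width // 4, height // 4
    return {
        "upper_left": (0, 0, hw, hh),
        "upper_right": (hw, 0, width, hh),
        "lower_left": (0, hh, hw, height),
        "lower_right": (hw, hh, width, height),
        "center": (qw, qh, width - qw, height - qh),
    }

def color_histogram(pixels, width, box, bins=4):
    """Normalized colour histogram (bins**3 entries) of one region."""
    left, top, right, bottom = box
    hist = [0] * bins ** 3
    for y in range(top, bottom):
        for x in range(left, right):
            r, g, b = pixels[y * width + x]
            # Map each 0..255 channel value to a bin index 0..bins-1.
            ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
            hist[(ri * bins + gi) * bins + bi] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

def regional_features(pixels, width, height, bins=4):
    """Concatenate the per-region histograms into one feature vector."""
    feats = []
    for box in region_boxes(width, height).values():
        feats.extend(color_histogram(pixels, width, box, bins))
    return feats
```

Usage would be something like regional_features(*load_rgb("photo.jpg")),
where the file name is of course just a placeholder.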

> | 2. Lookups should be fast, consequently N should not be too large (I
> | guess)
> | 3. It should be a generic algorithm working on generic images (everyday
> | photos)
>
> Given feature vectors, there are various ways to calculate a distance or
> similarity coefficient. There have been great debates on what is 'best'.

True. As I've said, *any* concrete and useful example would make me happy.
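
To make this concrete for myself, here are three common choices I am aware
of, written for plain lists of numbers. I am not claiming any of them is
'best'; the function names and the brute-force lookup are just my own
illustration. The lookup is a linear scan over all stored vectors, so its
cost grows with both the number of images and N.

```python
import math

def l1_distance(a, b):
    """Sum of absolute differences (city-block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def histogram_intersection(a, b):
    """Similarity (not distance): 1.0 for identical histograms summing to 1."""
    return sum(min(x, y) for x, y in zip(a, b))

def rank_matches(query, index, distance=l1_distance):
    """Return (name, distance) pairs, closest first.

    index is a dict mapping an image name to its feature vector.
    """
    scored = ((name, distance(query, vec)) for name, vec in index.items())
    return sorted(scored, key=lambda pair: pair[1])
```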

> | 4. PIL should be enough for the implementation
> |
> | So if anyone knows of a good resource that is close to being pseudo
> | code I would be very grateful!
>
> If you do not have sufficient insight into your own idea of 'matches', try
> something on a test set of perhaps 20 photos, calculate a 'match matrix',
> and compare that to your intuition.

Yes, this is what I'll do. The second thing I'll try (after trying
your suggestion) is based on this paper which I found in the meantime:
http://salesin.cs.washington.edu/abstracts.html#MultiresQuery
In case anyone is interested, it describes a multiresolution querying
algorithm and best of all, it has pseudo code for the various steps. I
don't know yet how difficult the implementation will be but so far
this looks the most promising.
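
For the test-set experiment, this is roughly how I plan to build the
'match matrix': one row and one column per photo, each cell holding the
distance between two feature vectors. The names and the choice of the L1
distance are my assumptions; any feature extractor and any distance
function could be plugged in.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def match_matrix(features, distance=l1_distance):
    """Return (names, matrix) where matrix[i][j] = distance(names[i], names[j]).

    features is a dict mapping a photo name to its feature vector.
    """
    names = sorted(features)
    matrix = [[distance(features[a], features[b]) for b in names]
              for a in names]
    return names, matrix

def format_matrix(names, matrix):
    """Render the matrix as plain text, one row per photo."""
    width = max(len(n) for n in names) + 2
    header = " " * width + "".join(n.rjust(width) for n in names)
    rows = [n.ljust(width) + "".join(("%.2f" % v).rjust(width) for v in row)
            for n, row in zip(names, matrix)]
    return "\n".join([header] + rows)
```

Small numbers off the diagonal should then line up with the pairs of photos
I would call similar by eye; if they don't, the features are not capturing
what I mean by a match.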

Cheers,
Daniel