Find similar images using python
Terry Hancock
hancock at anansispaceworks.com
Fri Mar 31 11:52:50 EST 2006
Thomas W wrote:
>How can I use python to find images that looks quite similar? Thought
>I'd scale the images down to 32x32 and convert it to use a standard
>palette of 256 colors then compare the result pixel for pixel etc, but
>it seems as if this would take a very long time to do when processing
>lots of images.
>
>Any hint/clue on this subject would be appreciated.
>
>
This really depends on what is meant by "quite similar".
If you mean "to the human eye, the two pictures are identical",
as in the case of a tool to get rid of trivially-different duplications,
then you can use the technique you propose. I don't imagine that
you can save any time over that process. You'd use something
like PIL to do the comparisons, of course -- I suspect you want to
do something like:
1) resize both
2) quantize the colors
3) subtract the two images
4) resize to 1x1
5) threshhold the result (i.e. we've used PIL to sum the differences)
strictly speaking, it might be more mathematically ideal to take
the sum of the difference of the squares of the pixels (i.e. compute
chi-square). This of course, avoids the painfully slow process of
comparing pixel-by-pixel in a Python loop, which would, of course
be painfully slow.
This is conceptually equivalent to using an "epsilon" to test "equality"
of floating point numbers.
The more general case of matching images with similar content (but
which would be recognizeably different to the human eye), is a much
more challenging cutting-edge AI problem, as has already been
mentioned -- but I was going to mention imgSeek myself (I see someone's
already given you the link).
More information about the Python-list
mailing list