Number of colors in an image

David Bolen db3l at fitlinxx.com
Fri Nov 26 17:18:04 EST 2004


Laszlo Zsolt Nagy <gandalf at geochemsource.com> writes:

> How can I determine the number of colors used in an image? I tried to
> search on Google but I could not figure it out. I read the PIL handbook but I
> do not see how to do it. Can anyone help?

I saw your later post about using an external ImageMagick, but just in
case you do want to try it with PIL, one approach is to get the raw
data from PIL for each band, combining them into a tuple for the color
code, and then constructing a dictionary to count the colors.

For example, for an RGB image:

    import sys
    from PIL import Image
    from itertools import izip, repeat

    if __name__ == "__main__":

        im = Image.open(sys.argv[1])
        r = im.getdata(0)
        g = im.getdata(1)
        b = im.getdata(2)

        # Use a dictionary to count colors by constructing it with an
        # iterator of tuples of (pixel,1).  The inner izip makes each
        # pixel tuple out of the individual band values, and the outer
        # izip combines it with the infinitely repeating 1 to generate
        # the values for the dictionary constructor.
        colors = dict(izip(izip(r,g,b), repeat(1)))

        print 'Number of colors =', len(colors)
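
On Python 3 the same idea needs neither izip nor the print statement,
since the built-in zip() is already lazy.  The following is a minimal
sketch of a port, not the code from the post: the function names are
made up for illustration, a set stands in for the dictionary, and it
assumes a Pillow install whose getdata() still accepts a band index.

```python
def count_rgb_colors(r, g, b):
    """Count distinct (r, g, b) tuples from three per-band sequences."""
    # zip() is lazy in Python 3, so no intermediate list of tuples is
    # built; the set keeps one entry per unique color.
    return len(set(zip(r, g, b)))


def count_rgb_colors_in_file(path):
    """Hypothetical helper: open an RGB image file and count its colors."""
    # Imported here so count_rgb_colors() itself has no PIL dependency.
    from PIL import Image

    im = Image.open(path)
    return count_rgb_colors(im.getdata(0), im.getdata(1), im.getdata(2))
```

Something like count_rgb_colors_in_file(sys.argv[1]) would then play
the role of the script above.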

For a greyscale, or single banded image, it should be faster just to
use the built-in PIL "histogram" method and count the non-zero entries
in the resulting list (the list itself always has one entry per
possible value, used or not).  You could also generalize the above to
handle CMYK too by using getbands() to dynamically work with as many
bands as available.
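
Both suggestions can be sketched together.  This is only an
illustration under a few assumptions: the function names are invented
here, the single-band case is assumed to be an 8-bit mode such as "L",
and histogram() is taken to return one count per possible level, so
the levels actually in use are the non-zero entries.

```python
def count_used_levels(histogram):
    """Count the non-zero bins in a single-band histogram."""
    return sum(1 for count in histogram if count)


def count_colors_any_bands(im):
    """Count distinct pixel values using however many bands im reports."""
    bands = im.getbands()  # e.g. ('L',), ('R', 'G', 'B'), ('C', 'M', 'Y', 'K')
    if len(bands) == 1:
        # Single band: let the library's histogram do the counting.
        return count_used_levels(im.histogram())
    # Multiple bands: one data sequence per band, zipped into pixel tuples.
    per_band = [im.getdata(i) for i in range(len(bands))]
    return len(set(zip(*per_band)))
```

The function only relies on getbands(), getdata() and histogram(), so
any opened PIL image should be usable as the im argument.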

The use of the itertools functions (which does require Python 2.3+)
helps to keep memory usage down in terms of intermediate object
creation, but it'll still be fairly wasteful since the getdata()
calls each return a full sequence of integer values for the image
pixels, and the dictionary will have to have a tuple and value for
each unique color entry.
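
One way to trim the per-color overhead is to store a single integer
per color instead of a tuple plus a value.  The sketch below is an
assumption-laden illustration, not something from the post: the
function name is invented, each band value is assumed to fit in 8
bits, and the explicit Python loop may well cost more time than the
memory it saves.

```python
def count_packed_colors(*bands):
    """Count distinct colors, storing each as one int instead of a tuple."""
    seen = set()
    for values in zip(*bands):
        code = 0
        for value in values:
            # Shift the bands seen so far up by one byte and append this one.
            # Assumes 0 <= value <= 255 for every band.
            code = (code << 8) | value
        seen.add(code)
    return len(seen)
```

It takes the same per-band sequences as before, for example
count_packed_colors(r, g, b).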

On my older PIII-450 desktop with Windows, this processes a 1416x1028
RGB image with 269750 colors in under 12 seconds.  (A 3.2GHz P4
running FreeBSD does the same image in about 2s).  It appears to be
about 5-6 times faster than a simple nested for loop processing each
pixel (via PIL's getpixel()).  Of course, it's nowhere near what some
good C/C++ code could do if it had direct access to the image data.
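
For anyone wanting to repeat the comparison, here is a rough sketch of
the nested-loop baseline plus a small timing helper.  The names are
made up for illustration and no timings are claimed for it.  The
second function assumes a Pillow release that provides
Image.getcolors(), which does the counting inside the library's
compiled code; it was not part of the measurements above.

```python
import time


def count_colors_getpixel(im):
    """Baseline: visit every pixel with getpixel() in a nested loop."""
    width, height = im.size
    seen = set()
    for y in range(height):
        for x in range(width):
            seen.add(im.getpixel((x, y)))
    return len(seen)


def count_colors_getcolors(im):
    """Let the library count the colors via Image.getcolors()."""
    width, height = im.size
    # getcolors() returns None when the image has more than maxcolors
    # colors, so allow as many colors as there are pixels.
    return len(im.getcolors(maxcolors=width * height))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Calling timed(count_colors_getpixel, im) and
timed(count_colors_getcolors, im) on the same opened image gives a
like-for-like comparison on your own machine.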

-- David



