imghdr.what failed to identify JPEG

Sun Jun 8 07:18:25 EDT 2003

Steven Taschuk <staschuk at telusplanet.net> wrote in message news:<mailman.1055048975.23465.python-list at python.org>...
> Quoth Arsenal:
> > I have 206 jpg files. And imghdr.what returns 'None' for 52 of them,
> > and I found that actually those 52 files can be viewed perfectly with
> > an image viewer (xnview), and xnview also identified them as JPEG
> > TrueColor(v1.1) in the File Properties.
> > 
> > Is it true that the jpg file format has more than one "identifying
> > signature" which the imghdr.what function didn't yet incorporate, and
> > hence the false negatives?
> 
> That seems a reasonable guess.
> 
> A glance at the source shows that imghdr.what identifies JPEGs
> this way:
>         if h[6:10] == 'JFIF':
> That is, the test is whether bytes 6 through 9 inclusive of the
> file are 'JFIF'.  (As the documentation notes, it identifies JPEGs
> in JFIF format specifically.)
> 
> Do your files have this property?  If not, what are these bytes?
> 
> (My magic(4), for comparison, identifies JPEGs thus:
>     0       beshort         0xffd8          JPEG image data
>     >6      string          JFIF            \b, JFIF standard
>     >6      string          Exif            \b, EXIF standard
> )

Thanks for your info.

Bytes [6:10] of my 52 jpg files are 'Exif'. So if imghdr.what added
this additional check, all of my jpgs would be correctly identified.
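
For anyone hitting the same thing, here is a minimal sketch of that
combined check, written for current Python 3 (bytes rather than str).
The names jpeg_flavor and jpeg_flavor_of_file are made up for
illustration and are not part of imghdr. As an extra assumption on my
part, it also requires the 0xffd8 start-of-image marker, the way the
magic(4) entry quoted above does.

```python
def jpeg_flavor(header):
    """Classify a JPEG from the first bytes of a file.

    Returns 'JFIF' or 'Exif' if bytes 6 through 9 carry that tag and the
    data starts with the JPEG start-of-image marker, otherwise None.
    (Hypothetical helper, not the imghdr implementation.)
    """
    # Every JPEG stream starts with the two-byte SOI marker 0xFFD8.
    if header[:2] != b"\xff\xd8":
        return None
    # Bytes 6..9 hold the identifier of the first application segment.
    tag = header[6:10]
    if tag == b"JFIF":
        return "JFIF"
    if tag == b"Exif":
        return "Exif"
    return None


def jpeg_flavor_of_file(path):
    """Read the first 32 bytes of the file at path and classify them."""
    with open(path, "rb") as f:
        return jpeg_flavor(f.read(32))
```

Calling jpeg_flavor_of_file on each of the 206 files should then report
'Exif' for the 52 that imghdr.what missed, provided they also start
with the 0xffd8 marker.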

Anyway, I have used the Python Imaging Library (PIL) instead, which
supports many more image formats and is pretty fast too.