[Image-SIG] Determine byte offset of JPEG SOI marker

Franz Buchinger fbuchinger at gmail.com
Sun Feb 1 18:41:43 CET 2009


I think I made a conceptual mistake: I should look for SOS (ffda,
start of scan), not SOI (ffd8, start of image).

Unfortunately, locating this marker is not so easy, because many
JPEG files from digital still cameras have a thumbnail embedded in one
of their header segments. As this thumbnail image is a JPEG file
itself, a plain byte search for ffda will hit the thumbnail's SOS
before the SOS of the main image.

I observed that PIL just reads until SOS when I open a JPEG image. I
think the easiest way of retrieving the SOS marker offset would be
accessing PIL's internal data structures. Any hints on that?
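
Independent of PIL, one way around the thumbnail problem is to walk the
segment chain instead of searching for bytes: every header segment
declares its own length, so an APPn segment holding a thumbnail can be
jumped over as a whole. Below is a minimal sketch of that idea, not
tested against real camera files. The names find_sos_offset and
image_data_digest are made up for illustration and are not part of PIL,
and SHA-1 is just one possible choice of hash.

```python
import hashlib
import struct

# Markers with no length field after them:
# TEM (0x01), RST0-RST7 (0xD0-0xD7), SOI (0xD8), EOI (0xD9).
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def find_sos_offset(data):
    """Return the byte offset of the main image's SOS marker (ff da).

    Hypothetical helper: skips each header segment by its declared
    length, so a thumbnail stored inside an APPn segment is never scanned.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("data does not start with a JPEG SOI marker")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos:pos + 1] != b"\xff":
            raise ValueError("expected a marker at offset %d" % pos)
        marker = ord(data[pos + 1:pos + 2])
        if marker == 0xFF:
            # Fill byte in front of a marker: advance by one and retry.
            pos += 1
            continue
        if marker == 0xDA:
            return pos
        if marker in STANDALONE:
            pos += 2
            continue
        if pos + 4 > len(data):
            break
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError("invalid segment length at offset %d" % pos)
        pos += 2 + length
    raise ValueError("no SOS marker found")


def image_data_digest(data):
    """Hash everything from the main SOS marker to the end of the data."""
    return hashlib.sha1(data[find_sos_offset(data):]).hexdigest()
```

Reading the whole file with open(jpg_filename, "rb") and passing the
bytes to image_data_digest would then give the checksum. Note that the
quantization and Huffman tables sit in segments before SOS, so they are
not covered by this digest, and any bytes a tool appends after EOI are.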

Greetings,

Franz

2009/2/1 Ned Batchelder <ned at nedbatchelder.com>:
> In the files I'm looking at, the SOI marker (ffd8) is at offset 0.  But
> assuming you have some where it is not, why not just open the file, and look
> for the marker?:
>
> with open(jpg_filename, "rb") as f:
>     head = f.read(4000)
> soi_offset = head.find(b'\xff\xd8')  # -1 if not in the first 4000 bytes
>
> You're going to have to open and read the file to compute the checksum
> anyway...
>
> --Ned.
> http://nedbatchelder.com
>
> Franz Buchinger wrote:
>
> Hi,
>
> I want to implement some sort of "post header checksum" for JPEG
> images, i.e. the checksum should only change if the actual image data
> was altered; EXIF/IPTC metadata modifications should have no effect.
> My plan to do this is to scan for the JPEG SOI marker, read n bytes
> after this marker and calculate an MD5/SHA checksum for this.
>
> Can PIL tell me the byte offset of this marker?
>
> Greetings,
>
> Franz
> _______________________________________________
> Image-SIG maillist  -  Image-SIG at python.org
> http://mail.python.org/mailman/listinfo/image-sig
>
>
>
>
>
> --
> Ned Batchelder, http://nedbatchelder.com
>