[Image-SIG] Reading images from a TAR file

Christopher Brooks cab938 at mail.usask.ca
Tue Nov 18 20:52:23 CET 2008

I've got tar files filled with images that I'm trying to read with PIL.  I'm
using the following snippet:

if file.endswith( ".tar" ):
    # Read image files out of tar and into memory
    tar = tarfile.open( file , "r:" )
    item = []  # hold a link to a file object
    for tarinfo in tar:
        fileobj = tar.extractfile( tarinfo )
        im = Image.open( fileobj )
        item.append( im )

This doesn't work; I get an exception:

Traceback (most recent call last):
  File "ImageWorker.py", line 84, in <module>
    results = process( source_files , commands )
  File "ImageWorker.py", line 28, in process
    im = Image.open( fileobj )
  File "/usr/lib/python2.5/site-packages/PIL/Image.py", line 1917, in open
    raise IOError("cannot identify image file")
IOError: cannot identify image file

This is odd, because the docs for Image.open say:

"You can use either a string (representing the filename) or a file object.
In the latter case, the file object must implement read, seek, and tell
methods, and be opened in binary mode."

And the docs for TarFile.extractfile say:

"Note: The file-like object is read-only and provides the following methods:
read(), readline(), readlines(), seek(), tell()."
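
One way to test that claim is to inspect the objects tarfile actually hands
back. The sketch below is illustrative only: it assumes a modern Python (for
the `with` statement on tarfile), and the helper names missing_methods and
check_tar_members are made up for this example. Note that extractfile()
returns None for members that are not regular files or links (directories,
for example), so those are reported as None rather than as a method list.

```python
import tarfile

# Methods the Image.open docs say a file object must provide.
REQUIRED = ("read", "seek", "tell")


def missing_methods(fileobj, required=REQUIRED):
    """Return the names in `required` that fileobj lacks as callables."""
    return [name for name in required
            if not callable(getattr(fileobj, name, None))]


def check_tar_members(path):
    """Map each member name to its missing methods.

    The value is None when extractfile() returned no file object at all,
    which happens for directories and other non-file members.
    """
    report = {}
    with tarfile.open(path, "r:") as tar:
        for tarinfo in tar:
            fileobj = tar.extractfile(tarinfo)
            if fileobj is None:
                report[tarinfo.name] = None
            else:
                report[tarinfo.name] = missing_methods(fileobj)
    return report
```

An empty list for a member means the documented contract is met on paper; it
does not prove that PIL can decode the data behind it.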

So it seems that both contracts are met.  A little bit of googling suggested
that the TarIO module might be useful, but I can't seem to find it on the PIL
module reference page.

My questions are thus:

1. What's the best practice for reading images out of a tar file with PIL
and into in-memory Image objects?
2. Which of Image.open() or TarFile.extractfile() is documented incorrectly,
and where would I file this as a bug if it is PIL (even if it's just a bug
in PIL's documentation)?
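
For question 1, one possible approach (a sketch, not a confirmed fix for the
error above) is to copy each regular member's bytes into an io.BytesIO buffer
and hand that to Image.open(), so PIL only ever sees a plain in-memory file
object. It assumes a modern Python and that PIL (or the Pillow fork) is
importable as PIL; the function names iter_tar_buffers and load_images are
hypothetical.

```python
import io
import tarfile


def iter_tar_buffers(path):
    """Yield (member name, BytesIO) for each regular file in the tar."""
    with tarfile.open(path, "r:") as tar:
        for tarinfo in tar:
            # Skip directories, links and other non-regular members.
            if not tarinfo.isfile():
                continue
            fileobj = tar.extractfile(tarinfo)
            # Copy the member's bytes so the buffer outlives the tar handle.
            yield tarinfo.name, io.BytesIO(fileobj.read())


def load_images(path):
    """Open every regular member as an image (assumes PIL is installed)."""
    from PIL import Image  # imported here so the helper above works without PIL

    images = []
    for name, buf in iter_tar_buffers(path):
        im = Image.open(buf)
        im.load()  # force decoding now, while the buffer is known to be valid
        images.append(im)
    return images
```

Members that are not images would still raise in Image.open(), so a real
script would probably want to catch that error per member or filter by name.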

Thanks for any thoughts,


Christopher Brooks, MSc.
Web: http://www.cs.usask.ca/~cab938
Mail: Advanced Research in Intelligent Educational Systems Laboratory
      Department of Computer Science
      University of Saskatchewan
      176 Thorvaldson Building
      110 Science Place
      Saskatoon, SK
      S7N 5C9
