Read a gzip file from inside a tar file

Fredrik Lundh fredrik at
Mon Dec 13 22:07:30 CET 2004

Craig Ringer wrote:

>> These are huge files. My goal is to analyze the content of the gzip
>> file in the tar file without having to gunzip it, if that is possible.
> As far as I know, gzip is a stream compression algorithm that can't be
> decompressed in small blocks. That is, I don't think you can seek 500k
> into a 1MB file and decompress the next 100k.


> I'd say you'll have to progressively read the file from the beginning,
> processing and discarding as you go. It looks like a no-brainer to me -
> see zlib.decompressobj.
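
A minimal sketch of that progressive approach, for illustration only (it is not code from this thread). It assumes Python 3 and an archive that tarfile can open; the function name iter_gzip_member and the chunk size are made up. Passing 16 + zlib.MAX_WBITS to zlib.decompressobj asks zlib to handle the gzip header and trailer itself.

```python
import tarfile
import zlib


def iter_gzip_member(tar_path, member_name, chunk_size=64 * 1024):
    """Yield decompressed chunks of a gzip file stored inside a tar archive.

    Hypothetical helper: the member is read and decompressed piece by piece,
    so neither the compressed nor the decompressed data is held in full.
    """
    # 16 + MAX_WBITS selects gzip framing (header and trailer) in zlib.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    with tarfile.open(tar_path, "r") as tar:
        member = tar.extractfile(member_name)
        if member is None:
            raise ValueError("%r is not a regular file in the archive" % member_name)
        while True:
            raw = member.read(chunk_size)
            if not raw:
                break
            data = decoder.decompress(raw)
            if data:
                yield data
        # Emit whatever the decoder still holds once the input is exhausted.
        tail = decoder.flush()
        if tail:
            yield tail
```

Each yielded chunk can be processed and then discarded, for example by looping over iter_gzip_member("archive.tar", "data.gz") (both names are placeholders).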

it can be a bit tricky to set things up properly, though.  here's a piece
of code that uses Python's good old consumer interface to decode gzip data.

you can either use this as is: just create a "target consumer", wrap it in
the gzip consumer, and feed data to the gzip consumer in suitable pieces.

alternatively, hack it until it does what you want.
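
A rough sketch of what such a consumer-style gzip decoder could look like in Python 3. It is not the code from the original post: the class names GzipConsumer and ByteCounter are illustrative, it lets zlib parse the gzip header (wbits = 16 + zlib.MAX_WBITS) rather than doing so by hand, and it assumes a "consumer" is any object with feed(data) and close() methods.

```python
import gzip
import zlib


class GzipConsumer:
    """Decode gzip data fed in pieces and pass the result to another consumer."""

    def __init__(self, consumer):
        self._consumer = consumer
        # 16 + MAX_WBITS selects gzip framing (header and trailer) in zlib.
        self._decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)

    def feed(self, data):
        # Forward whatever can be decoded from the input seen so far.
        out = self._decoder.decompress(data)
        if out:
            self._consumer.feed(out)

    def close(self):
        # Flush any buffered output, then close the wrapped consumer.
        tail = self._decoder.flush()
        if tail:
            self._consumer.feed(tail)
        self._consumer.close()


class ByteCounter:
    """Hypothetical target consumer: counts decoded bytes and newlines."""

    def __init__(self):
        self.size = 0
        self.lines = 0
        self.closed = False

    def feed(self, data):
        self.size += len(data)
        self.lines += data.count(b"\n")

    def close(self):
        self.closed = True


# Usage: wrap the target consumer, then feed compressed data in small pieces.
packed = gzip.compress(b"one\ntwo\nthree\n")
counter = ByteCounter()
gz = GzipConsumer(counter)
for start in range(0, len(packed), 8):
    gz.feed(packed[start:start + 8])
gz.close()
print(counter.size, counter.lines)
```

The decoder never needs the whole file at once, so the pieces fed to it could just as well come from successive reads of a member inside a tar archive.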


