How to "gunzip-iterate" over a file?

Robert Kern robert.kern at gmail.com
Wed Jul 29 16:22:43 EDT 2009


On 2009-07-29 15:05, kj wrote:
>
>
> I need to iterate over the lines of *very* large (>1 GB) gzipped
> files.  I would like to do this without having to read the full
> compressed contents into memory so that I can apply zlib.decompress
> to these contents.  I also would like to avoid having to gunzip
> the file (i.e. creating an uncompressed version of the file in the
> filesystem) prior to iterating over it.
>
> Basically I'm looking for something that will give me the same
> functionality as Perl's gzip IO layer, which looks like this (from
> the documentation):
>
>           use PerlIO::gzip;
>           open FOO, "<:gzip", "file.gz" or die $!;
>           print while <FOO>; # And it will be uncompressed...
>
> What's the best way to achieve the same functionality in Python?

http://docs.python.org/library/gzip

import gzip

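# gzip.open returns a file-like object that decompresses as it is read,
# so neither the compressed nor the uncompressed data is loaded all at once.
f = gzip.open('filename.gz')
for line in f:
    print line  # each line keeps its trailing newline
f.close()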
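
On Python 3 the same idea might look like the sketch below. The helper name iter_gzip_lines is hypothetical, and UTF-8 is only an assumption about the file's encoding; adjust it to match the data.

```python
import gzip


def iter_gzip_lines(path, encoding="utf-8"):
    """Yield the lines of a gzip-compressed text file one at a time.

    Hypothetical helper; the default encoding is an assumption.
    """
    # "rt" opens the file in text mode, so each line is a decoded str.
    # The data is decompressed incrementally as the loop reads it.
    with gzip.open(path, "rt", encoding=encoding) as f:
        for line in f:
            # Drop the trailing newline so callers get the bare line.
            yield line.rstrip("\n")
```

It could then be used as "for line in iter_gzip_lines('filename.gz'): print(line)". The with block closes the file even if the loop exits early with an exception.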

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



