[Tutor] use gzip with large files

frank h. frank.hoffsummer at gmail.com
Tue Jul 19 16:15:46 CEST 2005


Hello all,
I am trying to write a script in Python that parses a gzipped logfile.

The unzipped logfiles can be very large (>2GB).

Basically, the statements

import gzip

file = gzip.GzipFile(logfile)
data = file.read()  # reads the whole decompressed file into memory

for line in data.splitlines():
    ....


would do what I want, but this is not feasible because the gzip files
are so huge.

So I call file.readline() in a for loop, but I have no idea how long to
continue, because I don't know how many lines the files contain. How do
I check for end of file when using readline()?
Do I simply put it in a while loop and enclose it in try: except:?

What would be the best (fastest) approach to deal with such large gzip
files in Python?
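
To make the question concrete, here is a minimal sketch of the line-at-a-time reading I have in mind. It assumes a gzip file object behaves like an ordinary file object: readline() returns an empty string at end of file, and the object can also be iterated directly. The function names and the path argument are hypothetical, and gzip.open with mode "rt" assumes a Python version that supports text mode for gzip.

```python
import gzip


def count_lines_readline(path):
    """Count lines using readline(), stopping at end of file."""
    count = 0
    with gzip.open(path, "rt") as f:
        while True:
            line = f.readline()
            if not line:  # an empty string means end of file (assumption stated above)
                break
            count += 1
    return count


def count_lines_iter(path):
    """Count lines by iterating over the file object directly."""
    count = 0
    with gzip.open(path, "rt") as f:
        for line in f:  # yields one decompressed line at a time
            count += 1
    return count
```

Neither version holds the whole decompressed file in memory at once; the line counting only stands in for the real parsing.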

thanks

