A fast way to read last line of gzip archive ?
MRAB
google at mrabarnett.plus.com
Thu May 21 12:01:53 EDT 2009
Barak, Ron wrote:
> Hi,
>
> I need to read the end of a 20 MB gzip archive (to extract the date
> from the last line of a gzipped log file).
> The solution I have below takes noticeable time to reach the end of the
> gzip archive.
>
> Does anyone have a faster solution to read the last line of a gzip archive ?
>
> Thanks,
> Ron.
>
> #!/usr/bin/env python
>
> import gzip
>
> path = "./a/20/mb/file.tgz"
>
> in_file = gzip.open(path, "r")
> first_line = in_file.readline()
> print "first_line ==",first_line
> in_file.seek(-500)
> last_line = in_file.readlines()[-1]
> print "last_line ==",last_line
>
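It takes a noticeable time to reach the end because, well, the data is
compressed! A gzip stream has no index, and the compression method used
(DEFLATE) requires the preceding data to be decompressed before any later
data can be read, so there is no way to jump straight to the last line.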
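
Since the whole stream has to be decompressed anyway, about the only thing
that can be saved is the memory and list-building overhead of readlines().
Below is a minimal sketch of that idea, assuming Python 3 and a plain gzip
file of newline-terminated text; the helper name last_line_of_gzip is made
up for illustration. It does not avoid the decompression cost, so treat it
as a tidier version of the original approach rather than a real speed-up.

```python
import gzip
from collections import deque


def last_line_of_gzip(path):
    """Return the last line of a gzip file as bytes, or b"" if it is empty."""
    # The stream is still decompressed from start to finish; deque(maxlen=1)
    # just keeps only the most recent line instead of all of them.
    with gzip.open(path, "rb") as in_file:
        tail = deque(in_file, maxlen=1)
    return tail[0] if tail else b""
```

Called as last_line_of_gzip(path), it returns the final line as bytes
(including its trailing newline, if any), ready to have the date parsed
out of it.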