A fast way to read last line of gzip archive ?

MRAB google at mrabarnett.plus.com
Thu May 21 12:01:53 EDT 2009


Barak, Ron wrote:
> Hi,
>  
> I need to read the end of a 20 MB gzip archive (to extract the date 
> from the last line of a gzipped log file).
> The solution I have below takes noticeable time to reach the end of the 
> gzip archive.
>  
> Does anyone have a faster solution to read the last line of a gzip archive ?
>  
> Thanks,
> Ron.
>  
> #!/usr/bin/env python
>  
> import gzip
>  
> path = "./a/20/mb/file.tgz"
>  
> in_file = gzip.open(path, "r")
> first_line = in_file.readline()
> print "first_line ==",first_line
> in_file.seek(-500)
> last_line = in_file.readlines()[-1]
> print "last_line ==",last_line
> 
It takes a noticeable time to reach the end because, well, the data is
compressed! A gzip stream can only be decompressed from the start, so
reaching the last line means reading all of the preceding data first.
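
Since the whole stream has to be decompressed anyway, the most that can
be saved is memory, not decompression time. A minimal sketch of that,
assuming the file is a plain gzip-compressed text log (the function name
last_line_of_gzip is made up for illustration, and it does not unpack a
tar archive inside the gzip):

```python
import gzip
from collections import deque


def last_line_of_gzip(path):
    """Return the last line of a gzip file as bytes, or None if it is empty."""
    # Every byte is still decompressed from the start of the stream.
    # Iterating line by line only avoids holding all the lines in memory,
    # which is what readlines() would do.
    with gzip.open(path, "rb") as in_file:
        tail = deque(in_file, maxlen=1)  # keeps only the most recent line
    return tail[0] if tail else None
```

Calling last_line_of_gzip(path) with the path from the original script
would return the final line as a byte string, including its trailing
newline if the file has one.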


