[Tutor] gzip file close

David Rock david at graniteweb.com
Tue Aug 10 23:45:28 CEST 2004


* Danny Yoo <dyoo at hkn.eecs.berkeley.edu> [2004-08-10 13:02]:
> 
> Hi David,
> 
> 
> The assumption that you're making here is that gzip.open() raises an
> exception on a non-gzipped file, but this might not be true.
> 
> 
> For example:
> 
> ###
> >>> from StringIO import StringIO
> >>> bogusData = StringIO("I am not a zipped file")
> >>> bogusData.seek(0)
> >>>
> >>> import gzip
> >>> unzippedFile = gzip.GzipFile(fileobj=bogusData)
> >>> unzippedFile.read(1)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.3/gzip.py", line 224, in read
>     self._read(readsize)
>   File "/usr/lib/python2.3/gzip.py", line 260, in _read
>     self._read_gzip_header()
>   File "/usr/lib/python2.3/gzip.py", line 161, in _read_gzip_header
>     raise IOError, 'Not a gzipped file'
> IOError: Not a gzipped file
> ###
> 
> 
> So it appears that gzip.GzipFile only starts reading the source file on a
> request for data, and not on open().
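
For anyone on a newer interpreter, here is a rough Python 3 sketch of the same experiment. It assumes io.BytesIO in place of StringIO and catches OSError, which IOError is an alias for in Python 3; the variable names are only illustrative. Building the GzipFile object succeeds, and the error should only show up on the first read:

```python
import gzip
import io

# Plain bytes that are not gzip-compressed (illustrative data).
bogus_data = io.BytesIO(b"I am not a zipped file")

# Constructing the GzipFile succeeds: the gzip header has not been read yet.
unzipped_file = gzip.GzipFile(fileobj=bogus_data)

try:
    # The header check only happens once data is actually requested.
    unzipped_file.read(1)
    failed_on_read = False
except OSError as exc:
    failed_on_read = True
    print("read() failed:", exc)
```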

Looks like that's it. I have added some logic to the script to open the
file and do a quick read to verify whether it is a gzipped file (based on
how the gzip.py module works), so now it looks like this:

    fp_input = gzip.open( inputfile, 'rb' )
    try:
        # Test whether it's a gzipped file. gzip.py actually calls
        # _read_gzip_header with a read(2) to look for the
        # gzip magic info, so this forces a read, which fails
        # on non-gzipped files.
        fp_input.read(5)
    except IOError:
        fp_input.close()
        fp_input = open( inputfile, 'rb' )
    else:
        # This may not be necessary, but I figure it's safer to be
        # sure you're reading from the beginning
        fp_input.rewind()
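
The same idea could be wrapped in a small helper. This is only a sketch written for Python 3, not what the script actually uses: the name open_maybe_gzipped is made up for illustration, it catches OSError (which covers gzip.BadGzipFile on newer versions), and it uses seek(0) to get back to the start of the stream.

```python
import gzip


def open_maybe_gzipped(path):
    """Return a binary file object for path, gunzipping it if needed.

    Hypothetical helper: tries gzip first and falls back to a plain open().
    """
    fp = gzip.open(path, "rb")
    try:
        # Force the gzip header to be parsed; this fails on non-gzipped data.
        fp.read(1)
    except OSError:
        # Not gzip data (or unreadable as gzip): reopen as a plain file.
        fp.close()
        return open(path, "rb")
    # Header was valid, so go back to the start of the decompressed stream.
    fp.seek(0)
    return fp
```

A missing file still raises from gzip.open() itself, outside the try block, so that error is not swallowed. One caveat: any other OSError raised during the test read also triggers the fallback to a plain open().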

Thanks for the insight. It's obvious once you _read_ the traceback ;-)

-- 
David Rock
david at graniteweb.com