[Patches] [ python-Patches-618135 ] gzip.py and files > 2G

noreply@sourceforge.net noreply@sourceforge.net
Mon, 04 Nov 2002 07:16:14 -0800


Patches item #618135, was opened at 2002-10-03 12:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=618135&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
>Assigned to: A.M. Kuchling (akuchling)
Summary: gzip.py and files > 2G

Initial Comment:
Problem:

Currently, the gzip module cannot handle files that are larger
than 2G uncompressed. The cause is the trailer at the end of a
.gz file, which contains a 32-bit length field. This field
cannot represent a file length of 4G or more. Because of
mixed-type arithmetic in gzip.py, the effective limit is
lowered to 2G.
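
To illustrate the limits described above, here is a minimal sketch of how a
32-bit little-endian length field behaves for a size above 2G. The helper
name pack_isize is hypothetical and is not part of gzip.py. That the field
is read back as a signed value is an assumption made only to show one way a
2G limit can arise.

```python
import struct

def pack_isize(size):
    # Hypothetical helper: keep only the low 32 bits of the size and
    # pack them as a 4-byte little-endian unsigned integer.
    return struct.pack("<L", size & 0xFFFFFFFF)

three_gib = 3 * 2**30
field = pack_isize(three_gib)

# Read back as unsigned: the value survives because it is below 4G.
as_unsigned = struct.unpack("<L", field)[0]

# Read back as signed (assumption): anything above 2G turns negative.
as_signed = struct.unpack("<l", field)[0]

# A size of 4G or more wraps around: 5G and 1G give the same field.
wraps = pack_isize(5 * 2**30) == pack_isize(1 * 2**30)
```

Here as_unsigned equals the original size, as_signed is negative, and wraps
is true, so a plain equality test against the real size can only be trusted
below 2G (signed) or 4G (unsigned).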

Testcase:

python gzip.py <file> # must be > 2G
python gzip.py -d <file.gz> # error

Proposed fix:

Compare the uncompressed data size modulo 4G (2**32) with the
length field in the trailer. A patch implementing this fix is
attached. This is also the solution that gzip itself uses.
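
The idea of the fix can be sketched as follows. This is not the attached
patch, only an illustration. The function name isize_matches and its
parameters are hypothetical. It assumes the trailer field is the 4-byte
little-endian length and that bytes_read is the running count of
uncompressed bytes.

```python
import struct

MASK32 = 0xFFFFFFFF  # 2**32 - 1, i.e. reduce values modulo 4G

def isize_matches(trailer_field, bytes_read):
    # Hypothetical check: read the stored length as unsigned and
    # compare it with the real length reduced modulo 4G.
    stored = struct.unpack("<L", trailer_field)[0]
    return stored == (bytes_read & MASK32)

five_gib = 5 * 2**30
trailer = struct.pack("<L", five_gib & MASK32)

ok = isize_matches(trailer, five_gib)        # sizes agree modulo 4G
bad = isize_matches(trailer, five_gib + 1)   # off by one byte
```

With this comparison a 5G file passes the length check even though the
field itself only holds the low 32 bits of the size.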

Two other remarks:

I don't understand lines 22-23 of gzip.py: why is the 
test: "if value < 0" necessary when writing an unsigned 
int?

The CRC check in GzipFile._read_eof() is done modulo 4G. Is
this necessary? crc32 is just read from the file as a normal
int, and self.crc comes from zlib.crc32, which always returns a
regular int.
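
As a possible angle on the CRC question, the sketch below shows what a
comparison modulo 4G buys: it makes the check independent of whether each
side holds the 32-bit value as signed or unsigned. Whether gzip.py actually
mixes the two representations is an assumption here, not something this
report confirms.

```python
import struct
import zlib

data = b"hello gzip"

# Running CRC, normalised to the unsigned 32-bit range.
crc = zlib.crc32(data) & 0xFFFFFFFF

# The same 32 bits as they would sit in a trailer, read back as a
# signed integer (assumption: one side of the comparison is signed).
field = struct.pack("<L", crc)
signed_view = struct.unpack("<l", field)[0]

# Masking both sides makes the comparison agree in either case.
same = (signed_view & 0xFFFFFFFF) == (crc & 0xFFFFFFFF)
```

If both values were always in the same representation, the masking would
be harmless but redundant.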

Regards,
Geert Jansen

----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2002-10-04 03:36

Message:
Logged In: YES 
user_id=537938

Sorry -- it seems the file upload went wrong! Second try.

----------------------------------------------------------------------
