Re: [Python-ideas] Gzip and zip extra field

16 Nov 2013


      29.05.13 16:25, Serhiy Storchaka wrote:
...
Gzip files can contain an extra field [1], and some applications use
it to extend the gzip format. The current GzipFile implementation
ignores this field on input and does not allow creating a new file with
an extra field.
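
To make the layout concrete, here is a minimal sketch that builds one gzip
member by hand with an extra field and reads the raw field back. It assumes
the header layout from [1] (FEXTRA is bit 2 of FLG, followed by a two-byte
little-endian XLEN). The names `make_gzip_member` and `read_extra_bytes` are
hypothetical helpers for this illustration, not part of the gzip module.

```python
import gzip
import struct
import zlib

FEXTRA = 0x04  # bit 2 of the FLG byte


def make_gzip_member(payload, extra=b""):
    """Build one gzip member by hand, optionally with an extra field."""
    flags = FEXTRA if extra else 0
    # ID1, ID2, CM=8 (deflate), FLG, MTIME=0, XFL=0, OS=255 (unknown)
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, flags, 0, 0, 255)
    if extra:
        header += struct.pack("<H", len(extra)) + extra
    # Negative wbits produces a raw deflate stream without any wrapper.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = compressor.compress(payload) + compressor.flush()
    # Trailer: CRC32 and size of the uncompressed data, both modulo 2**32.
    trailer = struct.pack("<II", zlib.crc32(payload) & 0xFFFFFFFF,
                          len(payload) & 0xFFFFFFFF)
    return header + body + trailer


def read_extra_bytes(raw):
    """Return the raw extra field of the first member, or b'' if absent."""
    id1, id2, _method, flags = struct.unpack_from("<BBBB", raw, 0)
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    if not flags & FEXTRA:
        return b""
    # XLEN sits right after the fixed 10-byte header.
    (xlen,) = struct.unpack_from("<H", raw, 10)
    return raw[12:12 + xlen]


# One subfield with ID b"AP"; the three payload bytes are made up.
extra = b"AP" + struct.pack("<H", 3) + b"abc"
member = make_gzip_member(b"hello", extra)
data = gzip.decompress(member)          # the extra field is skipped silently
recovered = read_extra_bytes(member)    # the same bytes that were written
```
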
ZIP file entries can also contain an extra field [2]. Currently it is just
saved as bytes in the `extra` attribute of ZipInfo.
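
For comparison, a small sketch of the existing ZIP behaviour: the bytes
assigned to `ZipInfo.extra` are written with the entry and come back
unparsed. The header ID 0x4242 is an arbitrary value picked for this example
and the payload is made up; each ZIP extra block is a two-byte ID, a two-byte
size, then the data, all little-endian.

```python
import io
import struct
import zipfile

HEADER_ID = 0x4242  # arbitrary ID for illustration only
payload = b"demo"
extra = struct.pack("<HH", HEADER_ID, len(payload)) + payload

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    info = zipfile.ZipInfo("hello.txt")
    info.extra = extra              # raw bytes, no structured access
    zf.writestr(info, b"hello")

with zipfile.ZipFile(buf) as zf:
    stored = zf.getinfo("hello.txt").extra

# Decoding a block is left entirely to the caller.
header_id, size = struct.unpack_from("<HH", stored, 0)
```
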
I propose to save the extra field of a gzip file and to provide structured
access to its subfields.
f = gzip.GzipFile('somefile.gz', 'rb')
f.extra_bytes # A raw extra field as bytes
# iterating over all subfields
for xid, data in f.extra_map.items():
    ...
# get Apollo file type information
f.extra_map[b'AP'] # (or f.extra_map['AP']?)
# creating gzip file with extra field
f = gzip.GzipFile('somefile.gz', 'wb', extra=extrabytes)
f = gzip.GzipFile('somefile.gz', 'wb', extra=[(b'AP', apollodata)])
f = gzip.GzipFile('somefile.gz', 'wb', extra={b'AP': apollodata})
# change Apollo file type information
f.extra_map[b'AP'] = ...
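
One way the mapping between `extra_bytes` and `extra_map` could work is
sketched below. This is only an illustration under the subfield layout from
[1] (two ID bytes, a two-byte little-endian length, then the data). The
function names `parse_extra` and `build_extra` are hypothetical and are not
taken from the patches.

```python
import struct
from collections import OrderedDict


def parse_extra(extra_bytes):
    """Split a raw gzip extra field into an ordered {subfield ID: data} map."""
    subfields = OrderedDict()
    pos = 0
    while pos < len(extra_bytes):
        if pos + 4 > len(extra_bytes):
            raise ValueError("truncated subfield header")
        xid = extra_bytes[pos:pos + 2]
        (length,) = struct.unpack_from("<H", extra_bytes, pos + 2)
        pos += 4
        data = extra_bytes[pos:pos + length]
        if len(data) != length:
            raise ValueError("truncated subfield data")
        subfields[xid] = data  # a repeated ID overwrites the earlier one
        pos += length
    return subfields


def build_extra(subfields):
    """Serialize a mapping or an iterable of (ID, data) pairs to raw bytes."""
    items = subfields.items() if hasattr(subfields, "items") else subfields
    chunks = []
    for xid, data in items:
        if len(xid) != 2:
            raise ValueError("subfield ID must be exactly two bytes")
        chunks.append(bytes(xid) + struct.pack("<H", len(data)) + data)
    raw = b"".join(chunks)
    if len(raw) > 0xFFFF:
        raise ValueError("extra field is limited to 65535 bytes")
    return raw


# b"ZZ" is a made-up second subfield ID; the payloads are placeholders.
raw = build_extra([(b"AP", b"\x01\x02"), (b"ZZ", b"")])
mapping = parse_extra(raw)
```

Accepting both a dict and a list of pairs mirrors the two `extra=` forms
shown above; a dict cannot represent repeated IDs, which is one of the
interface questions.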
Issue #17681 [3] has preliminary patches. There are some open doubts about
the interface. Isn't it over-engineered?
Currently GzipFile supports seamless reading of a sequence of separately
compressed gzip files. Every such chunk can have its own extra field (this
is used in dictzip, for example). It would be desirable to be able to
read only until the end of the current chunk in order not to miss an extra
field.
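
The difference between seamless reading and stopping at the end of a chunk
can be sketched with the zlib module alone. This is not how GzipFile is
implemented; `split_members` is a hypothetical helper that relies on
`decompressobj` stopping at the end of one gzip member and leaving the rest
in `unused_data`.

```python
import gzip
import zlib


def split_members(raw):
    """Yield the decompressed payload of each gzip member in turn."""
    while raw:
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
        payload = decomp.decompress(raw)
        if not decomp.eof:
            raise EOFError("gzip member is truncated")
        yield payload
        # Bytes after this member's trailer start the next member, so its
        # header (and extra field) could be inspected here before decoding.
        raw = decomp.unused_data


stream = gzip.compress(b"first") + gzip.compress(b"second")
joined = gzip.decompress(stream)        # seamless: member boundaries are lost
members = list(split_members(stream))   # one payload per member
```
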
[1] http://www.gzip.org/format.txt
[2] http://www.pkware.com/documents/casestudies/APPNOTE.TXT
[3] http://bugs.python.org/issue17681
Is anyone interested in this feature? It needs bikeshedding.