[Python-ideas] Gzip and zip extra field
Serhiy Storchaka
storchaka at gmail.com
Sat Nov 16 21:58:32 CET 2013
29.05.13 16:25, Serhiy Storchaka wrote:
> Gzip files can contain an extra field [1], and some applications use
> it to extend the gzip format. The current GzipFile implementation
> ignores this field on input and does not allow creating a new file with
> an extra field.
>
> ZIP file entries can also contain an extra field [2]. Currently it is just
> saved as bytes in the `extra` attribute of ZipInfo.
>
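To illustrate what structured access to the ZIP extra field could look like on top of the existing `extra` bytes, here is a minimal sketch. It assumes the layout from APPNOTE [2]: each subfield is a 2-byte little-endian header ID, a 2-byte little-endian size, then the data. The helper name `parse_zip_extra`, the header ID 0xCAFE and the payload are made up for this example and are not part of zipfile.

```python
import io
import struct
import zipfile


def parse_zip_extra(extra):
    """Split a raw ZIP extra field into (header_id, data) pairs."""
    subfields = []
    pos = 0
    while pos + 4 <= len(extra):
        # Each subfield starts with a header ID and a data size, both uint16 LE.
        header_id, size = struct.unpack_from('<HH', extra, pos)
        pos += 4
        subfields.append((header_id, extra[pos:pos + size]))
        pos += size
    return subfields


# Build a small archive in memory whose single entry carries one subfield.
# The header ID 0xCAFE and the payload are hypothetical.
info = zipfile.ZipInfo('hello.txt')
info.extra = struct.pack('<HH', 0xCAFE, 4) + b'demo'
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr(info, b'hello')

# Read the entry back and decode the raw bytes stored in ZipInfo.extra.
with zipfile.ZipFile(buf) as zf:
    zip_subfields = parse_zip_extra(zf.getinfo('hello.txt').extra)
```

The sketch silently ignores trailing bytes that are too short to form a subfield header; a real implementation would probably want to report them.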
> I propose saving the extra field of a gzip file and providing structured
> access to its subfields.
>
> f = gzip.GzipFile('somefile.gz', 'rb')
> f.extra_bytes  # the raw extra field as bytes
> # iterating over all subfields
> for xid, data in f.extra_map.items():
>     ...
> # get Apollo file type information
> f.extra_map[b'AP']  # (or f.extra_map['AP']?)
> # creating a gzip file with an extra field
> f = gzip.GzipFile('somefile.gz', 'wb', extra=extrabytes)
> f = gzip.GzipFile('somefile.gz', 'wb', extra=[(b'AP', apollodata)])
> f = gzip.GzipFile('somefile.gz', 'wb', extra={b'AP': apollodata})
> # change Apollo file type information
> f.extra_map[b'AP'] = ...
>
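Since the interface above does not exist yet, the following is only a sketch of what can be done today with struct and zlib, assuming the header layout from the gzip format description [1]: a 10-byte fixed header, then a 2-byte little-endian XLEN and the extra field when the FEXTRA flag (0x04) is set. Each subfield is a 2-byte ID, a 2-byte little-endian length, then the data. All function names here are hypothetical, and the b'AP' payload is a placeholder rather than real Apollo data.

```python
import gzip
import struct
import zlib

FEXTRA = 0x04  # header flag bit: an extra field is present


def pack_subfields(subfields):
    """Encode (xid, data) pairs as a raw gzip extra field."""
    return b''.join(xid + struct.pack('<H', len(data)) + data
                    for xid, data in subfields)


def parse_subfields(extra):
    """Decode a raw gzip extra field into (xid, data) pairs."""
    subfields = []
    pos = 0
    while pos + 4 <= len(extra):
        xid = extra[pos:pos + 2]
        (size,) = struct.unpack_from('<H', extra, pos + 2)
        subfields.append((xid, extra[pos + 4:pos + 4 + size]))
        pos += 4 + size
    return subfields


def make_gzip_member(payload, extra=b''):
    """Build one gzip member by hand, optionally with an extra field."""
    flags = FEXTRA if extra else 0
    # ID1, ID2, CM=8 (deflate), FLG, MTIME=0, XFL=0, OS=255 (unknown)
    header = struct.pack('<BBBBLBB', 0x1f, 0x8b, 8, flags, 0, 0, 255)
    if extra:
        header += struct.pack('<H', len(extra)) + extra
    compressor = zlib.compressobj(wbits=-15)  # raw deflate, no zlib wrapper
    body = compressor.compress(payload) + compressor.flush()
    trailer = struct.pack('<LL', zlib.crc32(payload), len(payload) & 0xFFFFFFFF)
    return header + body + trailer


def read_gzip_extra(data):
    """Return the raw extra field of the first gzip member in data."""
    magic, method, flags = struct.unpack_from('<HBB', data, 0)
    if magic != 0x8b1f or method != 8:
        raise ValueError('not a gzip stream')
    if not flags & FEXTRA:
        return b''
    (xlen,) = struct.unpack_from('<H', data, 10)
    return data[12:12 + xlen]


# Placeholder payload for the b'AP' subfield.
member = make_gzip_member(b'hello world', pack_subfields([(b'AP', b'\x01\x02')]))
extra_subfields = parse_subfields(read_gzip_extra(member))
roundtrip = gzip.decompress(member)  # the gzip module skips the extra field
```

A list of pairs is used rather than a dict because nothing in the sketch prevents the same subfield ID from appearing twice; whether a mapping is the right interface is exactly the open question.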
> Issue #17681 [3] has preliminary patches. There are still open questions
> about the interface. Isn't it over-engineered?
>
> Currently GzipFile supports seamlessly reading a sequence of separately
> compressed gzip files. Each such chunk can have its own extra field (this
> is used by dictzip, for example). It would be desirable to be able to
> read only up to the end of the current chunk so as not to miss an extra
> field.
>
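Reading one chunk at a time can already be approximated outside GzipFile. The sketch below is not the proposed interface: it assumes the whole stream is in memory and uses a zlib decompression object in gzip mode (wbits=31), which stops at the end of the current member and leaves the rest in `unused_data`. The name `iter_gzip_members` is hypothetical.

```python
import gzip
import zlib


def iter_gzip_members(data):
    """Yield the decompressed payload of each gzip member in data, one by one."""
    while data:
        # wbits=31 (16 + 15) makes zlib expect a gzip header and trailer.
        decomp = zlib.decompressobj(wbits=31)
        payload = decomp.decompress(data)
        if not decomp.eof:
            raise ValueError('truncated gzip member')
        yield payload
        # Bytes after the end of this member belong to the next one.
        data = decomp.unused_data


# Two separately compressed chunks concatenated into one stream.
stream = gzip.compress(b'first ') + gzip.compress(b'second')
chunks = list(iter_gzip_members(stream))
whole = gzip.decompress(stream)  # seamless reading joins the chunks
```

Because the caller regains control at each member boundary, it could inspect the header of the next member, including its extra field, before decompressing further.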
> [1] http://www.gzip.org/format.txt
> [2] http://www.pkware.com/documents/casestudies/APPNOTE.TXT
> [3] http://bugs.python.org/issue17681
Is anyone interested in this feature? It needs bikeshedding.