On Wed, Apr 14, 2021 at 5:00 AM Joachim Wuttke <j.wuttke@fz-juelich.de> wrote:
gzip compression, using class GzipFile from gzip.py, by default
inserts a timestamp into the compressed stream. If the optional
argument `mtime` is absent or None, then the current time is used [1].

This makes outputs non-deterministic, which can badly confuse
unsuspecting users: If you run "diff" over two outputs to see
whether they are unaffected by changes in your application,
then you would not expect that the *.gz binaries differ just
because they were created at different times.
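
To make the effect concrete, here is a minimal sketch that compresses the same payload twice and compares the results. It passes two explicit, arbitrary `mtime` values to stand in for "two different creation times" so that the outcome does not depend on the clock; the helper names `gzip_bytes` and `header_mtime` are illustrative only and not part of gzip.py.

```python
import gzip
import io
import struct


def gzip_bytes(payload, mtime=None):
    """Compress payload in memory; mtime is passed straight to GzipFile."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(payload)
    return buf.getvalue()


def header_mtime(blob):
    """Return the MTIME header field (bytes 4..7, little-endian)."""
    return struct.unpack("<I", blob[4:8])[0]


data = b"same payload\n"

# Two arbitrary timestamps, one second apart, simulate two separate runs.
a = gzip_bytes(data, mtime=1_000_000_000)
b = gzip_bytes(data, mtime=1_000_000_001)

print(a == b)                            # the streams differ ...
print(a[:4] == b[:4], a[8:] == b[8:])    # ... but only in the MTIME bytes
print(header_mtime(a), header_mtime(b))
```

Both streams decompress to the same payload; a byte-wise "diff" still reports them as different because of the four timestamp bytes in the header.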

I'd propose to introduce a new constant `NO_TIMESTAMP` as
a possible value of `mtime`.

Furthermore, if policy about API changes allows, I'd suggest
that `NO_TIMESTAMP` become the new default value for `mtime`.
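
A rough sketch of how the two suggestions above could look from user code, under the assumption that `NO_TIMESTAMP` would simply be the integer 0 (RFC 1952 defines MTIME = 0 as "no time stamp is available"). Neither the constant nor the wrapper `compress_reproducible` exists in the gzip module; both are hypothetical names used only for illustration.

```python
import gzip
import io

# Hypothetical constant, NOT part of the gzip module.
# Assumption: it maps to 0, the "no time stamp available" value of RFC 1952.
NO_TIMESTAMP = 0


def compress_reproducible(payload, mtime=NO_TIMESTAMP):
    """Hypothetical wrapper whose default mtime is NO_TIMESTAMP instead of None."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(payload)
    return buf.getvalue()


payload = b"same payload\n"
first = compress_reproducible(payload)
second = compress_reproducible(payload)

print(first == second)   # identical output, regardless of when each call runs
print(first[4:8])        # the MTIME field is all zero bytes
```

With the existing API the same result is available today by passing `mtime=0` explicitly; the sketch only illustrates what a named constant and a changed default would mean for callers.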

How to proceed from here? Is this the kind of proposal that
has to go through a PEP?

For something like this you would open an issue and see if a core developer is intrigued enough to work with you to see the change occur; no PEP is necessary.

-Brett
 

- Joachim

[1]
https://github.com/python/cpython/blob/6f1e8ccffa5b1272a36a35405d3c4e4bbba0c082/Lib/gzip.py#L163

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OTUGLATLYB736SAPPRWSSXWAKM5JHWZN/