[Python-Dev] Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

Mon Feb 23 21:51:09 CET 2015

On 23 February 2015 at 18:40, Brett Cannon <brett at python.org> wrote:
> Couldn't you just keep it in memory as bytes and then write directly over
> the file? I realize that's a bit wasteful memory-wise but it is possible.
> The docs could mention the memory cost is something to watch out for when
> doing an in-place replacement. Heck the code could even make it an
> io.BytesIO instance so the rest of the code doesn't have to care about this
> special case.

The real problem with overwriting is that if the write fails part-way
through, you lose the original file. My original API had overwrite as
the default, but I think the risk makes that a bad idea.

One option would be to allow outputs (TARGET in pack() and NEW_ARCHIVE
in set_interpreter()) to be open files (open for write in bytes mode)
as well as filenames[1]. Then the caller has the choice of how to
manage the output. The docs could include an example of overwriting
via a BytesIO object, and point out the risk.
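
A rough sketch of what such a docs example might look like. The
set_interpreter_sketch() function is only a hypothetical stand-in for
set_interpreter() (it just swaps the shebang line), and the UTF-8
encoding of the interpreter is an assumption made for the example:

```python
import io


def set_interpreter_sketch(archive, new_archive, interpreter):
    """Hypothetical stand-in: copy `archive` to an open binary file,
    replacing any existing shebang line with one for `interpreter`."""
    with open(archive, "rb") as src:
        first = src.readline()
        if not first.startswith(b"#!"):
            src.seek(0)  # no shebang present, so keep the whole file
        new_archive.write(b"#!" + interpreter.encode("utf-8") + b"\n")
        new_archive.write(src.read())


def overwrite_in_place(archive, interpreter):
    # Build the complete new archive in memory first.
    buf = io.BytesIO()
    set_interpreter_sketch(archive, buf, interpreter)
    # Risk: if this write fails part-way, the original file is lost.
    with open(archive, "wb") as f:
        f.write(buf.getvalue())
```

The source is fully read before the file is reopened for writing, so
the only unsafe window is the final write.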

BTW, while I was looking at the API, I realised I don't like the order
of arguments in pack(). I'm tempted to make it pack(directory,
target=None, interpreter=None, main=None) where a target of None means
"use the name of the source directory with .pyz tacked on", exactly as
for the command line API.
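
To make the proposed default concrete, here is a minimal sketch of how
a target of None could be resolved. The helper name default_target()
is made up for illustration, and the sketch appends ".pyz" to the
directory name literally, which is my reading of "tacked on":

```python
from pathlib import Path


def default_target(directory, target=None):
    """Return `target` if given, else the directory name plus '.pyz'."""
    if target is not None:
        return Path(target)
    source = Path(directory)  # Path() drops any trailing separator
    return Path(str(source) + ".pyz")
```

So default_target("myapp") gives a path named myapp.pyz next to the
source directory, while an explicit target is passed through unchanged.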

What do you think? The change would be no more than a few minutes'
work if it's acceptable.
Paul

[1] What's the standard practice for such dual-mode arguments? ZipFile
tests if the argument is a str instance and assumes a file object if
not. I'd be inclined to follow that practice here.
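
A minimal sketch of that str test applied to an output argument. The
open_output() helper is hypothetical, and leaving caller-supplied file
objects open is an assumption about who should own them:

```python
import contextlib


@contextlib.contextmanager
def open_output(target):
    """Yield a writable binary file for `target`.

    A str is treated as a filename and opened (and closed) here;
    anything else is assumed to be an open binary file owned by the
    caller, so it is yielded as-is and left open.
    """
    if isinstance(target, str):
        with open(target, "wb") as f:
            yield f
    else:
        yield target
```

With this, the writing code can use "with open_output(target) as f:"
without caring which form the caller passed.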