On Mon, Feb 23, 2015 at 8:22 PM, Ethan Furman wrote:
On 02/23/2015 11:01 AM, Daniel Holth wrote:
On Mon, Feb 23, 2015 at 1:49 PM, Paul Moore wrote:
On 23 February 2015 at 18:40, Brett Cannon wrote:
Couldn't you just keep it in memory as bytes and then write directly over the file? I realize that's a bit wasteful memory-wise but it is possible. The docs could mention the memory cost is something to watch out for when doing an in-place replacement. Heck, the code could even make it an io.BytesIO instance so the rest of the code doesn't have to care about this special case.
I did consider this option, and I still quite like it. In fact, originally I wrote the API to *only* be in-place, until I realised that wouldn't work for things bigger than memory (but who has a Python app that's bigger than RAM?)
I'm happy to modify the API along these lines (details to be thrashed out) if people think it's worthwhile.
Sounds reasonable. It could be done by just reading the entire file contents after the shebang and rewriting them at the necessary offset, all in RAM, truncating the file if necessary, without involving the zipfile module very much. Alternatively, the shebang could have some amount of padding by default, or the file could just be re-compressed in memory, depending on your appetite for complexity.
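A rough sketch of the first option (read everything after the shebang into RAM, write the new shebang, write the body back, truncate). The helper name `replace_shebang_in_place` and the UTF-8 encoding of the interpreter path are assumptions for illustration, not part of any proposed API, and the whole archive is assumed to fit in memory:

```python
def replace_shebang_in_place(path, interpreter):
    """Rewrite the '#!' line of the archive at `path`, buffering the rest in RAM.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Assumption: the interpreter path is encoded as UTF-8.
    new_shebang = b"#!" + interpreter.encode("utf-8") + b"\n"
    with open(path, "r+b") as f:
        has_shebang = f.read(2) == b"#!"
        f.seek(0)
        if has_shebang:
            f.readline()      # skip the existing shebang line
        body = f.read()       # everything after the shebang, all in memory
        f.seek(0)
        f.write(new_shebang)
        f.write(body)
        f.truncate()          # drop leftover bytes if the file got shorter
```

The zip data is copied byte for byte and the zipfile module is never involved, so any offsets stored inside the archive are left exactly as they were.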
This could be a completely stupid question, but how does the zip file know where the individual files are? More to the point, does the index work via relative or absolute offset? If absolute, wouldn't the index have to be rewritten if the zip portion of the file moves?
Yes and no. The ZIP format uses a 'central directory' which is a record of
each file in the archive. The offsets are relative (although the
specification is a little vague on what they're relative *to* when using a
.zip file. The wording talks about disk numbers, ZIP being from the era of
floppy disks.) You find the central directory by searching from the end (or
reading a specific spot at the end, if you don't support archive comments.
zipimport, for example, doesn't support archive comments) and it turns out
you can find the central directory from just that information (and as far
as I know, all tools do.) However, there are still some offsets that would
change if you add stuff to the front of the ZIP file (or remove it), and
some zip tools will complain (usually just in verbose mode, though.)
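
To make that concrete, here is a minimal sketch of locating the central directory from the end of the file and working out how much data has been put in front of the archive. The function name `find_central_directory` is made up for illustration, and the sketch assumes a single-disk archive without ZIP64 extensions and with no stray record signature inside the archive comment:

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"   # end-of-central-directory record marker
EOCD_FIXED_SIZE = 22             # record size without the trailing comment
MAX_COMMENT = 0xFFFF             # the comment length field is 16 bits

def find_central_directory(data):
    """Return (eocd_pos, cd_size, cd_offset, prefix_len) for ZIP bytes `data`.

    Hypothetical helper, illustrative only.
    """
    # Search backwards from the end; the record is the last thing in the
    # file apart from an optional archive comment.
    start = max(0, len(data) - EOCD_FIXED_SIZE - MAX_COMMENT)
    eocd_pos = data.rfind(EOCD_SIGNATURE, start)
    if eocd_pos < 0:
        raise ValueError("no end-of-central-directory record found")
    (_sig, _disk, _cd_disk, _entries_here, _entries_total,
     cd_size, cd_offset, _comment_len) = struct.unpack_from(
        "<4s4H2LH", data, eocd_pos)
    # The central directory ends where this record begins, so it really
    # starts at eocd_pos - cd_size. Whatever that differs from the recorded
    # offset by is the amount of data prepended to the archive.
    prefix_len = eocd_pos - cd_size - cd_offset
    return eocd_pos, cd_size, cd_offset, prefix_len
```

For an untouched archive `prefix_len` is 0. After prepending a shebang line the recorded `cd_offset` is unchanged (and therefore stale), while `prefix_len` equals the length of the prepended data, which is the correction a reader has to apply to the stored offsets.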
--
Thomas Wouters