Mailman 3 Fwd: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support - Python-Dev


Fwd: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

Guido van Rossum

24 Feb 2015

10:58 a.m.

[Sorry, accidentally dropped the list from this message.]

Here's my review. I really like where this is going but I have a few questions and suggestions (I can't help myself :-).

[I sneaked a peek at the update you sent to peps@python.org.]

"Currently, pyyzer [5] and pex [6] are two tools known to exist." -> "... are two such tools."

It's not stated whether the archive names include the .pyz[w] extension or not (though I'm guessing it's not -- this should be stated).

The naming of the functions feels inconsistent -- maybe pack(directory, target) -> create_archive(directory, archive), and set_interpreter() -> copy_archive(archive, new_archive)?

Why no command-line equivalent for the other two methods? I propose the following interface: if there's only one positional argument, we're asking to print its shebang line; if there are two and the input position is an archive instead of a directory, we're copying. (In the future people will want an option to print more stuff, e.g. the main function or even a full listing.)

I've not seen the pkg.mod:fn notation before. Where is this taken from? Why can't it be pkg.mod.fn?

I'd specify that when the output argument is a file open for writing, it is the caller's responsibility to close the file. Also, can the file be a pipe? (I.e. are we using seek()/tell() or not?) And what about the input archive? Can that be a file open for reading?

--
--Guido van Rossum (python.org/~guido)


Daniel Holth

24 Feb

11:23 a.m.

On Tue, Feb 24, 2015 at 1:58 PM, Guido van Rossum <guido@python.org> wrote:

...

[Sorry, accidentally dropped the list from this message.]

Here's my review. I really like where this is going but I have a few questions and suggestions (I can't help myself :-).

[I sneaked a peek at the update you sent to peps@python.org.]

"Currently, pyyzer [5] and pex [6] are two tools known to exist." -> "... are two such tools."

It's not stated whether the archive names include the .pyz[w] extension or not (though I'm guessing it's not -- this should be stated).

The naming of the functions feels inconsistent -- maybe pack(directory, target) -> create_archive(directory, archive), and set_interpreter() -> copy_archive(archive, new_archive)?

Why no command-line equivalent for the other two methods? I propose the following interface: if there's only one positional argument, we're asking to print its shebang line; if there are two and the input position is an archive instead of a directory, we're copying. (In the future people will want an option to print more stuff, e.g. the main function or even a full listing.)

I've not seen the pkg.mod:fn notation before. Where is this taken from? Why can't it be pkg.mod.fn?

Translates to import pkg.mod; pkg.mod.fn() with no exception handling to figure out which part is importable. pkg.mod:ob.prop.fn would turn into import pkg.mod; pkg.mod.ob.prop.fn()
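The translation described above can be sketched in a few lines. This is only an illustration of the notation, not how the zipapp implementation does it; `resolve_entry_point` is a hypothetical name. The part before the colon is imported as a module, and the part after it is walked attribute by attribute, so no guessing about which prefix is importable is needed.

```python
import importlib
from functools import reduce


def resolve_entry_point(spec):
    """Resolve a 'pkg.mod:ob.prop.fn' string to the object it names.

    Hypothetical helper for illustration only.
    """
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError("expected 'pkg.mod:callable', got {!r}".format(spec))
    # Everything before the colon must be importable as a module.
    module = importlib.import_module(module_name)
    # Everything after the colon is plain attribute access on that module.
    return reduce(getattr, attr_path.split("."), module)


if __name__ == "__main__":
    join = resolve_entry_point("os.path:join")
    print(join("a", "b"))
```

With a plain dotted form such as pkg.mod.fn, the same helper would have to try imports of successively shorter prefixes and catch the failures, which is the exception handling the colon avoids.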

...

I'd specify that when the output argument is a file open for writing, it is the caller's responsibility to close the file. Also, can the file be a pipe? (I.e. are we using seek()/tell() or not?) And what about the input archive? Can that be a file open for reading?

It seems like the very next thing I would want after trying pack() would be to pass a callback that returns True iff a file should be included in the archive. After that I might just want a ZipFile subclass or a regular ZipFile to which I could add my own files? "return ZipFile with shebang already filled in". It's hard for me to say where the boundary between the convenience API and re-implementing this simple thing yourself if you have complex needs should be.

Paul Moore

11:33 a.m.

On 24 February 2015 at 19:23, Daniel Holth <dholth@gmail.com> wrote:

...

...
I'd specify that when the output argument is a file open for writing, it is the caller's responsibility to close the file. Also, can the file be a pipe? (I.e. are we using seek()/tell() or not?) And what about the input archive? Can that be a file open for reading?

It seems like the very next thing I would want after trying pack() would be to pass a callback that returns True iff a file should be included in the archive. After that I might just want a ZipFile subclass or a regular ZipFile to which I could add my own files? "return ZipFile with shebang already filled in". It's hard for me to say where the boundary between the convenience API and re-implementing this simple thing yourself if you have complex needs should be.

Yes, it's a slippery slope. The whole API is a pretty thin wrapper over a ZipFile object. I'd rather keep it to the most basic requirements, and defer anything even slightly complicated to the ZipFile API. The one exception is the set_interpreter/get_interpreter APIs, which are messy to do with ZipFile, and a pain to do "by hand" (because working with part-text, part-binary files is just naturally messy).

It would be possible to write up how easy it is to create a pyz file by hand using the zipfile module, but doing so would (IMO) lose the nice simple message of this PEP - "use zipapp to bundle your code into an app that's supported all the way back to Python 2.6".

Paul
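For readers wondering what "by hand" would look like, here is a minimal sketch using only the zipfile module, including the include/exclude callback mentioned earlier in the thread. `build_pyz` and its `include` parameter are hypothetical names chosen for this example; they are not part of the API proposed in the PEP.

```python
import os
import stat
import zipfile


def build_pyz(source_dir, target, interpreter=None, include=lambda relpath: True):
    """Write the files under source_dir to target as a zip application.

    Hypothetical helper: include(relpath) returns True for files to keep.
    """
    with open(target, "wb") as out:
        if interpreter:
            # The shebang is plain bytes written ahead of the zip data.
            out.write(b"#!" + interpreter.encode("utf-8") + b"\n")
        with zipfile.ZipFile(out, "w") as zf:
            for root, _dirs, files in os.walk(source_dir):
                for name in sorted(files):
                    path = os.path.join(root, name)
                    relpath = os.path.relpath(path, source_dir).replace(os.sep, "/")
                    if include(relpath):
                        zf.write(path, relpath)
    if interpreter:
        # Mark the result executable so the shebang can take effect on POSIX.
        os.chmod(target, os.stat(target).st_mode | stat.S_IEXEC)
```

The sketch assumes the target lives outside source_dir; otherwise the half-written archive would be picked up by the directory walk.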

Paul Moore

12:20 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 24 February 2015 at 18:58, Guido van Rossum <guido@python.org> wrote:

...

Why no command-line equivalent for the other two methods? I propose the following interface: if there's only one positional argument, we're asking to print its shebang line; if there are two and the input position is an archive instead of a directory, we're copying. (In the future people will want an option to print more stuff, e.g. the main function or even a full listing.)

Thinking about this, there are 3 main uses:

1. Create an archive
2. Print the shebang
3. Change the shebang

Of these, (1) is the crucial one.

Basic usage should be

python -m zipapp mydir [-o anothername.pyz] [-p interpreter] [-m entry:point]

This zips up mydir to create an archive mydir.pyz. Options to change the target name, set a shebang line (side note: --python/-p or --interpreter/-i?) and set the entry point,

I see this as pretty non-negotiable, this is the key use case that needs to be as simple as possible.

To print the shebang, we could use

python -m zipapp myapp.pyz --show

This allows for future expansion by adding options, although most other things you might want to do (list the files, display __main__.py) can be done with a standard zip utility. I'm not keen on the option name --show, but I can't think of anything substantially better.

To modify an archive could be done using

python -m zipapp old.pyz new.pyz [-p interpreter]

Default is to strip the shebang (no -p option). There's no option to omit the target and do an inplace update because I feel the default action (strip the shebang from the existing file with no backup) is too dangerous.

To be explicit, "python -m zipapp app.pyz" will fail with a message "In-place editing of python zip applications is not supported".

That seems to work.

Open questions:

1. To create an archive, use -o target for an explicit target name, or just "target". The former is more conventional, the latter consistent with modification. Or we could make modification use a (mandatory) -o option.
2. -p/--python or -i/--interpreter for the shebang setting option
3. What to call the "show the shebang line" option

Paul

Barry Warsaw

12:32 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On Feb 24, 2015, at 08:20 PM, Paul Moore wrote:

...

(side note: --python/-p or --interpreter/-i?) and set the entry point,

Both virtualenv and (I think) pex use --python/-p so that seems to be the overwhelming trend <wink>.

...

To modify an archive could be done using

python -m zipapp old.pyz new.pyz [-p interpreter]

Default is to strip the shebang (no -p option). There's no option to omit the target and do an inplace update because I feel the default action (strip the shebang from the existing file with no backup) is too dangerous.

You have to be careful about the case where old.pyz == new.pyz (e.g. either handling this case safely or complaining loudly), but also you could handle it by using a .tmp file and renaming. E.g. old.pyz -> old.pyz.bak and old.pyz.tmp -> old.pyz.
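A rough sketch of the temp-file-and-rename approach, for illustration only. `replace_via_temp` and its `transform` argument are hypothetical names; nothing like this is proposed for zipapp itself in this thread.

```python
import os
import shutil
import tempfile


def replace_via_temp(path, transform, keep_backup=True):
    """Rewrite path as transform(old_bytes) via a temp file and a rename.

    Hypothetical helper: transform is any function from bytes to bytes.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as src:
        data = src.read()
    # mkstemp picks an unused name in the same directory as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(transform(data))
        shutil.copymode(path, tmp_path)  # keep the executable bit, if any
        if keep_backup:
            shutil.copy2(path, path + ".bak")
        # On POSIX the rename is atomic when both paths share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Also runs on Ctrl-C, so the temp file is not left behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Even this short version has to make decisions about backups, permissions and interrupts, which gives a feel for how much surface area in-place editing adds.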

...

3. What to call the "show the shebang line" option

I don't know how useful this is, given that (on *nix at least) you can effectively do the same with head(1).

Cheers,
-Barry

Paul Moore

12:51 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 24 February 2015 at 20:32, Barry Warsaw <barry@python.org> wrote:

...

...
To modify an archive could be done using

python -m zipapp old.pyz new.pyz [-p interpreter]

Default is to strip the shebang (no -p option). There's no option to omit the target and do an inplace update because I feel the default action (strip the shebang from the existing file with no backup) is too dangerous.

You have to be careful about the case where old.pyz == new.pyz (e.g. either handling this case safely or complaining loudly), but also you could handle it by using a .tmp file and renaming. E.g. old.pyz -> old.pyz.bak and old.pyz.tmp -> old.pyz.

There are a *lot* of obscure failure modes here. What if old and new are symlinks (or hard links) to the same file? What if a .tmp file already exists? What if the user hits Ctrl-C at a bad moment? On the principle of keeping it simple, I prefer just requiring a target, giving an error if the source name and target name are the same (which still leaves loopholes for the determined fool on case insensitive filesystems :-)) and just documenting that inplace modification isn't supported. The PEP clearly states that it's *minimal* tooling, after all...

...

...
3. What to call the "show the shebang line" option

I don't know how useful this is, given that (on *nix at least) you can effectively do the same with head(1).

I don't think it's that useful, TBH (although will head not print binary junk if there is no shebang line?) I quite like Brett's suggestion of --info, and maybe be a bit verbose:

$ python -m zipapp foo.pyz --info
Interpreter: /usr/bin/python
$ python -m zipapp bar.pyz --info
Interpreter: <none>

I can't see it being useful for scripting, and if it matters, there's always get_interpreter() then. It's mainly just as a diagnostic for people who are getting the wrong interpreter invoked.

Paul
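What an --info style diagnostic has to do is small. The sketch below assumes the shebang, if present, is the first line of the file and is UTF-8 encoded; `read_interpreter` and `format_info` are hypothetical names, not the proposed get_interpreter() itself.

```python
def read_interpreter(path):
    """Return the interpreter named in the shebang of path, or None."""
    with open(path, "rb") as f:
        # A zip archive without a shebang starts with zip data, not "#!".
        if f.read(2) != b"#!":
            return None
        # The rest of the first line is the interpreter command.
        return f.readline().strip().decode("utf-8")


def format_info(path):
    """Build the one-line report shown in the examples above."""
    return "Interpreter: {}".format(read_interpreter(path) or "<none>")
```

Unlike head(1), this never echoes binary data, because it only reads past the first two bytes when they are `#!`.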

Ethan Furman

1 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 02/24/2015 12:51 PM, Paul Moore wrote:

...

I don't think it's that useful, TBH (although will head not print binary junk if there is no shebang line?) I quite like Brett's suggestion of --info, and maybe be a bit verbose:

$ python -m zipapp foo.pyz --info
Interpreter: /usr/bin/python
$ python -m zipapp bar.pyz --info
Interpreter: <none>

I like that! +1

--
~Ethan

Ethan Furman

1:09 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 02/24/2015 01:00 PM, Ethan Furman wrote:

...

On 02/24/2015 12:51 PM, Paul Moore wrote:

...

...
$ python -m zipapp foo.pyz --info
Interpreter: /usr/bin/python
$ python -m zipapp bar.pyz --info
Interpreter: <none>

Another way to support this is with subcommands. Have the default [implicit] command be to create the zip app, and then add any subcommands we need:

python -m zipapp [create] foo #creates a foo.pyz from the foo directory

python -m zipapp info foo.pyz # prints out shebang for foo.pyz

python -m zipapp info --all foo.pyz # prints out shebang and directory structure and files and ....

This easily leaves the door open to add new commands in the future if any are still needed, and makes the current commands easy and simple to use.

--
~Ethan~

Paul Moore

1:43 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 24 February 2015 at 21:09, Ethan Furman <ethan@stoneleaf.us> wrote:

...

Another way to support this is with subcommands. Have the default [implicit] command be to create the zip app, and then add any subcommands we need:

python -m zipapp [create] foo #creates a foo.pyz from the foo directory

python -m zipapp info foo.pyz # prints out shebang for foo.pyz

python -m zipapp info --all foo.pyz # prints out shebang and directory structure and files and ....

This easily leaves the door open to add new commands in the future if any are still needed, and makes the current commands easy and simple to use.

I didn't know an implicit subcommand was allowed. That would work, although it does mean that you can't build a pyz from a directory "info" without explicitly using the create subcommand.

I think I'm going to go with "python -m zipapp foo.pyz --info" (And an -o option for the target, mandatory when the source is a pyz file, and --python for the interpreter). It's not set in stone yet, so if anyone objects, there's still time to change my mind.

Paul
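The option set described here maps onto argparse fairly directly. The following is a sketch of the rules as stated in this message, not the module's actual command-line code; `build_parser` and `parse_args` are hypothetical names, and the -m/--main option is carried over from the earlier "basic usage" proposal.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(prog="python -m zipapp")
    parser.add_argument("source", help="directory to pack, or an existing archive")
    parser.add_argument("-o", "--output", default=None,
                        help="target file name (required when source is an archive)")
    parser.add_argument("-p", "--python", default=None,
                        help="interpreter to write into the shebang line")
    parser.add_argument("-m", "--main", default=None,
                        help="entry point in pkg.module:callable form")
    parser.add_argument("--info", action="store_true",
                        help="show the interpreter recorded in an archive")
    return parser


def parse_args(argv):
    """Parse argv and enforce the source/target rules from the thread."""
    parser = build_parser()
    args = parser.parse_args(argv)
    source_is_archive = os.path.isfile(args.source)
    if args.info:
        if not source_is_archive:
            parser.error("--info requires an existing archive")
        if args.output or args.python or args.main:
            parser.error("--info cannot be combined with other options")
    elif source_is_archive and args.output is None:
        # No target means an in-place edit, which is deliberately refused.
        parser.error("In-place editing of python zip applications is not supported")
    return args
```

Keeping --info as a flag rather than a subcommand means a directory that happens to be called "info" can still be packed without any special syntax.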

Guido van Rossum

2:54 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

Maybe just fail if the target name already exists?

On Tue, Feb 24, 2015 at 12:51 PM, Paul Moore <p.f.moore@gmail.com> wrote:

...

On 24 February 2015 at 20:32, Barry Warsaw <barry@python.org> wrote:

...
...
To modify an archive could be done using

python -m zipapp old.pyz new.pyz [-p interpreter]

Default is to strip the shebang (no -p option). There's no option to omit the target and do an inplace update because I feel the default action (strip the shebang from the existing file with no backup) is too dangerous.

You have to be careful about the case where old.pyz == new.pyz (e.g. either handling this case safely or complaining loudly), but also you could handle it by using a .tmp file and renaming. E.g. old.pyz -> old.pyz.bak and old.pyz.tmp -> old.pyz.

There are a *lot* of obscure failure modes here. What if old and new are symlinks (or hard links) to the same file? What if a .tmp file already exists? What if the user hits Ctrl-C at a bad moment?

On the principle of keeping it simple, I prefer just requiring a target, giving an error if the source name and target name are the same (which still leaves loopholes for the determined fool on case insensitive filesystems :-)) and just documenting that inplace modification isn't supported. The PEP clearly states that it's *minimal* tooling, after all...

...
...
3. What to call the "show the shebang line" option

I don't know how useful this is, given that (on *nix at least) you can effectively do the same with head(1).

I don't think it's that useful, TBH (although will head not print binary junk if there is no shebang line?) I quite like Brett's suggestion of --info, and maybe be a bit verbose:

$ python -m zipapp foo.pyz --info
Interpreter: /usr/bin/python
$ python -m zipapp bar.pyz --info
Interpreter: <none>

I can't see it being useful for scripting, and if it matters, there's always get_interpreter() then. It's mainly just as a diagnostic for people who are getting the wrong interpreter invoked.

Paul

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

-- --Guido van Rossum (python.org/~guido)

Nick Coghlan

2:54 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 25 Feb 2015 06:52, "Paul Moore" <p.f.moore@gmail.com> wrote:

...

On 24 February 2015 at 20:32, Barry Warsaw <barry@python.org> wrote:

...
...
To modify an archive could be done using

python -m zipapp old.pyz new.pyz [-p interpreter]

Default is to strip the shebang (no -p option). There's no option to omit the target and do an inplace update because I feel the default action (strip the shebang from the existing file with no backup) is too dangerous.

You have to be careful about the case where old.pyz == new.pyz (e.g. either handling this case safely or complaining loudly), but also you could handle it by using a .tmp file and renaming. E.g. old.pyz -> old.pyz.bak and old.pyz.tmp -> old.pyz.

There are a *lot* of obscure failure modes here. What if old and new are symlinks (or hard links) to the same file? What if a .tmp file already exists? What if the user hits Ctrl-C at a bad moment?

On the principle of keeping it simple, I prefer just requiring a target, giving an error if the source name and target name are the same (which still leaves loopholes for the determined fool on case insensitive filesystems :-)) and just documenting that inplace modification isn't supported. The PEP clearly states that it's *minimal* tooling, after all...

https://docs.python.org/3/library/os.path.html#os.path.samefile covers this check in a robust, cross-platform way.

...

...
...
3. What to call the "show the shebang line" option

I don't know how useful this is, given that (on *nix at least) you can effectively do the same with head(1).

I don't think it's that useful, TBH (although will head not print binary junk if there is no shebang line?) I quite like Brett's suggestion of --info, and maybe be a bit verbose:

$ python -m zipapp foo.pyz --info
Interpreter: /usr/bin/python
$ python -m zipapp bar.pyz --info
Interpreter: <none>

I can't see it being useful for scripting, and if it matters, there's always get_interpreter() then. It's mainly just as a diagnostic for people who are getting the wrong interpreter invoked.

The corresponding CLI option for the inspect module is "--details": https://docs.python.org/3/library/inspect.html#command-line-interface

(By default "python -m inspect <modname>" prints the module source code)

Cheers,
Nick.

...

Paul

Paul Moore

25 Feb

1:55 a.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 24 February 2015 at 22:54, Nick Coghlan <ncoghlan@gmail.com> wrote:

...

...
On the principle of keeping it simple, I prefer just requiring a target, giving an error if the source name and target name are the same (which still leaves loopholes for the determined fool on case insensitive filesystems :-)) and just documenting that inplace modification isn't supported. The PEP clearly states that it's *minimal* tooling, after all...

https://docs.python.org/3/library/os.path.html#os.path.samefile covers this check in a robust, cross-platform way.

Wow, I hadn't realised that samefile had become reliable on Windows (ino/dev always used to be unreliable years ago). It's the little things like this that sneak into new releases without much fanfare that make so much *actual* difference. Thanks to whoever implemented this, and to all the people putting in the little changes that make new versions so much better :-)

Paul
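A minimal sketch of the check being discussed, assuming the source already exists on disk. `check_distinct` is a hypothetical name; only `os.path.exists` and `os.path.samefile` come from the standard library.

```python
import os


def check_distinct(source, target):
    """Raise ValueError if target refers to the same file as source."""
    # samefile raises if a path is missing, so test for existence first:
    # a target that does not exist yet cannot be the source.
    if os.path.exists(target) and os.path.samefile(source, target):
        raise ValueError("In-place editing of archives is not supported")
```

Because samefile compares the files themselves rather than their names, this also catches symlinks, hard links and different spellings of one path, which a plain string comparison would miss.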

Brett Cannon

24 Feb

12:41 p.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On Tue Feb 24 2015 at 3:21:30 PM Paul Moore <p.f.moore@gmail.com> wrote:

...

On 24 February 2015 at 18:58, Guido van Rossum <guido@python.org> wrote:

...
Why no command-line equivalent for the other two methods? I propose the following interface: if there's only one positional argument, we're asking to print its shebang line; if there are two and the input position is an archive instead of a directory, we're copying. (In the future people will want an option to print more stuff, e.g. the main function or even a full listing.)

Thinking about this, there are 3 main uses:

1. Create an archive
2. Print the shebang
3. Change the shebang

Of these, (1) is the crucial one.

Basic usage should be

python -m zipapp mydir [-o anothername.pyz] [-p interpreter] [-m entry:point]

This zips up mydir to create an archive mydir.pyz. Options to change the target name, set a shebang line (side note: --python/-p or --interpreter/-i?) and set the entry point,

I see this as pretty non-negotiable, this is the key use case that needs to be as simple as possible.

To print the shebang, we could use

python -m zipapp myapp.pyz --show

This allows for future expansion by adding options, although most other things you might want to do (list the files, display __main__.py) can be done with a standard zip utility. I'm not keen on the option name --show, but I can't think of anything substantially better.

To modify an archive could be done using

python -m zipapp old.pyz new.pyz [-p interpreter]

Default is to strip the shebang (no -p option). There's no option to omit the target and do an inplace update because I feel the default action (strip the shebang from the existing file with no backup) is too dangerous.

To be explicit, "python -m zipapp app.pyz" will fail with a message "In-place editing of python zip applications is not supported".

That seems to work.

Open questions:

1. To create an archive, use -o target for an explicit target name, or just "target". The former is more conventional, the latter consistent with modification. Or we could make modification use a (mandatory) -o option.

EIBTI suggests requiring the -o. Pragmatic suggests just [in] [out] and use context based on what kind of thing [in] points at as well as whether -p is specified and whether it has an argument, which is the most minimal UX you can have. Question is whether you can screw up by specifying the wrong thing somehow (you might have to require that [out] doesn't already exist to make it work).

...

2. -p/--python or -i/--interpreter for the shebang setting option

Since you are going to be using `python -m zipapp` then -i/--interpreter is less redundant-looking on the command-line.

...

3. What to call the "show the shebang line" option

As suggested above, -p w/o an argument could do it, otherwise --show or --info seems fine (I like --shebang, but that will probably be tough on non-English speakers).

Paul Moore

25 Feb

2 a.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 24 February 2015 at 18:58, Guido van Rossum <guido@python.org> wrote:

...

The naming of the functions feels inconsistent -- maybe pack(directory, target) -> create_archive(directory, archive), and set_interpreter() -> copy_archive(archive, new_archive)?

One possible source of confusion with copy_archive (and its command line equivalent "python -m zipapp old.pyz -o new.pyz") is that it isn't technically a copy, as it changes the shebang line (if you omit the interpreter argument it removes the existing shebang).

We could change it to copy by default, but (a) that's redundant as a file copy works better, and (b) we'd need to add a method of specifying "remove the shebang" to replace omitting the interpreter arg.

Is this a big enough issue to be worth changing the name of the function and the command line behaviour? I'm inclined to leave it, but mainly on the basis that I feel like I'm getting to the point of over-thinking things...

Paul
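The "copy that is not quite a copy" behaviour can be sketched as follows, assuming both arguments are file objects opened in bytes mode. `copy_with_interpreter` is a hypothetical name used for illustration; it is not presented as the module's own function.

```python
import shutil


def copy_with_interpreter(src, dst, interpreter=None):
    """Copy archive bytes from src to dst, replacing any shebang line.

    With interpreter=None the existing shebang is dropped and none is written.
    """
    first_two = src.read(2)
    if first_two == b"#!":
        src.readline()   # discard the rest of the old shebang line
        first_two = b""  # nothing from the old first line is kept
    if interpreter:
        dst.write(b"#!" + interpreter.encode("utf-8") + b"\n")
    # Without an old shebang, the two bytes already read are archive data.
    dst.write(first_two)
    shutil.copyfileobj(src, dst)
```

The source only needs read and readline, and the target only needs write, so pipes and other non-seekable streams are workable for this operation.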

Jim J. Jewett

8:02 a.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 24 February 2015 at 18:58, Guido van Rossum <guido at python.org> wrote:

...

The naming of the functions feels inconsistent -- maybe pack(directory, target) -> create_archive(directory, archive), and set_interpreter() -> copy_archive(archive, new_archive)?

Paul Moore wrote:

...

One possible source of confusion with copy_archive (and its command line equivalent "python -m zipapp old.pyz -o new.pyz") is that it isn't technically a copy, as it changes the shebang line (if you omit the interpreter argument it removes the existing shebang).

Is the difference between create and copy important? e.g., is there anything wrong with

create_archive(old_archive, output=new_archive) working as well as create_archive(directory, archive)?

-jJ

--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ

Paul Moore

9:06 a.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 25 February 2015 at 16:02, Jim J. Jewett <jimjjewett@gmail.com> wrote:

...

On 24 February 2015 at 18:58, Guido van Rossum <guido at python.org> wrote:

...
The naming of the functions feels inconsistent -- maybe pack(directory, target) -> create_archive(directory, archive), and set_interpreter() -> copy_archive(archive, new_archive)?

Paul Moore wrote:

...
One possible source of confusion with copy_archive (and its command line equivalent "python -m zipapp old.pyz -o new.pyz") is that it isn't technically a copy, as it changes the shebang line (if you omit the interpreter argument it removes the existing shebang).

Is the difference between create and copy important? e.g., is there anything wrong with

create_archive(old_archive, output=new_archive) working as well as create_archive(directory, archive)?

Probably not, now. The semantics have converged enough that this might be reasonable. It's how the command line interface works, after all. It would mean that the behaviour would be different depending on the value of the source argument (supplying the main argument and omitting the target are only valid for create), but again that's how the command line works.

I'll have a go at implementing this change this evening and see how it plays out.

Paul

Paul Moore

11:33 a.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On 25 February 2015 at 17:06, Paul Moore <p.f.moore@gmail.com> wrote:

...

...
Is the difference between create and copy important? e.g., is there anything wrong with

create_archive(old_archive, output=new_archive) working as well as create_archive(directory, archive)?

Probably not, now. The semantics have converged enough that this might be reasonable. It's how the command line interface works, after all. It would mean that the behaviour would be different depending on the value of the source argument (supplying the main argument and omitting the target are only valid for create), but again that's how the command line works.

I'll have a go at implementing this change this evening and see how it plays out.

That worked out pretty well, IMO. The resulting API is a lot cleaner (internally, there's not much change, I still have a copy_archive function but it's now private). I've included the resulting API documentation below. It looks pretty good to me.

Does anyone have any further suggestions or comments, or does this look ready to go back to Guido for a second review?

Paul

Python API
----------

The module defines two convenience functions:

.. function:: create_archive(source, target=None, interpreter=None, main=None)

   Create an application archive from *source*. The source can be any of the following:

   * The name of a directory, in which case a new application archive will be created from the content of that directory.
   * The name of an existing application archive file, in which case the file is copied to the target. The file name should include the ``.pyz`` extension, if required.
   * A file object open for reading in bytes mode. The content of the file should be an application archive, and the file object is assumed to be positioned at the start of the archive.

   The *target* argument determines where the resulting archive will be written:

   * If it is the name of a file, the archive will be written to that file.
   * If it is an open file object, the archive will be written to that file object, which must be open for writing in bytes mode.
   * If the target is omitted (or None), the source must be a directory and the target will be a file with the same name as the source, with a ``.pyz`` extension added.

   The *interpreter* argument specifies the name of the Python interpreter with which the archive will be executed. It is written as a "shebang" line at the start of the archive. On POSIX, this will be interpreted by the OS, and on Windows it will be handled by the Python launcher. Omitting the *interpreter* results in no shebang line being written. If an interpreter is specified, and the target is a filename, the executable bit of the target file will be set.

   The *main* argument specifies the name of a callable which will be used as the main program for the archive. It can only be specified if the source is a directory, and the source does not already contain a ``__main__.py`` file. The *main* argument should take the form "pkg.module:callable" and the archive will be run by importing "pkg.module" and executing the given callable with no arguments. It is an error to omit *main* if the source is a directory and does not contain a ``__main__.py`` file, as otherwise the resulting archive would not be executable.

   If a file object is specified for *source* or *target*, it is the caller's responsibility to close it after calling create_archive.

   When copying an existing archive, file objects supplied only need ``read`` and ``readline``, or ``write`` methods. When creating an archive from a directory, if the target is a file object it will be passed to the ``zipfile.ZipFile`` class, and must supply the methods needed by that class.

.. function:: get_interpreter(archive)

   Return the interpreter specified in the ``#!`` line at the start of the archive. If there is no ``#!`` line, return :const:`None`. The *archive* argument can be a filename or a file-like object open for reading in bytes mode. It is assumed to be at the start of the archive.
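For a concrete feel of the two functions, here is a short usage sketch written against the zipapp module in the standard library (Python 3.5 and later), which exposes create_archive and get_interpreter under these names. The draft above may differ from the shipped module in details; the `demo` function and the file names are made up for the example.

```python
import os
import tempfile
import zipapp


def demo():
    """Create an archive from a directory, then copy it without a shebang."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myapp")
        os.mkdir(src)
        with open(os.path.join(src, "__main__.py"), "w") as f:
            f.write("print('hello from myapp')\n")

        # Directory source: builds a new archive with a shebang line.
        target = os.path.join(tmp, "myapp.pyz")
        zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")
        original = zipapp.get_interpreter(target)

        # Archive source with no interpreter: the copy has no shebang line.
        stripped = os.path.join(tmp, "stripped.pyz")
        zipapp.create_archive(target, stripped)
        return original, zipapp.get_interpreter(stripped)


if __name__ == "__main__":
    print(demo())
```

The second call shows the point raised earlier in the thread: passing an archive as the source is a copy that rewrites the shebang, so omitting the interpreter removes it.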

Brett Cannon

11:40 a.m.

New subject: Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

On Wed, Feb 25, 2015 at 2:33 PM Paul Moore <p.f.moore@gmail.com> wrote:

...

On 25 February 2015 at 17:06, Paul Moore <p.f.moore@gmail.com> wrote:

...
...
Is the difference between create and copy important? e.g., is there anything wrong with

create_archive(old_archive, output=new_archive) working as well as create_archive(directory, archive)?

Probably not, now. The semantics have converged enough that this might be reasonable. It's how the command line interface works, after all. It would mean that the behaviour would be different depending on the value of the source argument (supplying the main argument and omitting the target are only valid for create), but again that's how the command line works.

I'll have a go at implementing this change this evening and see how it plays out.

That worked out pretty well, IMO. The resulting API is a lot cleaner (internally, there's not much change, I still have a copy_archive function but it's now private). I've included the resulting API documentation below. It looks pretty good to me.

Does anyone have any further suggestions or comments, or does this look ready to go back to Guido for a second review?

+1 from me. -Brett

...

Paul

Python API ----------

The module defines two convenience functions:

.. function:: create_archive(directory, target=None, interpreter=None, main=None)

Create an application archive from *source*. The source can be any of the following:

* The name of a directory, in which case a new application archive will be created from the content of that directory. * The name of an existing application archive file, in which case the file is copied to the target. The file name should include the ``.pyz`` extension, if required. * A file object open for reading in bytes mode. The content of the file should be an application archive, and the file object is assumed to be positioned at the start of the archive.

The *target* argument determines where the resulting archive will be written:

* If it is the name of a file, the archive will be written to that file.
* If it is an open file object, the archive will be written to that file object, which must be open for writing in bytes mode.
* If the target is omitted (or None), the source must be a directory and the target will be a file with the same name as the source, with a ``.pyz`` extension added.

The *interpreter* argument specifies the name of the Python interpreter with which the archive will be executed. It is written as a "shebang" line at the start of the archive. On POSIX, this will be interpreted by the OS, and on Windows it will be handled by the Python launcher. Omitting the *interpreter* results in no shebang line being written. If an interpreter is specified, and the target is a filename, the executable bit of the target file will be set.

The *main* argument specifies the name of a callable which will be used as the main program for the archive. It can only be specified if the source is a directory, and the source does not already contain a ``__main__.py`` file. The *main* argument should take the form "pkg.module:callable" and the archive will be run by importing "pkg.module" and executing the given callable with no arguments. It is an error to omit *main* if the source is a directory and does not contain a ``__main__.py`` file, as otherwise the resulting archive would not be executable.

If a file object is specified for *source* or *target*, it is the caller's responsibility to close it after calling create_archive.

When copying an existing archive, file objects supplied only need ``read`` and ``readline``, or ``write`` methods. When creating an archive from a directory, if the target is a file object it will be passed to the ``zipfile.ZipFile`` class, and must supply the methods needed by that class.

.. function:: get_interpreter(archive)

Return the interpreter specified in the ``#!`` line at the start of the archive. If there is no ``#!`` line, return :const:`None`. The *archive* argument can be a filename or a file-like object open for reading in bytes mode. It is assumed to be at the start of the archive.
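For readers who want to try the API described above, here is a minimal sketch. It assumes the ``zipapp`` module as it later shipped in the Python 3.5 standard library, where the first parameter is named ``source``. The directory name ``myapp``, the module ``hello`` and the helper ``build_demo_archive`` are made up for illustration.

```python
import tempfile
import zipapp
from pathlib import Path


def build_demo_archive(workdir):
    """Create a tiny application directory and pack it into a .pyz file."""
    workdir = Path(workdir)
    source = workdir / "myapp"  # hypothetical application directory
    source.mkdir()
    # The directory has no __main__.py, so the main argument is required.
    (source / "hello.py").write_text(
        "def main():\n    print('hello from the archive')\n"
    )

    target = workdir / "myapp.pyz"
    zipapp.create_archive(
        str(source),
        target=str(target),
        interpreter="/usr/bin/env python3",  # written as the shebang line
        main="hello:main",                   # "pkg.module:callable" form
    )
    return str(target)


with tempfile.TemporaryDirectory() as tmp:
    archive = build_demo_archive(tmp)
    # Read back the interpreter recorded in the archive's #! line.
    print(zipapp.get_interpreter(archive))
```

Because an interpreter is given and the target is a filename, the executable bit of ``myapp.pyz`` is also set on platforms where that is meaningful.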

Barry Warsaw

12:11 p.m.

On Feb 25, 2015, at 07:33 PM, Paul Moore wrote:

...

The module defines two convenience functions:

.. function:: create_archive(directory, target=None, interpreter=None, main=None)

Create an application archive from *source*. The source can be any of the following:

I think you meant s/directory/source/ in the signature. Other than that, +1. Cheers, -Barry

Jim J. Jewett

12:12 p.m.

On Wed, Feb 25, 2015 at 2:33 PM, Paul Moore <p.f.moore@gmail.com> wrote:

...

On 25 February 2015 at 17:06, Paul Moore <p.f.moore@gmail.com> wrote:

...

I've included the resulting API documentation below. It looks pretty good to me.

Me too. I have a few nits anyhow.

...

.. function:: create_archive(directory, target=None, interpreter=None, main=None)

...

Create an application archive from *source*. The source can be any of the following:

(1) *source* makes me think of source code, as opposed to binary. This is only a small objection, in part because I can't think of anything better.

(2) If you do keep *source*, I think that the "directory" parameter should be renamed to "source".

(3)

...

* The name of an existing application archive file, in which case the file is copied to the target.

==>

...

* The name of an existing application archive file, in which case the file is copied (possibly with changes) to the target.

My concern is that someone who does want just another copy will use this, see "copied", not read the other options, and be surprised when the shebang is dropped.

...

* A file object open for reading in bytes mode. The content of the file should be an application archive, and the file object is assumed to be positioned at the start of the archive.

I like this way of ducking the "does it need to be seekable" question.

...

The *target* argument determines where the resulting archive will be written:

* If it is the name of a file, the archive will be written to that file.

(4) Note that the filename is not required to end with pyz, although that is good practice. Or maybe just be explicit that the function itself does not add a .pyz, and assumes that the caller will do so when appropriate.

...

The *interpreter* argument specifies the name of the Python interpreter with which the archive will be executed. ... ... Omitting the *interpreter* results in no shebang line being written.

(5) even if there was an explicit shebang line in the source archive.

...

If an interpreter is specified, and the target is a filename, the executable bit of the target file will be set.

(6) (target is a filename, or None) Or does that clarification just confuse the issue, and only benefit people so careful they'll verify it themselves anyway?

(7) That is a good idea, but not quite as clear cut as it sounds. On unix, there are generally 3 different executable bits specifying *who* can run it. Setting the executable bit only for the owner is probably a conservative but sensible default.

-jJ

Paul Moore

1:45 p.m.

On 25 February 2015 at 20:12, Jim J. Jewett <jimjjewett@gmail.com> wrote:

...

On Wed, Feb 25, 2015 at 2:33 PM, Paul Moore <p.f.moore@gmail.com> wrote:

...
On 25 February 2015 at 17:06, Paul Moore <p.f.moore@gmail.com> wrote:

...
I've included the resulting API documentation below. It looks pretty good to me.

Me too. I have a few nits anyhow.

...
.. function:: create_archive(directory, target=None, interpreter=None, main=None)

...
Create an application archive from *source*. The source can be any of the following:

(1) *source* makes me think of source code, as opposed to binary. This is only a small objection, in part because I can't think of anything better.

(2) If you do keep *source*, I think that the "directory" parameter should be renamed to "source".

Yep, that's a typo. Think of it as source -> target as opposed to source code and it's fine :-)

...

(3)

...
* The name of an existing application archive file, in which case the file is copied to the target.

==>

...
* The name of an existing application archive file, in which case the file is copied (possibly with changes) to the target.

My concern is that someone who does want just another copy will use this, see "copied", not read the other options, and be surprised when the shebang is dropped.

Hmm, how about "... the content of the archive is copied to the target with a replacement shebang line"?
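The behaviour under discussion can be checked with a short sketch. It assumes the ``zipapp`` module as it later shipped in Python 3.5; the file names and the helper ``copy_and_report`` are hypothetical. Copying an archive without passing *interpreter* drops the existing shebang, while passing one replaces it.

```python
import tempfile
import zipapp
from pathlib import Path


def copy_and_report(workdir):
    """Return the interpreters of an archive and of two copies made from it."""
    workdir = Path(workdir)
    source = workdir / "app"  # hypothetical application directory
    source.mkdir()
    (source / "__main__.py").write_text("print('hi')\n")

    original = str(workdir / "original.pyz")
    zipapp.create_archive(
        str(source), target=original, interpreter="/usr/bin/env python3"
    )

    # Copy with no interpreter argument: the shebang line is not carried over.
    plain_copy = str(workdir / "copy.pyz")
    zipapp.create_archive(original, target=plain_copy)

    # Copy with an interpreter argument: the shebang line is replaced.
    retargeted = str(workdir / "retargeted.pyz")
    zipapp.create_archive(
        original, target=retargeted, interpreter="/usr/bin/env python3.5"
    )

    return (
        zipapp.get_interpreter(original),
        zipapp.get_interpreter(plain_copy),
        zipapp.get_interpreter(retargeted),
    )


with tempfile.TemporaryDirectory() as tmp:
    print(copy_and_report(tmp))
```

The middle value is ``None``, which is the surprise Jim is worried about for someone who only wanted a plain copy.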

...

...
* A file object open for reading in bytes mode. The content of the file should be an application archive, and the file object is assumed to be positioned at the start of the archive.

I like this way of ducking the "does it need to be seekable" question.

:-)
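To make the seekability point concrete, the following sketch copies an archive through a source object that offers only ``read`` and ``readline``, roughly what a pipe would provide. It assumes the ``zipapp`` module as shipped in Python 3.5; ``ForwardOnlyReader``, ``make_archive_bytes`` and ``copy_through_stream`` are hypothetical helpers written for this example.

```python
import io
import tempfile
import zipapp
from pathlib import Path


class ForwardOnlyReader:
    """Hypothetical stand-in for a pipe: only read() and readline(), no seek()."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)

    def readline(self):
        return self._buffer.readline()


def make_archive_bytes():
    """Build a small archive in memory from a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "app"
        source.mkdir()
        (source / "__main__.py").write_text("print('hi')\n")
        buffer = io.BytesIO()
        zipapp.create_archive(str(source), target=buffer)
        return buffer.getvalue()


def copy_through_stream(archive_bytes, interpreter):
    """Copy an archive from a forward-only source to an in-memory target."""
    source = ForwardOnlyReader(archive_bytes)
    target = io.BytesIO()  # the caller owns both objects and closes them
    zipapp.create_archive(source, target=target, interpreter=interpreter)
    return target.getvalue()


copied = copy_through_stream(make_archive_bytes(), "/usr/bin/env python3")
print(copied[:23])
```

The source object is read once from its current position, so the only requirement is that it is positioned at the start of the archive when it is passed in.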

...

...
The *target* argument determines where the resulting archive will be written:

* If it is the name of a file, the archive will be written to that file.

(4) Note that the filename is not required to end with pyz, although that is good practice. Or maybe just be explicit that the function itself does not add a .pyz, and assumes that the caller will do so when appropriate.

Hmm, I thought I'd added an explanation. Maybe I did that somewhere else and missed it here. I'll clarify.

...

...
The *interpreter* argument specifies the name of the Python interpreter with which the archive will be executed. ... ... Omitting the *interpreter* results in no shebang line being written.

(5) even if there was an explicit shebang line in the source archive.

I'll clarify the wording.

...

...
If an interpreter is specified, and the target is a filename, the executable bit of the target file will be set.

(6) (target is a filename, or None) Or does that clarification just confuse the issue, and only benefit people so careful they'll verify it themselves anyway?

Probably :-) How about "if the target is a real file" or "unless the target is a file-like object"? But in all honesty I think it's fine as is.

...

(7) That is a good idea, but not quite as clear cut as it sounds. On unix, there are generally 3 different executable bits specifying *who* can run it. Setting the executable bit only for the owner is probably a conservative but sensible default.

I know, but excuse the naivete of a Windows user. I'm inclined to leave it as it is and direct people to read the source if they care that much (I actually used I_EXEC, which is what I've seen other code use). The alternative is to not set the executable bit at all and make the user do it as a separate step. My instinct is that doing that would be less user friendly, but my instincts on what's good Unix behaviour aren't strong...

Thanks for the comments.

Paul
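As a small illustration of the two policies being weighed, the sketch below works on permission-mode integers only, so it does not touch the filesystem. It assumes that the constant meant by "I_EXEC" is ``stat.S_IEXEC`` from the standard library, which is a synonym for the owner-execute bit ``stat.S_IXUSR``. The function names are hypothetical.

```python
import stat


def add_owner_exec(mode):
    """Conservative policy: set the execute bit for the file's owner only."""
    return mode | stat.S_IEXEC  # S_IEXEC is the same bit as S_IXUSR


def add_exec_for_all(mode):
    """Broader policy: set execute for owner, group and others."""
    return mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


# A typical rw-r--r-- file under each policy, shown in octal.
print(oct(add_owner_exec(0o644)))
print(oct(add_exec_for_all(0o644)))
```

On a real file the same expression would be applied to ``os.stat(path).st_mode`` and written back with ``os.chmod``.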
