Documentation for ability to execute zipfiles & directories
A few months ago, 2.6 & 3.0 gained the ability to execute zipfiles and directories containing a __main__.py file (see [1] for details). The idea is that a whole application can be bundled into a zipfile containing a __main__.py module in its root directory, and then passed directly to the interpreter for execution, with the zipfile being inserted as the first entry on sys.path to allow easy access to the rest of the application code.

It is inspired by Java's JAR option, but not needing an explicit interpreter option makes it more shebang friendly on *nix systems (it can also be mapped more easily to the existing Python file type handling on Windows).

The ability to also execute directories containing a __main__.py was something of a side effect of the implementation technique, but was also considered valuable as it makes it much easier to develop such bundled applications (using a directory most of the time, and then bundling into a single zipfile prior to release).

The part I'm struggling with now is where to document the way this feature works. Currently, the only real documentation we have of the command line invocation is in section 2.1 of the tutorial, and the idea of packaging whole applications as zipfiles seems far too esoteric to be covering it there. It doesn't really seem to fit in section 6 (covering modules and packages) either.

Do we need a new appendix to the tutorial which goes into detail about the CPython interpreter's command line options, environment variables and details on what can be executed?

Cheers, Nick.

[1] http://bugs.python.org/issue1739468

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org
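The workflow described above (develop as a directory, bundle into a zipfile, hand either one to the interpreter) can be sketched roughly as follows. This is only an illustration under stated assumptions: the file names `app`, `app.zip` and `helper.py` and their contents are hypothetical, and the sketch assumes a Python 3.7+ interpreter reachable through `sys.executable`.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

# Hypothetical two-file application: an entry point plus one helper module.
MAIN_SOURCE = "import helper\nprint(helper.greet())\n"
HELPER_SOURCE = "def greet():\n    return 'hello from the bundle'\n"
FILES = (("__main__.py", MAIN_SOURCE), ("helper.py", HELPER_SOURCE))


def build_app(workdir):
    """Create the app as a directory, then bundle the same files into a zipfile."""
    app_dir = os.path.join(workdir, "app")
    os.mkdir(app_dir)
    for name, source in FILES:
        with open(os.path.join(app_dir, name), "w") as f:
            f.write(source)
    app_zip = os.path.join(workdir, "app.zip")
    with zipfile.ZipFile(app_zip, "w") as zf:
        for name, _ in FILES:
            # arcname keeps __main__.py in the root of the archive
            zf.write(os.path.join(app_dir, name), arcname=name)
    return app_dir, app_zip


def run(target):
    """Pass a directory or zipfile directly to the interpreter and capture stdout."""
    result = subprocess.run(
        [sys.executable, target], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as workdir:
    app_dir, app_zip = build_app(workdir)
    dir_output = run(app_dir)  # directory containing __main__.py
    zip_output = run(app_zip)  # zipfile containing __main__.py
    print(dir_output)
    print(zip_output)
```

In both cases `import helper` succeeds because the directory or zipfile itself is placed first on sys.path, which is the behaviour the message describes.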
On 04/03/2008, Nick Coghlan <ncoghlan@gmail.com> wrote:
Do we need a new appendix to the tutorial which goes into detail about the CPython interpreter's command line options, environment variables and details on what can be executed?
There is a Python man page, which covers the command line usage. However, it's separate from the documentation, and it isn't bundled with the Windows installers - both of which are a real pain (for me, at least).

I'd suggest taking the man page, adding the information about executing zip files and directories, and putting the whole lot into the formal documentation.

The big problem is that there isn't really anywhere in the docs which is formally CPython-specific. My preference would be to put it in the language reference, as a new chapter (between the current chapters 1 and 2) called "Invoking the Python Interpreter".

You could also make the manpage a new document, called "Invoking Python", but it's a bit small to warrant a full document.

An appendix to the Tutorial is OK, I guess, but personally I never think of looking at the tutorial (I've been using Python too long to feel that I need a tutorial any more, although the quality of my code probably says otherwise :-))

Paul.
Paul Moore wrote:
On 04/03/2008, Nick Coghlan <ncoghlan@gmail.com> wrote:
Do we need a new appendix to the tutorial which goes into detail about the CPython interpreter's command line options, environment variables and details on what can be executed?
There is a Python man page, which covers the command line usage. However, it's separate from the documentation, and it isn't bundled with the Windows installers - both of which are a real pain (for me, at least).
I'd suggest taking the man page, adding the information about executing zip files and directories, and putting the whole lot into the formal documentation.
The big problem is that there isn't really anywhere in the docs which is formally CPython-specific. My preference would be to put it in the language reference, as a new chapter (between the current chapters 1 and 2) called "Invoking the Python Interpreter".
You could also make the manpage a new document, called "Invoking Python", but it's a bit small to warrant a full document.
An appendix to the Tutorial is OK, I guess, but personally I never think of looking at the tutorial (I've been using Python too long to feel that I need a tutorial any more, although the quality of my code probably says otherwise :-))
While I hesitate to suggest a change of such magnitude, there's something to recommend the old IBM mainframe approach of separating out "Principles of Operation" (which would be the reference manuals, in Python's case the Language and Library refs) from "Users' Guide" which contains the practical stuff you need to actually make use of a product.

I've always found it rather counter-intuitive that you have to go to the Library Reference manual to find information about Python's built-in types, for example. I thought the whole point of libraries was that they *aren't* built in, and represent baggage that should only be carried on necessary trips.

And let's not get started on the import statement. I have just spent some time trying to work out how we might get rid of the embarrassing "XXX can't be bothered to spell this out right now ..." mess, and have come to the conclusion that a thorough review and a complete rewrite is the only thing that will do the topic justice. I can only hope that Brett plans to make a start on this as a part of his rework of the import code (and if you're reading, Brett, I'd like to help). It doesn't help my motivation that the import mechanism is about to change yet again, though I am happy that it will be more regular and easier to understand after the next change.

I believe with 3.0 the biggest improvement we could make to the language for newcomers would be to reorganize our documentation so that things live in the places they belong rather than the place they landed and got stuck over time.

Please note this isn't a rant, in the sense that I believe there are perfectly comprehensible reasons for how the docs got to be the shape they are. But I fear that we are possibly blinded to their inappropriate nature by our closeness to and familiarity with them, and I think a major effort to revise their structure (and to a lesser degree their content) could pay itself back many times over in increased user friendliness.

Georg's recent revision of the technology puts us in a better position to prepare for this, but it would still be a major project.

regards Steve

--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
On Tue, Mar 04, 2008 at 08:58:57AM -0500, Steve Holden wrote:
While I hesitate to suggest a change of such magnitude, there's something to recommend the old IBM mainframe approach of separating out "Principles of Operation" (which would be the reference manuals, in Python's case the Language and Library refs) from "Users' Guide" which contains the practical stuff you need to actually make use of a product.
Good suggestion. Using the debugger and profiler could also be covered in the User's Guide. Would splitting up the docs make them more useful for IronPython/Jython? For example, Jython could eventually take the 2.6 language docs as-is, but modify the library reference to remove unsupported modules and add Jython-specific ones. --amk
On Tue, Mar 4, 2008 at 1:36 PM, A.M. Kuchling <amk@amk.ca> wrote:
On Tue, Mar 04, 2008 at 08:58:57AM -0500, Steve Holden wrote:
While I hesitate to suggest a change of such magnitude, there's something to recommend the old IBM mainframe approach of separating out "Principles of Operation" (which would be the reference manuals, in Python's case the Language and Library refs) from "Users' Guide" which contains the practical stuff you need to actually make use of a product.
Good suggestion. Using the debugger and profiler could also be covered in the User's Guide.
Would splitting up the docs make them more useful for IronPython/Jython? For example, Jython could eventually take the 2.6 language docs as-is, but modify the library reference to remove unsupported modules and add Jython-specific ones.
Speaking for Jython, this would be extremely helpful for us. Once we get caught up, better docs will become one of our most important priorities I think. -Frank
Steve Holden schrieb:
Paul Moore wrote:
On 04/03/2008, Nick Coghlan <ncoghlan@gmail.com> wrote:
Do we need a new appendix to the tutorial which goes into detail about the CPython interpreter's command line options, environment variables and details on what can be executed?
There is a Python man page, which covers the command line usage. However, it's separate from the documentation, and it isn't bundled with the Windows installers - both of which are a real pain (for me, at least).
I'd suggest taking the man page, adding the information about executing zip files and directories, and putting the whole lot into the formal documentation.
Look no further: http://docs.python.org/dev/using/cmdline.html There's even more platform-specific stuff at http://docs.python.org/dev/using/.
The big problem is that there isn't really anywhere in the docs which is formally CPython-specific. My preference would be to put it in the language reference, as a new chapter (between the current chapters 1 and 2) called "Invoking the Python Interpreter".
The "Using Python" documentation section could very well be marked as CPython-specific.
You could also make the manpage a new document, called "Invoking Python", but it's a bit small to warrant a full document.
An appendix to the Tutorial is OK, I guess, but personally I never think of looking at the tutorial (I've been using Python too long to feel that I need a tutorial any more, although the quality of my code probably says otherwise :-))
While I hesitate to suggest a change of such magnitude, there's something to recommend the old IBM mainframe approach of separating out "Principles of Operation" (which would be the reference manuals, in Python's case the Language and Library refs) from "Users' Guide" which contains the practical stuff you need to actually make use of a product.
I've always found it rather counter-intuitive that you have to go to the Library Reference manual to find information about Python's built-in types, for example. I thought the whole point of libraries was that they *aren't* built in, and represent baggage that should only be carried on necessary trips.
You speak my mind. For ages I've wanted to put the builtins together with the language reference into a new document called "Python Core Language". I've just never had the time to draft a serious proposal.
I believe with 3.0 the biggest improvement we could make to the language for newcomers would be to reorganize our documentation so that things live in the places they belong rather than the place they landed and got stuck over time.
I fully agree. Georg
Georg Brandl writes:
You speak my mind. For ages I've wanted to put the builtins together with the language reference into a new document called "Python Core Language". I've just never had the time to draft a serious proposal.
I think that combination is reasonable, but I would like to see the clear division between the language (ie, the syntax) and the built-in functionality maintained. I'm not sure I like the proposed title for that reason.
On Tue, Mar 4, 2008 at 3:13 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Georg Brandl writes:
You speak my mind. For ages I've wanted to put the builtins together with the language reference into a new document called "Python Core Language". I've just never had the time to draft a serious proposal.
I think that combination is reasonable, but I would like to see the clear division between the language (ie, the syntax) and the built-in functionality maintained. I'm not sure I like the proposed title for that reason.
Such a division would make it unnecessarily hard to find documentation on True, False, None, etc. They've become keywords for pragmatic purposes (to prevent accidental modification), not because we think they ideally should be syntax instead of builtins. -- Adam Olsen, aka Rhamphoryncus
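The point about True, False and None being keywords in order to prevent accidental modification can be checked with a small sketch. It assumes a Python 3 interpreter, where all three names are keywords; the helper name `assignment_is_rejected` is made up for this illustration.

```python
import keyword

NAMES = ("True", "False", "None")

# keyword.iskeyword reports whether a name is reserved by the grammar.
is_keyword = {name: keyword.iskeyword(name) for name in NAMES}


def assignment_is_rejected(name):
    """Return True if compiling 'name = 0' fails with a SyntaxError."""
    try:
        compile(f"{name} = 0", "<example>", "exec")
    except SyntaxError:
        return True
    return False


rejected = {name: assignment_is_rejected(name) for name in NAMES}
print(is_keyword)
print(rejected)
```

Because the assignment is rejected at compile time, the protection applies before any code runs, which is the pragmatic benefit mentioned above.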
Adam Olsen wrote:
Such a division would make it unnecessarily hard to find documentation on True, False, None, etc. They've become keywords for pragmatic purposes (to prevent accidental modification), not because we think they ideally should be syntax instead of builtins.
Maybe the solution is to rename the Library Reference to the Class and Module Reference or something like that.

--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg.ewing@canterbury.ac.nz
Carpe post meridiem! (I'm not a morning person.)
Greg Ewing wrote:
Adam Olsen wrote:
Such a division would make it unnecessarily hard to find documentation on True, False, None, etc. They've become keywords for pragmatic purposes (to prevent accidental modification), not because we think they ideally should be syntax instead of builtins.
Maybe the solution is to rename the Library Reference to the Class and Module Reference or something like that.
Although DRY is fine as a programming principle, it fails for pedagogic purposes. We should therefore be prepared to repeat the same material in different contexts (hopefully by including some common documentation source rather than laborious and error-prone copy-and-paste). Document things where people expect to find them. (Now *there's* a usability study screaming to be done ... and SoC is coming up). regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/
On Tue, Mar 4, 2008 at 5:04 PM, Steve Holden <steve@holdenweb.com> wrote:
Greg Ewing wrote:
Adam Olsen wrote:
Such a division would make it unnecessarily hard to find documentation on True, False, None, etc. They've become keywords for pragmatic purposes (to prevent accidental modification), not because we think they ideally should be syntax instead of builtins.
Maybe the solution is to rename the Library Reference to the Class and Module Reference or something like that.
Although DRY is fine as a programming principle, it fails for pedagogic purposes. We should therefore be prepared to repeat the same material in different contexts (hopefully by including some common documentation source rather than laborious and error-prone copy-and-paste).
Document things where people expect to find them. (Now *there's* a usability study screaming to be done ... and SoC is coming up).
Python's usage of import makes it clear when something is imported from a library, as opposed to being an integral part of the language. To me, this is an obvious basis for deciding whether to look in the language reference or in the stdlib reference.

That said, it would be useful to also have a link for major builtin types in the stdlib section, if only for people who learned to look for them there.

-- Adam Olsen, aka Rhamphoryncus
Adam Olsen writes:
On Tue, Mar 4, 2008 at 3:13 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
I would like to see the clear division between the language (ie, the syntax) and the built-in functionality maintained. I'm not sure I like the proposed title for that reason.
Such a division would make it unnecessarily hard to find documentation on True, False, None, etc. They've become keywords for pragmatic purposes (to prevent accidental modification), not because we think they ideally should be syntax instead of builtins.
This is Python; of course practicality beats purity. I have no problem with putting some keywords in the "built-in functionality" section, or even (boggle) duplicating them across the two sections.

I too was put off by the separation of syntax from built-in functionality when I first started using the documentation, but later I came to appreciate it. I'm a relatively casual user of Python, and having a spare "syntax" section has made it much easier to learn new syntax such as comprehensions and generators. I suspect it will make it a lot easier to learn the differences between Python 2 and Python 3, too. I do not want to lose that.

I don't pretend to be speaking for anyone else, but I'd be surprised if I were unique.<wink>
On Tue, Mar 4, 2008 at 8:03 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Adam Olsen writes:
On Tue, Mar 4, 2008 at 3:13 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
I would like to see the clear division between the language (ie, the syntax) and the built-in functionality maintained. I'm not sure I like the proposed title for that reason.
Such a division would make it unnecessarily hard to find documentation on True, False, None, etc. They've become keywords for pragmatic purposes (to prevent accidental modification), not because we think they ideally should be syntax instead of builtins.
This is Python; of course practicality beats purity. I have no problem with putting some keywords in the "built-in functionality" section, or even (boggle) duplicating them across the two sections.
-1 on duplicating anything. Provide links to a single location instead. Otherwise you end up with two explanations of the same thing, with different wording and subtle differences or missing details.
I too was put off by the separation of syntax from built-in functionality when I first started using the documentation, but later I came to appreciate it. I'm a relatively casual user of Python, and having a spare "syntax" section has made it much easier to learn new syntax such as comprehensions and generators. I suspect it will make it a lot easier to learn the differences between Python 2 and Python 3, too. I do not want to lose that.
I learned them through third-party docs. Even now I have a hard time finding list comprehension/generator expression in the docs. Apparently they're in the Expression section, under "Displays for lists, sets and dictionaries", and neither that nor anything with list comprehension or generator expression is in the index. The term "Displays" is pretty obscure as well, not something I've seen used besides on this list or right there in the documentation.
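For readers unsure what the "Displays" section covers, here is a brief sketch of the syntax forms under discussion. The variable names are illustrative only, and the set and dict forms assume Python 2.7 or 3.x.

```python
# List comprehension: builds the whole list immediately.
squares_list = [n * n for n in range(5)]

# Generator expression: same clause syntax in parentheses, evaluated lazily.
squares_gen = (n * n for n in range(5))

# Set and dictionary comprehensions use braces.
squares_set = {n * n for n in range(-2, 3)}
squares_dict = {n: n * n for n in range(3)}
```

The generator expression produces its values only as they are consumed, for example by `sum(squares_gen)`, and can be iterated just once.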
I don't pretend to be speaking for anyone else, but I'd be surprised if I were unique.<wink>
Your experiences *shouldn't* be unique, but I'm afraid they might be. Another example is the use of BNF, which, although dominant in its field, presents a steep learning curve for most programmers.

I'm afraid this has turned into a rant, but it should be seen as the experiences of someone who relies on the documentation a great deal. Is there a better way to channel feedback on the documentation?

-- Adam Olsen, aka Rhamphoryncus
Adam Olsen schrieb:
I don't pretend to be speaking for anyone else, but I'd be surprised if I were unique.<wink>
Your experiences *shouldn't* be unique, but I'm afraid they might be. Another example is the use of BNF, which, although dominant in its field, presents a steep learning curve for most programmers.
We could of course accompany each BNF-described item with an example.
I'm afraid this has turned into a rant, but it should be seen as the experiences of someone who relies on the documentation a great deal. Is there a better way to channel feedback on the documentation?
There's of course the doc-SIG, but it's just another mailing list. In any case, the best way to channel feedback is to provide a patch <wink> Georg
Georg Brandl wrote:
Adam Olsen schrieb:
Another example is the use of BNF, which, although dominant in its field, presents a steep learning curve for most programmers.
We could of course accompany each BNF-described item with an example.
An alternative to BNF would be syntax diagrams. They're just as formal, and can be a lot easier to read. -- Greg
Georg Brandl wrote:
Steve Holden schrieb:
Paul Moore wrote:
On 04/03/2008, Nick Coghlan <ncoghlan@gmail.com> wrote:
Do we need a new appendix to the tutorial which goes into detail about the CPython interpreter's command line options, environment variables and details on what can be executed?
There is a Python man page, which covers the command line usage. However, it's separate from the documentation, and it isn't bundled with the Windows installers - both of which are a real pain (for me, at least).
I'd suggest taking the man page, adding the information about executing zip files and directories, and putting the whole lot into the formal documentation.
Look no further: http://docs.python.org/dev/using/cmdline.html
Thanks Georg, that looks like exactly the right place - I'll try to get that updated before the next alpha.
I've always found it rather counter-intuitive that you have to go to the Library Reference manual to find information about Python's built-in types, for example. I thought the whole point of libraries was that they *aren't* built in, and represent baggage that should only be carried on necessary trips.
You speak my mind. For ages I've wanted to put the builtins together with the language reference into a new document called "Python Core Language". I've just never had the time to draft a serious proposal.
I borrowed the keys to Guido's time machine: http://svn.python.org/view/sandbox/trunk/userref/ It hasn't been updated for a lot of the 2.6/3.0 features as yet, but it may be a decent basis for what you're considering here. (and all credit to this thread for motivating me to actually get those files cleaned up and into the sandbox - I've been thinking about doing it for ages, but never got around to it). (For MS Office users, you may need to get OpenOffice.org or similar in order to read the Open Document Format files) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
Nick Coghlan schrieb:
Georg Brandl wrote:
Steve Holden schrieb:
Paul Moore wrote:
On 04/03/2008, Nick Coghlan <ncoghlan@gmail.com> wrote:
Do we need a new appendix to the tutorial which goes into detail about the CPython interpreter's command line options, environment variables and details on what can be executed?
There is a Python man page, which covers the command line usage. However, it's separate from the documentation, and it isn't bundled with the Windows installers - both of which are a real pain (for me, at least).
I'd suggest taking the man page, adding the information about executing zip files and directories, and putting the whole lot into the formal documentation.
Look no further: http://docs.python.org/dev/using/cmdline.html
Thanks Georg, that looks like exactly the right place - I'll try to get that updated before the next alpha.
Great! Feel free to add anything you think would make it a more complete document.
I've always found it rather counter-intuitive that you have to go to the Library Reference manual to find information about Python's built-in types, for example. I thought the whole point of libraries was that they *aren't* built in, and represent baggage that should only be carried on necessary trips.
You speak my mind. For ages I've wanted to put the builtins together with the language reference into a new document called "Python Core Language". I've just never had the time to draft a serious proposal.
I borrowed the keys to Guido's time machine:
http://svn.python.org/view/sandbox/trunk/userref/
It hasn't been updated for a lot of the 2.6/3.0 features as yet, but it may be a decent basis for what you're considering here.
(and all credit to this thread for motivating me to actually get those files cleaned up and into the sandbox - I've been thinking about doing it for ages, but never got around to it).
Thanks, I'll certainly look at them! Georg
On Tue, Mar 04, 2008 at 10:35:42PM +1000, Nick Coghlan wrote:
not needing an explicit interpreter option makes it more shebang friendly
Sorry, I missed something here. How does one combine a zipfile with a shebang script?! Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
Oleg Broytmann wrote:
On Tue, Mar 04, 2008 at 10:35:42PM +1000, Nick Coghlan wrote:
not needing an explicit interpreter option makes it more shebang friendly
Sorry, I missed something here. How does one combine a zipfile with a shebang script?!
Very carefully ;)

As a more helpful answer, the ZIP spec allows additional data to be included in the file before the ZIP header. A more common way of using this is to add a zip file on to the end of an ELF executable while still using normal zipfile utilities to read the data in the zip file section and ignore the executable part.

It turns out you can actually use the same trick to prepend a shebang line like "#!/usr/bin/env python" and a newline character - the whole zip file is still a binary file, but that doesn't prevent the shell from reading that first line of text and handing the file over to Python for execution.

The fact that this actually works was also news to me when the issue I linked in my previous post was first brought to my attention :)

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org
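The shebang trick described above can be sketched as follows. This is a minimal illustration, not the approach taken in the linked issue: the file name `app.pyz`, the `python3` interpreter name in the shebang and the one-line `__main__.py` are all assumptions. To stay portable, the sketch runs the file through `sys.executable` instead of relying on the shell to honour the shebang.

```python
import io
import os
import stat
import subprocess
import sys
import tempfile
import zipfile

# Assumed shebang line; adjust the interpreter name for the target system.
SHEBANG = b"#!/usr/bin/env python3\n"


def make_zip_bytes():
    """Build a tiny archive in memory with __main__.py in its root."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("__main__.py", "print('ran from a shebang zip')\n")
    return buf.getvalue()


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "app.pyz")
    with open(path, "wb") as f:
        # The shebang text goes first, followed by the unmodified zip data.
        f.write(SHEBANG + make_zip_bytes())
    # Mark the file executable so a *nix shell could launch it directly.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)

    # Zip tools locate the archive from the end of the file,
    # so the prepended line does not stop them reading it.
    still_a_zip = zipfile.is_zipfile(path)
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()

    # Hand the file to the interpreter, as the shebang line would.
    output = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    ).stdout.strip()
    print(still_a_zip, names, output)
```

On a *nix system the file could then be started as `./app.pyz`, provided the interpreter named in the shebang is on the PATH.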
On Wed, Mar 05, 2008 at 12:14:04AM +1000, Nick Coghlan wrote:
As a more helpful answer, the ZIP spec allows additional data to be included in the file before the ZIP header. A more common way of using this is to add a zip file on to the end of an ELF executable while still using normal zipfile utilities to read the data in the zip file section and ignore the executable part.
It turns out you can actually use the same trick to prepend a shebang line like "#!/usr/bin/env python" and a newline character
That's what I thought, too.
- the whole zip file is still a binary file, but that doesn't prevent the shell from reading that first line of text and handing the file over to Python for execution.
Unix doesn't distinguish text and binary files. (-:
The fact that this actually works was also news to me when the issue I linked in my previous post was first brought to my attention :)
So it really works? Amazing! Thank you! Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
At 05:40 PM 3/4/2008 +0300, Oleg Broytmann wrote:
On Wed, Mar 05, 2008 at 12:14:04AM +1000, Nick Coghlan wrote:
As a more helpful answer, the ZIP spec allows additional data to be included in the file before the ZIP header. A more common way of using this is to add a zip file on to the end of an ELF executable while still using normal zipfile utilities to read the data in the zip file section and ignore the executable part.
It turns out you can actually use the same trick to prepend a shebang line like "#!/usr/bin/env python" and a newline character
That's what I thought, too.
- the whole zip file is still a binary file, but that doesn't prevent the shell from reading that first line of text and handing the file over to Python for execution.
Unix doesn't distinguish text and binary files. (-:
The fact that this actually works was also news to me when the issue I linked in my previous post was first brought to my attention :)
So it really works? Amazing!
Setuptools has been distributed this way for some time: http://pypi.python.org/pypi/setuptools#cygwin-mac-os-x-linux-other It actually contains an entire shell script prefix that launches Python and invokes an entry point inside the egg. With the new interpreter capability, this would've been a *lot* simpler to implement.
participants (11)
- A.M. Kuchling
- Adam Olsen
- Frank Wierzbicki
- Georg Brandl
- Greg Ewing
- Nick Coghlan
- Oleg Broytmann
- Paul Moore
- Phillip J. Eby
- Stephen J. Turnbull
- Steve Holden