Re: [Distutils] [Python-Dev] PEP 365 (Adding the pkg_resources module)

At 03:43 PM 3/18/2008 -0500, Guido van Rossum wrote:
Only very few people would care about writing a setup script that works with this bootstrap module; basically only package manager implementers.
That's true today, sure, but as soon as it is widely available, others are sure to want to use it too. I just want a bright-line distinction between what is and isn't bootstrappable, rather than a murky region of "maybe, if you're not doing anything too complicated".
There seems to be a misunderstanding about what I am proposing we do instead. The bootstrap installer should only be powerful enough to allow it to be used to install a real package manager like setuptools.
Which is why PEP 365 proposed only downloading an archive to a cache directory, and optionally running something from it. It explicitly disavows "installation" of anything, since the downloaded archive wouldn't have been added to sys.path except for the duration of the bootstrap process, and no scripts were to be installed. (Indeed, apart from the methods it would have used to locate the archive on PyPI, and to determine what to run from inside it, there was nothing particularly egg-specific about the proposed bootstrapping process.)
So, to fully egg-neutralize the bootstrapping approach, we need only know how to locate an appropriate archive, and how to determine what to run from it.
For the latter, we could use the already-in-2.6 convention of running __main__ from a zipfile or directory. (Too bad distutils source distributions have an extra directory name embedded in them, so one can't just execute them directly. Otherwise, we could've just let people drop in a __main__.py next to setup.py. OTOH, maybe it would be enough to use setuptools' algorithm for finding setup.py to locate __main__.py, and I'm fairly sure *that* can be briefly expressed in the PEP.)
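A minimal sketch of the "run __main__ from a zipfile" convention mentioned above, assuming a Python 3 interpreter. The helper names (build_runnable_zip, run_archive, demo) and the archive contents are made up for illustration; only the .pyboot.zip extension echoes a suggestion made later in this message, and nothing here is taken from PEP 365 itself.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def build_runnable_zip(path, main_source):
    """Write a zip archive whose top-level __main__.py holds main_source."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("__main__.py", main_source)


def run_archive(path, *args):
    """Run the archive with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, path, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def demo():
    """Build a throwaway archive in a temp directory and execute it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name; the interpreter detects the zip by content.
        archive = os.path.join(tmp, "demo.pyboot.zip")
        source = "import sys\nprint('bootstrapping', sys.argv[1:])\n"
        build_runnable_zip(archive, source)
        return run_archive(archive, "install")
```

Because the interpreter looks for __main__.py at the root of the archive, a source distribution with an extra leading directory would not run this way without first stripping that prefix.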
The other open question is a naming convention and version detection, so that the bootstrap tool can identify which of the files listed on PyPI is suitable for its use. (Both with regard to the version selection, and file type.) However, if PyPI were to grow support for designating the appropriate files and/or versions in some other way, we wouldn't need a naming convention as such.
Without one or the other, the bootstrap tool would have to grow a version parsing scheme of some type, and play guessing games with file extensions. (Which is one reason I limited PEP 365's scope to downloading eggs actually *uploaded* to PyPI, rather than arbitrary packages *linked* from PyPI.)
So, if I had to propose something right now, I would be inclined to propose:
* using setuptools' version parsing semantics for interpretation of alpha/beta/dev/etc. releases
* having a bdist_bootstrap format that's essentially a bdist_dumb .zip file with the internal path prefixes stripped off, making it an importable .zip with a different file extension. (Or maybe just .pyboot.zip?) The filename convention would use setuptools' canonicalization and escaping of names and version numbers, to allow unambiguous machine parsing of the filename. A __main__ module would have to be present for the archive to be run, as opposed to just being downloaded to a temporary directory.
* calling the bootstrap module 'bootstrap', as in 'python -m bootstrap projectname optionalversion'. The module would expose an API to allow it to be used programmatically as well as the command line, so that bootstrapped packages can use the bootstrap process to locate dependencies if they so desire. (Today's package management tools, at least, are all based on setuptools, so if it's not present they'll need to download that before beginning their own bootstrapping process.)
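To illustrate why the bootstrap tool needs some version parsing scheme at all, here is a toy sort key that places dev/alpha/beta/rc releases before the corresponding final release. This is only a sketch under stated assumptions: it is not setuptools' algorithm and not distutils.version, and the tag table, the regular expression, and the name toy_version_key are invented for the example.

```python
import re

# Assumed ordering of pre-release tags; "final" is a sentinel for plain
# releases so that "1.0" sorts after "1.0b2".
_TAG_RANK = {"dev": 0, "a": 1, "alpha": 1, "b": 2, "beta": 2,
             "c": 3, "rc": 3, "final": 4}

_VERSION_RE = re.compile(
    r"^(\d+(?:\.\d+)*)"                               # dotted release numbers
    r"(?:[.-]?(dev|alpha|beta|rc|a|b|c)[.-]?(\d*))?$"  # optional pre-release tag
)


def toy_version_key(text):
    """Return a sortable tuple for a simple version string."""
    match = _VERSION_RE.match(text.strip().lower())
    if match is None:
        raise ValueError("unsupported version string: %r" % text)
    release = [int(part) for part in match.group(1).split(".")]
    # Drop trailing zeros so that "1.0" and "1" compare equal.
    while release and release[-1] == 0:
        release.pop()
    tag = match.group(2) or "final"
    number = int(match.group(3) or 0)
    return (tuple(release), _TAG_RANK[tag], number)
```

A bootstrap tool could pick the newest candidate with max(versions, key=toy_version_key); a real scheme would also have to cope with post-releases and arbitrary tags, which this sketch rejects.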
Apart from keeping the PEP self-contained and short, is there anything in this that you think you would object to? (You may reserve the right, of course, to later not like something in the details of setuptools' version/filename rules, after I've put them into the PEP, or really, anything else. I'm just asking if there's anything that's obviously offensive at this point, before I spend time on a new PEP.)

On Tue, Mar 18, 2008 at 3:36 PM, Phillip J. Eby pje@telecommunity.com wrote:
At 03:43 PM 3/18/2008 -0500, Guido van Rossum wrote:
Only very few people would care about writing a setup script that works with this bootstrap module; basically only package manager implementers.
That's true today, sure, but as soon as it is widely available, others are sure to want to use it too. I just want a bright-line distinction between what is and isn't bootstrappable, rather than a murky region of "maybe, if you're not doing anything too complicated".
How about "anything that uses only distutils in its setup.py and doesn't have external dependencies"? See a (horribly incomplete) prototype I added as sandbox/bootstrap/bootstrap.py. I wrote this on the plane last night and have only tested it with file:/// URLs; it needs to add the ability to consult PyPI to find the download URL, and probably more. (PS: just now I also managed to successfully install setuptools from source by giving it the URL to the tar.gz file.)
There seems to be a misunderstanding about what I am proposing we do instead. The bootstrap installer should only be powerful enough to allow it to be used to install a real package manager like setuptools.
Which is why PEP 365 proposed only downloading an archive to a cache directory, and optionally running something from it. It explicitly disavows "installation" of anything, since the downloaded archive wouldn't have been added to sys.path except for the duration of the bootstrap process, and no scripts were to be installed. (Indeed, apart from the methods it would have used to locate the archive on PyPI, and to determine what to run from inside it, there was nothing particularly egg-specific about the proposed bootstrapping process.)
My bootstrap.py does exactly that: it downloads and unzips/untars a file and runs its setup.py with "install" as the only command line argument. (It currently looks for setup.py at the toplevel and one level deep in the unpacked archive.) Of course you will likely have to be root or administrator to run it effectively.
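The lookup described above (setup.py at the top level, or one level deep) can be sketched roughly as follows. This is not the code in sandbox/bootstrap/bootstrap.py; the function names are hypothetical, and the sketch assumes the archive has already been unpacked into unpacked_dir.

```python
import os
import subprocess
import sys


def find_setup_py(unpacked_dir):
    """Return the path to setup.py at the top level or one directory down."""
    candidate = os.path.join(unpacked_dir, "setup.py")
    if os.path.isfile(candidate):
        return candidate
    # Source distributions usually wrap everything in one "name-version"
    # directory, so check each immediate subdirectory as well.
    for name in sorted(os.listdir(unpacked_dir)):
        candidate = os.path.join(unpacked_dir, name, "setup.py")
        if os.path.isfile(candidate):
            return candidate
    return None


def install_command(setup_py):
    """Build the argv that runs setup.py with "install" as its only argument."""
    return [sys.executable, os.path.basename(setup_py), "install"]


def run_install(unpacked_dir):
    """Locate setup.py and run it from its own directory; return the exit code."""
    setup_py = find_setup_py(unpacked_dir)
    if setup_py is None:
        raise FileNotFoundError("no setup.py found in %r" % unpacked_dir)
    return subprocess.call(install_command(setup_py),
                           cwd=os.path.dirname(setup_py))
```

Running setup.py from its own directory matters because many setup scripts refer to files by relative path.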
So, to fully egg-neutralize the bootstrapping approach, we need only know how to locate an appropriate archive, and how to determine what to run from it.
Right.
For the latter, we could use the already-in-2.6 convention of running __main__ from a zipfile or directory. (Too bad distutils source distributions have an extra directory name embedded in them, so one can't just execute them directly. Otherwise, we could've just let people drop in a __main__.py next to setup.py. OTOH, maybe it would be enough to use setuptools' algorithm for finding setup.py to locate __main__.py, and I'm fairly sure *that* can be briefly expressed in the PEP.)
What's wrong with just running "setup.py install"? I'd rather continue existing standards / conventions. Of course, it won't work when setup.py requires setuptools; but "old style" setup.py files that use only distutils work great (I managed to install Django from a file:/// URL).
The other open question is a naming convention and version detection, so that the bootstrap tool can identify which of the files listed on PyPI is suitable for its use. (Both with regard to the version selection, and file type.) However, if PyPI were to grow support for designating the appropriate files and/or versions in some other way, we wouldn't need a naming convention as such.
I don't understand PyPI all that well; it seems poor design that the browsing via keywords is emphasized but there is no easy way to *search* for a keyword (the list of all packages is not emphasized enough on the main page -- it occurs in the side bar but not in the main text). I assume there's a programmatic API (XML-RPC?) but I haven't found it yet.
Without one or the other, the bootstrap tool would have to grow a version parsing scheme of some type, and play guessing games with file extensions. (Which is one reason I limited PEP 365's scope to downloading eggs actually *uploaded* to PyPI, rather than arbitrary packages *linked* from PyPI.)
There are two version parsers in distutils, referenced by PEP 345, the PyPI 1.2 metadata standard.
So, if I had to propose something right now, I would be inclined to propose:
- using setuptools' version parsing semantics for interpretation of alpha/beta/dev/etc. releases
Can you point me to the code for this? What is its advantage over distutils.version?
- having a bdist_bootstrap format that's essentially a bdist_dumb .zip file with the internal path prefixes stripped off, making it an importable .zip with a different file extension. (Or maybe just .pyboot.zip?) The filename convention would use setuptools' canonicalization and escaping of names and version numbers, to allow unambiguous machine parsing of the filename. A __main__ module would have to be present for the archive to be run, as opposed to just being downloaded to a temporary directory.
Hm. Why not just use the existing convention for running setup.py after unpacking? This works great in my experience, and has the advantage of having an easy fallback if you end up having to do this manually for whatever reason.
- calling the bootstrap module 'bootstrap', as in 'python -m bootstrap projectname optionalversion'. The module would expose an API to allow it to be used programmatically as well as the command line, so that bootstrapped packages can use the bootstrap process to locate dependencies if they so desire. (Today's package management tools, at least, are all based on setuptools, so if it's not present they'll need to download that before beginning their own bootstrapping process.)
This sounds like going beyond bootstrapping. My vision is that you use the bootstrap module (with the command line you suggest above) once to install setuptools or the alternate package manager of your choice, and then you can use easy_install (or whatever alternative) to install the rest.
Apart from keeping the PEP self-contained and short, is there anything in this that you think you would object to? (You may reserve the right, of course, to later not like something in the details of setuptools' version/filename rules, after I've put them into the PEP, or really, anything else. I'm just asking if there's anything that's obviously offensive at this point, before I spend time on a new PEP.)
I'd love it if you could write or point me to code that takes a package name and optional version and returns the URL for the source archive, and the type (in case it can't be guessed from the filename or the Content-type header).
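As a partial illustration of the "type" half of that request, the following sketch guesses an archive type from the URL's filename and falls back to the Content-type header. The suffix table, the content-type table, and the name guess_archive_type are assumptions for the example; they do not come from bootstrap.py or from PyPI.

```python
import posixpath
from urllib.parse import urlparse

# Assumed mappings; checked in order, so ".tar.gz" is tried before ".zip".
_SUFFIXES = [
    (".tar.gz", "tar.gz"),
    (".tgz", "tar.gz"),
    (".tar.bz2", "tar.bz2"),
    (".zip", "zip"),
]
_CONTENT_TYPES = {
    "application/zip": "zip",
    "application/x-tar": "tar",
}


def guess_archive_type(url, content_type=None):
    """Return an archive type string, or None if it cannot be guessed."""
    filename = posixpath.basename(urlparse(url).path).lower()
    for suffix, kind in _SUFFIXES:
        if filename.endswith(suffix):
            return kind
    if content_type:
        # Strip parameters such as "; charset=..." before the lookup.
        media_type = content_type.split(";", 1)[0].strip().lower()
        return _CONTENT_TYPES.get(media_type)
    return None
```

The filename is checked first because a server may label every download with a generic content type.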

I don't understand PyPI all that well; it seems poor design that the browsing via keywords is emphasized but there is no easy way to *search* for a keyword (the list of all packages is not emphasized enough on the main page -- it occurs in the side bar but not in the main text).
I don't understand. What is "browsing via keywords" and how is that emphasized? (once I know that, I can look into ways for searching for keywords)
I assume there's a programmatic API (XML-RPC?) but I haven't found it yet.
The recommended "programmatic" API is
http://pypi.python.org/simple/
Not sure what you were trying to achieve programmatically; "typically" people know what they want to install (e.g. "threadedcomments"), and then the tool goes directly to
http://pypi.python.org/simple/threadedcomments/
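A rough sketch of how a tool might use this index: build the per-project URL, then collect the links from the returned page. It assumes the page is plain HTML containing anchor tags; the helper names are hypothetical, and no network request is made here, so fetching the page is left to the caller.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Index URL as quoted in the message above.
SIMPLE_INDEX = "http://pypi.python.org/simple/"


def project_page_url(project, index=SIMPLE_INDEX):
    """Return the index page URL for a project name."""
    return urljoin(index, project + "/")


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return every link found in an already-downloaded index page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

Choosing among the collected links (by version and file type) is the part that still needs a naming convention or extra metadata.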
Regards, Martin

I was using the human interface at python.org/pypi. There are two prominent links at the top of the page: "Browse the tree of packages" and "Submit package information" followed by the 30 most recently changed packages. What I was looking for was the page for a specific package. The "Browse the tree of packages" link was no help. Finally I realized that in the side bar, in a small unobtrusive font, is a link to "List packages" which links to a list of *all* packages, in alphabetical order. I found my package there. I think repeating that link right below "browse the tree" would have been sufficient. But it would have been cool if there had been a search box (also in the start page) where I could type (part of) the name of the package and it would have given me the nearest matches.
On Wed, Mar 19, 2008 at 6:05 PM, "Martin v. Löwis" martin@v.loewis.de wrote:
I don't understand PyPI all that well; it seems poor design that the browsing via keywords is emphasized but there is no easy way to *search* for a keyword (the list of all packages is not emphasized enough on the main page -- it occurs in the side bar but not in the main text).
I don't understand. What is "browsing via keywords" and how is that emphasized? (once I know that, I can look into ways for searching for keywords)
I assume there's a programmatic API (XML-RPC?) but I haven't found it yet.
The recommended "programmatic" API is
http://pypi.python.org/simple/
Not sure what you were trying to achieve programmatically; "typically" people know what they want to install (e.g. "threadedcomments"), and then the tool goes directly to
http://pypi.python.org/simple/threadedcomments/
Regards, Martin

Guido van Rossum schrieb:
I was using the human interface at python.org/pypi. There are two prominent links at the top of the page: "Browse the tree of packages" and "Submit package information" followed by the 30 most recently changed packages.
Ah, ok. In PyPI parlance, these are "classifiers" (Trove classifiers, although the word "trove" means nothing to me), not keywords. They are different from keywords in the sense that they form a hierarchy.
I personally consider trove classifiers over-valued, but apparently, some people really love them (probably the ones who are more organized than I am). Developers continuously request addition of new classifiers; I don't have any statistics whether users actually use them to locate stuff.
What I was looking for was the page for a specific package. The "Browse the tree of packages" link was no help. Finally I realized that in the side bar, in a small unobtrusive font, is a link to "List packages" which links to a list of *all* packages, in alphabetical order. I found my package there. I think repeating that link right below "browse the tree" would have been sufficient.
I can't change that right now, but created
http://sourceforge.net/tracker/index.php?func=detail&aid=1921108&gro...
But it would have been cool if there had been a search box (also in the start page) where I could type (part of) the name of the package and it would have given me the nearest matches.
Did you try the search box in the top-right, and did it not work?
What search term did you enter, and what package did you expect to get?
Regards, Martin

although the word "trove" means nothing to me
http://www.askoxford.com/concise_oed/trove?view=uk
"a store of valuable or delightful things"
Bill

-On [20080320 15:29], "Martin v. Löwis" (martin@v.loewis.de) wrote:
(Trove classifiers, although the word "trove" means nothing to me)
Isn't that something lifted from SourceForge?
participants (5)
- "Martin v. Löwis"
- Bill Janssen
- Guido van Rossum
- Jeroen Ruigrok van der Werven
- Phillip J. Eby