[Catalog-sig] PyPI improvements
Richard Jones
richardjones at optushome.com.au
Tue Jun 15 23:30:40 EDT 2004
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Wednesday 16 Jun 2004 11:53, Ian Bicking wrote:
> Howdy. I just recently posted some ideas for PyPI
> (http://blog.colorstudy.com/ianb/weblog/2004/06/15.html#P123)
I commented there, but I might repeat some of my comments here too where
appropriate.
> 1. Express relationships between packages. These are relationships
> like alternative-implementation, fork, part-of, recommends, requires,
> etc. At the moment I'm thinking purely about displaying this
> information, not any fancy distutils magic installation of
> dependencies.
There have been a number of proposals, and I believe some code, towards
implementing this kind of meta-data capture.
The two extensions to distutils dealing with this issue that I know of are
PIMP (/PackMan) and the ZPKG tools:
http://undefined.org/python/pimp/
http://www.python.org/packman/
(couldn't find a page giving the technical details of PIMP)
http://zope.org/Members/fdrake/zpkgtools/
(this page has a good list of links to prior discussions / proposals)
Various proposals have also been made on this list. I have no idea how related
those projects are. It would be a shame to develop *another* system.
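
For the display-only case, a minimal sketch of how such relationships might be
recorded is below. The relationship names come from the proposal above; the
class names, the helper and the package names are hypothetical, and none of
this reflects the PIMP or zpkgtools formats.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # Relationship kinds named in the proposal quoted above.
    ALTERNATIVE_IMPLEMENTATION = "alternative-implementation"
    FORK = "fork"
    PART_OF = "part-of"
    RECOMMENDS = "recommends"
    REQUIRES = "requires"


@dataclass(frozen=True)
class PackageLink:
    source: str          # package declaring the relationship
    relation: Relation   # kind of relationship
    target: str          # package the relationship points at


def links_for(package, links):
    """Return display strings for every link declared by `package`."""
    return [
        f"{link.relation.value}: {link.target}"
        for link in links
        if link.source == package
    ]


# Hypothetical package names, for illustration only.
LINKS = [
    PackageLink("example-app", Relation.REQUIRES, "example-lib"),
    PackageLink("example-app", Relation.RECOMMENDS, "example-extras"),
    PackageLink("example-lib-ng", Relation.FORK, "example-lib"),
]

print(links_for("example-app", LINKS))
```

Nothing here resolves or installs dependencies; it only stores links so that a
package page could list them.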
> 2. Cache packages. I.e., download a copy of the package, and if the
> package disappears then we have a backup.
The disappearance of packages is a concern. An archive network would solve
this issue, but it requires both organisation and support from hosts. I'm
pretty sure the current python.org machine is not suitable for storing
packages.
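
Setting the hosting question aside, the caching step itself is small. A rough
sketch, assuming the package bytes have already been downloaded; the function
name and the file naming scheme are hypothetical, not anything PyPI does today:

```python
import hashlib
from pathlib import Path


def cache_package(name, data, cache_dir):
    """Store downloaded package bytes under cache_dir and return the path.

    The file name includes part of a SHA-256 digest of the content, so a
    changed upstream file is stored alongside the older copy rather than
    overwriting it.
    """
    digest = hashlib.sha256(data).hexdigest()
    target = Path(cache_dir) / f"{digest[:16]}-{name}"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target
```

The bytes would come from wherever the package's download URL points, for
example via urllib.request.urlopen(url).read(). Storage and bandwidth for the
cache are the real cost, which is the hosting concern raised above.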
> The other thing that might be useful is some improved categorization of
> code. The Trove categories are... well, they are stupid. No fault of
> anyone here. CPAN's much more coarsely-grained categories are much
> better, in my opinion (Acme, AI, Algorithm, Apache, AppConfig, Archive,
> Array, and so on: http://www.cpan.org/modules/by-module)
The current Trove list may be extended - I simply drew on the two best-known
lists: SourceForge and Freshmeat.
What does the "Acme" category hold? :)
> But even more coarsely-grained than that, there are classes of package.
> Right now we have libraries and applications.
PyPI doesn't make this distinction - though I believe it is a useful one.
> I'd like to add modules -- though the name is vague, I'm thinking of
> code on the sophisticated end of the Python Cookbook entries. Small,
> reusable, and not worth distutilifying
This sounds like a good idea, but raises a couple of issues:
1. Distutils isn't involved, but that's OK since PyPI allows TTW
(through-the-web) entry of package meta-data.
2. PyPI currently makes no assumptions about what the download_url
points to. Would you advocate using the download_url for locating
the module source?
As I said in response to your weblog entry:
"PyPI is intended to be an index of metadata that is generated by distutils.
I'm not sure I'm comfortable extending that scope to include actual code
fragments. It would confuse the meta-data schema and user interfaces
considerably."
> When you're looking for code, each of these is quite different from the
> others -- for any search, you will probably be interested in any of
> these (a library to use, or a module or application to borrow from).
Yep. And note that some entries will span two (or all?) categories - Roundup,
for example, is both a library and an application.
> Right now we're neither here nor there, as people don't think to add
> applications to PyPI, and the trove categories are inappropriate for
> libraries.
I don't believe the categories as they stand are *that* useless!
> On top of this is the infrastructure issue, which probably also has to
> be dealt with before moving forward much (i.e., SQLite and CGI).
> Concurrent updates to a SQLite database from multiple processes scares
> the crap out of me. But it doesn't look like that should be too hard
> to fix.
As I said in response to your weblog entry:
"Finally, PyPI is bordering on being too large for the technologies it's built
on; sqlite will need to be replaced by postgresql some time soon and the
cgi.py-based web ui scales very poorly. Development such as you're proposing
would push those technologies over the edge :)"
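
On the concurrent-update worry specifically, one common mitigation while
staying on SQLite is to take the write lock at the start of the transaction
and let other writers wait instead of failing. A sketch using Python's sqlite3
module; the table, its columns and the function name are hypothetical and are
not PyPI's actual schema:

```python
import sqlite3


def record_release(db_path, name, version):
    """Insert one release row, taking the write lock up front."""
    # timeout: seconds to wait on a lock held by another process before
    # sqlite3.OperationalError is raised. isolation_level=None leaves
    # transaction control to the explicit BEGIN/COMMIT statements below.
    conn = sqlite3.connect(db_path, timeout=30.0, isolation_level=None)
    try:
        # BEGIN IMMEDIATE acquires the write lock now, so two writers
        # cannot both start a transaction and then collide at commit.
        conn.execute("BEGIN IMMEDIATE")
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS releases "
                "(name TEXT, version TEXT, PRIMARY KEY (name, version))"
            )
            conn.execute(
                "INSERT OR REPLACE INTO releases (name, version) VALUES (?, ?)",
                (name, version),
            )
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    finally:
        conn.close()
```

This serialises writers rather than making them concurrent, so it addresses
the safety concern but not the scaling one.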
On a separate topic, I believe it's pretty important that a document be
written that captures your intentions. A lot of ideas have floated around on
this list over the years - only to be subsequently forgotten because they're
lost in the list archive. Yes, I'm suggesting writing a PEP about it. That
way there's a single place someone can go to see the content and status of
the proposal.
Richard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD8DBQFAz77grGisBEHG6TARAvUKAJ9Oh4oNtRzSLYmchYWwBdG2uYW2UQCdGHTU
ZIFY1pyM9iM+PM5iLTFOa3w=
=8/Tl
-----END PGP SIGNATURE-----