[Catalog-sig] A first step at improving PyPI: the "egg" command
paul at boddie.org.uk
Wed Aug 15 00:37:57 CEST 2007
Bjørn Stabell wrote:
> Basically, the problems I would like to work on solving are:
> 1) Simplifying/enabling discovery of packages
> 2) Simplifying/enabling management of packages
> 3) Improving quality and usefulness of package index
I think we can all agree that these are noble objectives. :-)
> From a usability point of view I'd like to focus on the requirements
> for the Python newbie, someone that has just discovered Python, but
> is probably used to package management systems from Linux
> distributions, FreeBSD, and other dynamic languages like Perl and
> Ruby (these are also the systems I have experience with, so I'm
> pulling ideas from them).
I've been moderately negative about evolving a parallel infrastructure to
other package and dependency management systems in the past, and I'm not
enthusiastic about things like CPAN or language-specific equivalents. The
first thing most people using a GNU/Linux or *BSD distribution are likely to
wonder is, "Where are the Python packages in my package selector?"
There are exceptions, of course. Some people may be sufficiently indoctrinated
in the ways of Python, although I doubt that is true of many people looking
for packages. Others may be working in restricted environments where system
package management tools don't really help. And people coming from Perl might
wonder where the CPAN equivalent is, but they should also remind themselves
what the system provides - they have manpages for Perl, after all.
It's nice to see someone looking at existing tools, though.
> Ideally everything should be (following Steve Krug's "Don't Make Me
> Think" recommendations) self-evident, and if that's not possible, at
> least self-explanatory. Someone put in front of a keyboard without
> having read any docs should be able to find, install, manage, and
> perhaps even create Python packages. Better usability will of course
> benefit everyone, not just beginners. I'm frankly amazed at how
> people that have programmed Python for years don't really know or use
> PyPI. I'm convinced making more of the Python package system
> discoverable and easily accessible will greatly improve the adoption
> of Python, the number of Python packages, and the quality of these
There are many people who don't know about other parts of the python.org
infrastructure besides PyPI, notably the Wiki. However, you have to take into
account communities which are not centred on python.org.
I've read through the text that I've mercilessly cut from this response, and I
admire the scope of this effort, but I do wonder whether we couldn't make use
of existing projects (as others have noted), and not only at the
Python-specific level, especially since the user interface to the "egg" tool
seems to strongly resemble other established tools - as you seem to admit in
this and later messages, Bjørn.
> PYPI IMPROVEMENT SUGGESTIONS
> While doing the application I discovered one important missing
> feature: PyPI doesn't offer a way to programmatically bulk-download
> information about all eggs, as is customary for many other packaging
> systems. This means "egg sync" will have to fetch the information
> for each package individually. I think it wouldn't be hard to offer
> a compressed XML file with all of the package information, suitable
> for download.
I was thinking of re-using the Debian indexing strategy. It's very simple,
perhaps almost quaintly so, but a lot of the problems revealed with the
current strategies around PyPI (not exactly mitigated by bizarre tool-related
constraints) could be solved by adopting existing well-worn techniques.
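
To make the Debian-style idea concrete, here is a minimal sketch of reading a
flat index made of "Field: value" stanzas separated by blank lines, which is
the general shape of a Debian "Packages" file. The package names, versions,
and paths in SAMPLE_INDEX are invented for illustration, and parse_index is a
hypothetical helper, not part of any existing tool.

```python
# Hypothetical index text in the style of a Debian "Packages" file.
# The package names, versions, and paths are invented for illustration.
SAMPLE_INDEX = """\
Package: example-foo
Version: 1.2.0
Depends: example-bar (>= 0.5)
Filename: pool/e/example-foo/example-foo_1.2.0.tar.gz
Description: An example package
 A continuation line starts with whitespace and
 belongs to the field above it.

Package: example-bar
Version: 0.6
Filename: pool/e/example-bar/example-bar_0.6.tar.gz
Description: Another example package
"""


def parse_index(text):
    """Split the index into stanzas and return a list of field dicts."""
    packages = []
    fields = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if fields:
                packages.append(fields)
            fields, last_key = {}, None
        elif line[0] in " \t" and last_key is not None:
            # Continuation line: append to the previous field's value.
            fields[last_key] += "\n" + line.strip()
        else:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            last_key = key.strip()
            fields[last_key] = value.strip()
    if fields:
        # The last stanza may not be followed by a blank line.
        packages.append(fields)
    return packages


# Build a name -> fields lookup, as a client might after one bulk download.
index = {p["Package"]: p for p in parse_index(SAMPLE_INDEX)}
print(sorted(index))
print(index["example-foo"]["Depends"])
```

A single file of this kind can be compressed, mirrored, and fetched in one
request, which is the property the bulk-download suggestion above is after.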
> There's a lot of opportunity in improving the consistency and
> usefulness of package metainformation. Once you have it all sync'ed
> to a local SQLite database and start snooping around, it'll be pretty
> obvious; very few packages use the dependencies etc. (In fact, I
> think the dependencies/obsoletes definitions are overengineered; we
> could get by with just a simple package >= version number).
If I recall correctly, the PEP concerned just "bailed" on the version
numbering and dependency management issue, despite seeming to be inspired by
Debian or RPM-style syntax.
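
For illustration, the "simple package >= version number" form mentioned above
needs very little code to check. The following is only a sketch under stated
assumptions: the requirement syntax is assumed to be "name >= version",
versions are assumed to be purely numeric and dot-separated, and the function
names and the example package names are hypothetical.

```python
import re

# Assumed requirement syntax: "<name> >= <dotted numeric version>".
REQUIREMENT = re.compile(
    r"^\s*([A-Za-z0-9_.-]+)\s*>=\s*([0-9]+(?:\.[0-9]+)*)\s*$"
)


def version_key(version):
    """Turn "0.10" into (0, 10) so that comparison is numeric, not textual."""
    return tuple(int(part) for part in version.split("."))


def parse_requirement(text):
    """Return (name, minimum_version_tuple) or raise ValueError."""
    match = REQUIREMENT.match(text)
    if match is None:
        raise ValueError("unsupported requirement: %r" % text)
    return match.group(1), version_key(match.group(2))


def is_satisfied(requirement, installed):
    """Check a requirement against a {name: version_string} mapping."""
    name, minimum = parse_requirement(requirement)
    found = installed.get(name)
    return found is not None and version_key(found) >= minimum


# Invented example data: one installed package.
installed = {"example-bar": "0.10"}
print(is_satisfied("example-bar >= 0.5", installed))
print(is_satisfied("example-bar >= 1.0", installed))
print(is_satisfied("example-baz >= 0.1", installed))
```

This deliberately ignores everything that makes real version schemes hard
(pre-releases, epochs, alphabetic suffixes, and "1" versus "1.0", which
compare as different tuples here), which is presumably where the difficulty
of settling on a single scheme comes from.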
> Many people use other platform-specific packaging systems to manage
> Python packages, probably both because this gives dependencies to
> other non-Python packages, but also because PyPI hasn't been very
> useful or easy to use. It may even be asked what the role of PyPI is
> since it's never going to replace platform-specific packaging
> systems; then should it support them? How? In any case, installing
> Python packages from different packaging systems would result in
> problems, and currently "egg" can't find Python packages installed
> using other systems. ("Yolk" has some support for discovering Python
> packages installed using Gentoo.)
As I've said before, it's arguably best to work with whatever is already
there, particularly because of the "interface" issue you mention with
non-Python packages. I suppose the apparent lack of an open and widespread
package/dependency management system on Windows (and some UNIX flavours) can
be used as a justification to write something entirely new, but I imagine
that only very specific tools need writing in order to make existing
distribution mechanisms work with Windows - there's no need to duplicate
existing work from end to end "just because".
> Optional: These days XMLRPC (and the WS-Deathstar) seems to be losing
> steam to REST, so I think we'd gain a lot of "hackability" by
> enabling a REST interface for accessing packages.
> Eventually we probably need to enforce package signing.
Agreed. And by adopting existing mechanisms, we can hopefully avoid having to
reinvent their feature sets, too.
P.S. Sorry if this sounds a bit negative, but I've been reading the archives
of the catalog-sig for a while now, and it's a bit painful reading about how
sensitive various projects are to downtime in PyPI, how various workarounds
have been devised with accompanying whisper campaigns to tell people where
unofficial mirrors are, all whilst the business of package distribution
continues uninterrupted in numerous other communities.
If I had a critical need to get Python packages directly from their authors to
run on a Windows machine, for example, I'd want to know how to do so via a
Debian package channel or something like that. This isn't original thought:
I'm sure that Ximian Red Carpet and Red Hat Network address many related