[Catalog-sig] PyPI - Evolve our own or reuse existing package systems?

Bjørn Stabell bjorn at exoweb.net
Wed Aug 15 02:15:48 CEST 2007


(Since my email was a bit long and wide-ranging, I'm updating the
subject line when a response is more narrowly focused.)

On Aug 15, 2007, at 06:37, Paul Boddie wrote:
> Bjørn Stabell wrote:
[...]
> I've been moderately negative about evolving a parallel infrastructure
> to other package and dependency management systems in the past, and I'm
> not enthusiastic about things like CPAN or language-specific
> equivalents. The first thing most people using a GNU/Linux or *BSD
> distribution are likely to wonder is, "Where are the Python packages in
> my package selector?"
>
> There are exceptions, of course. Some people may be sufficiently
> indoctrinated in the ways of Python, which I doubt is the case for a
> lot of people looking for packages. Others may be working in restricted
> environments where system package management tools don't really help.
> And people coming from Perl might wonder where the CPAN equivalent is,
> but they should also remind themselves what the system provides - they
> have manpages for Perl, after all.
[...]
> I've read through the text that I've mercilessly cut from this
> response, and I admire the scope of this effort, but I do wonder
> whether we couldn't make use of existing projects (as others have
> noted), and not only at the Python-specific level, especially since the
> user interface to the "egg" tool seems to strongly resemble other
> established tools - as you seem to admit in this and later messages,
> Bjørn.
[...]
> I was thinking of re-using the Debian indexing strategy. It's very
> simple, perhaps almost quaintly so, but a lot of the problems revealed
> with the current strategies around PyPI (not exactly mitigated by
> bizarre tool-related constraints) could be solved by adopting existing
> well-worn techniques.
[...]
> If I recall correctly, the PEP concerned just "bailed" on the version
> numbering and dependency management issue, despite seeming to be
> inspired by Debian or RPM-style syntax.
[...]
> As I've said before, it's arguably best to work with whatever is
> already there, particularly because of the "interface" issue you
> mention with non-Python packages. I suppose the apparent lack of an
> open and widespread package/dependency management system on Windows
> (and some UNIX flavours) can be used as a justification to write
> something entirely new, but I imagine that only very specific tools
> need writing in order to make existing distribution mechanisms work
> with Windows - there's no need to duplicate existing work from end to
> end "just because".
[...]
> Agreed. And by adopting existing mechanisms, we can hopefully avoid
> having to reinvent their feature sets, too.
>
> P.S. Sorry if this sounds a bit negative, but I've been reading the
> archives of the catalog-sig for a while now, and it's a bit painful
> reading about how sensitive various projects are to downtime in PyPI,
> how various workarounds have been devised with accompanying whisper
> campaigns to tell people where unofficial mirrors are, all whilst the
> business of package distribution continues uninterrupted in numerous
> other communities.
>
> If I had a critical need to get Python packages directly from their
> authors to run on a Windows machine, for example, I'd want to know how
> to do so via a Debian package channel or something like that. This
> isn't original thought: I'm sure that Ximian Red Carpet and Red Hat
> Network address many related issues.

There seem to be two issues here:

1) Should Python have its own package management system (with
dependency handling etc.) in parallel with what's already on many
platforms (at least Linux and OS X)?  Anyone who has worked with two
parallel package management systems knows that dependencies become
hellish.

   * If you mix and match you often end up with two of everything.

   * It'll be incomplete, because you can't easily declare
dependencies on non-Python packages (see the sketch after this list).
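
To make that limitation concrete, here is a minimal, hypothetical
setup.py ("examplepkg" is a made-up name, and lxml is just a handy
example): setuptools' install_requires can only name other Python
distributions, so the C library a package really needs still has to
come from the platform's own package manager.

# Hypothetical setup.py, only to illustrate the point above.
from setuptools import setup, find_packages

setup(
    name="examplepkg",
    version="0.1",
    packages=find_packages(),
    # Only other Python distributions can be listed here.  The C
    # library that lxml itself depends on (libxml2) cannot be
    # expressed, so it still has to be installed by the system's
    # package manager.
    install_requires=["lxml"],
)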


2) If we agree Python should have a package management system, should
we build our own or repurpose an existing one?

   * I think it's a matter of pride and proof of concept to have one  
written in Python.  That doesn't mean we can't get ideas from others.

   * It's also not that hard to do.  The prototype I threw together
took one weekend + half a day, and consists of about 500 lines of new
code (a minimal sketch of the kind of client code involved follows
this list).  It could be refactored and made smaller, but even if a
complete version is ten times that size, it's still not a huge
undertaking.

   * With a Python version we could relatively easily innovate beyond
what traditional packaging systems do; ports and apt have pretty much
stagnated.  RubyGems seems to have some cool features, features that
probably wouldn't have happened if they were using ports or apt-get
(though then they could have piggybacked on innovations in those
tools, I guess).  If it works for them, why shouldn't it work for us?

   * It would have to be as portable as Python is; many packaging  
systems are by nature relatively platform-specific.

   * If we don't build our own, doesn't that mean we throw out eggs?

   * Packaging systems are useful to mega frameworks like Zope,
TurboGears, and Django (and slightly less so to projects you roll on
your own) for managing the distribution and installation of plugins
and add-ons.  Relying on platform-specific packaging systems for
these may not work that well.  (But I could be wrong about that.)
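
To give a feel for what those ~500 lines contain, here is a heavily
simplified sketch of one piece: a client asking PyPI's XML-RPC
interface where to download the newest release of a package.  This is
my own illustrative reconstruction, not code from the prototype; it is
written against the Python 2 / xmlrpclib of the time, the method names
package_releases and release_urls come from PyPI's XML-RPC interface,
and dependency resolution, downloading, and installation are all left
out.

# Illustrative sketch only - not the actual prototype code.
import xmlrpclib

PYPI = "http://pypi.python.org/pypi"

def latest_release_urls(package):
    """Return the download URLs for the newest release of `package`."""
    server = xmlrpclib.ServerProxy(PYPI)
    releases = server.package_releases(package)
    if not releases:
        raise LookupError("no releases found for %r" % package)
    version = releases[0]  # take the first entry (normally the newest)
    return [info["url"] for info in server.release_urls(package, version)]

if __name__ == "__main__":
    for url in latest_release_urls("TurboGears"):
        print url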


That said, it might be possible to do some kind of hybrid: PyPI could
be a "meta package" repository that can easily feed into
platform-specific packaging systems, and there could also be a
client-side "meta package manager" that calls upon the
platform-specific package manager to install stuff.
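
A minimal sketch of what such a client-side meta package manager might
look like.  The tool probing, the python-/py- naming conventions, and
the function names here are all assumptions on my part, not an
existing tool; a real version would need a proper mapping from PyPI
names to native package names.

# Sketch of the "meta package manager" idea: try a native package
# manager first, fall back to a Python-only install.
import subprocess

NATIVE_TOOLS = [
    # (command used to detect the tool, install command template)
    (["apt-get", "--version"], ["apt-get", "install", "python-%s"]),
    (["port", "version"],      ["port", "install", "py-%s"]),
]

def available(probe):
    """Return True if the probe command can be run at all."""
    try:
        subprocess.call(probe, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
        return True
    except OSError:
        return False

def install(package):
    """Install via the first native tool found, else via easy_install."""
    for probe, template in NATIVE_TOOLS:
        if available(probe):
            cmd = [part.replace("%s", package) for part in template]
            return subprocess.call(cmd)
    return subprocess.call(["easy_install", package])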

It looks like ports, for example, has targets for building packages
for other systems, e.g., pkg, mpkg, dmg, rpm, srpm, and dpkg.  So
maintaining package information in (or compatible with) ports could
make it easy to feed packages into other package systems.

   * Benefit: We're working with other package systems, just making  
it easier to get Python packages into them.

   * Drawback: They may not want to include all packages, or to
include them as quickly, or in the way, that we want.  (I.e., there
may still be packages you'd want that are only available on PyPI.)

   * Drawback: Some platforms don't have package systems at all.


Which brings me to: if we're just distributing source files, why don't
we use a source control system such as svn, bzr, or hg?  The package
developers have trunk, PyPI is a branch, the platform-specific package
maintainers have a branch, and what's installed onto your system is in
the end a branch as well (serially connected).  Some systems, like
Subversion, can also pull in externals, as I did with cliutils in the
egg package.  Just a thought.
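
As a rough illustration of the last link in that chain, installing
straight from a published branch could be as simple as the sketch
below.  The repository URL and layout are hypothetical; only "svn
export" and the usual setup.py invocation are assumed.

# Sketch: export a branch that PyPI (or a distro maintainer) publishes
# and install it in the usual way.  No dependency handling here.
import os
import subprocess
import tempfile

def install_from_branch(branch_url):
    workdir = tempfile.mkdtemp()
    checkout = os.path.join(workdir, "pkg")
    # "svn export" gives a clean tree without .svn metadata.
    subprocess.check_call(["svn", "export", branch_url, checkout])
    subprocess.check_call(["python", "setup.py", "install"], cwd=checkout)

# e.g. install_from_branch("http://svn.example.org/somepkg/branches/pypi")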


Rgds,
Bjorn

