[Catalog-sig] Rewrite PyPI for App Engine?

Ian Bicking ianb at colorstudy.com
Fri Jun 18 18:44:25 CEST 2010

With all the reliability discussion, I thought I'd offer a kind of
counterproposal, that we rewrite PyPI to use App Engine.

Of course, this means writing code, etc., but I believe this is a reasonable
goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
such an implementation (assuming it was of good quality) that we could find
people (probably not on this list) to write and maintain the code.  People
have already rewritten PyPI a couple of times, but no one knows exactly what
to *do* with those rewrites, so they haven't gone anywhere.  And PyPI is not a
particularly complicated application.  I think we can set the bar high on
the implementation quality and that people will meet it, so long as they
know their effort won't be in vain.

Why App Engine?  The primary reason I'm proposing it is that it will be
much easier to manage.  If it runs out of memory it won't bring down a
machine.  If new people maintain the system it's easy to describe how to do
deployments, for instance.  It's easy for people to install their own PyPI
instances for development and to generate patches.  Hosted services can have
downtime of course, but unlike now there would be other people (the App
Engine maintainers) to resolve those problems.  There's still a class
of bugs like badly indexed tables or weird locking issues that could bring
PyPI down and "we" would have to fix it, and with a rewrite there's more of
a risk of that, but... it'll just take some testing to make sure things are

In terms of cost, I expect we can get free hosting, and packages can be
stored directly in the data store.  That doesn't preclude using a CDN like
CloudFront, but that can be handled separately.  Also since the index just
links to packages, packages can be incrementally uploaded to a CDN.
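
To illustrate the incremental CDN idea, here is a minimal sketch in plain
Python (not App Engine code).  The base URLs, the function names, and the
"set of already-mirrored filenames" are all hypothetical; the point is only
that the index can link each file to the CDN once it has been uploaded there,
and to the origin otherwise.

```python
from html import escape

# Hypothetical settings: neither URL comes from the original message.
ORIGIN_BASE = "https://pypi.example.org/packages"
CDN_BASE = "https://cdn.example.net/packages"


def file_url(filename, on_cdn):
    """Return the CDN link if the file has been mirrored, else the origin link."""
    base = CDN_BASE if filename in on_cdn else ORIGIN_BASE
    return f"{base}/{filename}"


def render_index(project, filenames, on_cdn):
    """Render a minimal page of download links for one project."""
    links = "\n".join(
        f'<a href="{escape(file_url(name, on_cdn))}">{escape(name)}</a><br/>'
        for name in sorted(filenames)
    )
    return (
        f"<html><body><h1>Links for {escape(project)}</h1>\n"
        f"{links}\n</body></html>"
    )
```

Since only the link target changes, files could be moved to the CDN one at a
time without touching anything else in the index.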

Besides a commitment to using the code (which I think is really important to
motivate people), a scrubbed dump of the database would be really helpful
for development.  I know we've passed around complete dumps to people, but
they contain private information, so we can't put them up publicly, which
creates a speed bump for developers.
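
As a rough sketch of what "scrubbing" could mean, assuming a copy of the data
loaded into SQLite and a hypothetical `users` table with `name`, `password`
and `email` columns (the real PyPI schema and database may well differ):

```python
import sqlite3


def scrub_users(conn):
    """Blank out credentials and replace emails with placeholders.

    Assumes a hypothetical users(name, password, email) table; this is
    not the actual PyPI schema.
    """
    conn.execute(
        "UPDATE users "
        "SET password = '', "
        "    email = 'user' || rowid || '@example.invalid'"
    )
    conn.commit()
```

Anything else that counts as private would need the same treatment before a
dump could be posted publicly.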

A buzz post where I asked about it:

A PyPI *mirror* written for App Engine:

A PyPI implementation in Django (one is a fork of the other?),
database-backed (would take some work to get it on App Engine):

Ian Bicking  |  http://blog.ianbicking.org