[Catalog-sig] Rewrite PyPI for App Engine?

Michael Crute mcrute at gmail.com
Fri Jun 18 21:11:29 CEST 2010

On Fri, Jun 18, 2010 at 12:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> With all the reliability discussion, I thought I'd offer a kind of
> counterproposal, that we rewrite PyPI to use App Engine.
>
> Of course, this means writing code, etc., but I believe this is a reasonable
> goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
> such an implementation (assuming it was of good quality) that we could find
> people (probably not on this list) to write and maintain the code.  People
> have already rewritten PyPI a couple of times, but no one knows what exactly to
> *do* with the rewrites, so they haven't gone anywhere.  And PyPI is not a
> particularly complicated application.  I think we can set the bar high on
> the implementation quality and that people will meet it, so long as they
> know their effort won't be in vain.
>
> Why App Engine?  The primary reason I'm proposing it is because it will be
> much easier to manage.  If it runs out of memory it won't bring down a
> machine.  If new people maintain the system it's easy to describe how to do
> deployments, for instance.  It's easy for people to install their own PyPI
> instances for development and to generate patches.  Hosted services can have
> downtimes of course, but unlike now there are other people (the App
> Engine maintainers) who will resolve those problems.  There's still a class
> of bugs like badly indexed tables or weird locking issues that could bring
> PyPI down and "we" would have to fix it, and with a rewrite there's more of
> a risk of that, but... it'll just take some testing to make sure things are
> okay.
>
> In terms of cost, I expect we can get free hosting, and packages can be
> stored directly in the data store.  That doesn't preclude using a CDN like
> CloudFront, but that can be handled separately.  Also since the index just
> links to packages, packages can be incrementally uploaded to a CDN.
>
> Besides a commitment to using the code (which I think is really important to
> motivate people), a scrubbed dump of the database would be really helpful
> for development.  I know we've passed around complete dumps to people, but
> it contains private information so we can't put it up publicly which creates
> a speed bump for developers.

I would very much like to see PyPI start using chishop. I've been
working to implement the complete set of features that PyPI supports
(including the mirroring PEP) for use inside the company I work
for. The code is in reasonably good shape, and I would love to see it
become the official implementation of PyPI. Though I haven't tested it,
I don't see any reason that it wouldn't run on App Engine with no
additional work.

Michael E. Crute

It is a mistake to think you can solve any major problem just with
potatoes. --Douglas Adams
