[Catalog-sig] Rewrite PyPI for App Engine?

Tarek Ziadé ziade.tarek at gmail.com
Sat Jun 19 01:27:39 CEST 2010

On Fri, Jun 18, 2010 at 6:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> With all the reliability discussion, I thought I'd offer a kind of
> counterproposal, that we rewrite PyPI to use App Engine.
> Of course, this means writing code, etc., but I believe this is a reasonable
> goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
> such an implementation (assuming it was of good quality) that we could find
> people (probably not on this list) to write and maintain the code.  People
> have already rewritten PyPI a couple times, but no one knows what exactly to
> *do* with the rewrite so they haven't gone anywhere.  And PyPI is not a
> particularly complicated application.  I think we can set the bar high on
> the implementation quality and that people will meet it, so long as they
> know their effort won't be in vain.

Out of curiosity: have you ever worked with the current implementation?

I have a hard time understanding why some people say it's hard to work with;
I don't think it's a valid argument.

> Why App Engine?  The primary reason I'm proposing it is because it will be
> much easier to manage.  If it runs out of memory it won't bring down a
> machine.  If new people maintain the system it's easy to describe how to do
> deployments, for instance.  It's easy for people to install their own PyPI
> instances for development and to generate patches.  Hosted services can have
> downtimes of course, but unlike currently there are other people (the App
> Engine maintainers) who will resolve those problems.  There's still a class
> of bugs like badly indexed tables or weird locking issues that could bring
> PyPI down and "we" would have to fix it, and with a rewrite there's more of
> a risk of that, but... it'll just take some testing to make sure things are
> okay.
> In terms of cost, I expect we can get free hosting, and packages can be
> stored directly in the data store.  That doesn't preclude using a CDN like
> CloudFront, but that can be handled separately.  Also since the index just
> links to packages, packages can be incrementally uploaded to a CDN.

Even though I don't think it's a priority among our concerns (community
mirrors come first), I wouldn't mind having the main PyPI UI on GAE.

That said, if PyPI were to be ported to GAE, couldn't we reuse the
existing code instead of rewriting from scratch? We would just have
to rewrite the DB layer.
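
To illustrate what "just rewrite the DB layer" could mean, here is a rough
sketch, not the actual PyPI code. All class and method names below are
hypothetical. The idea is that the application talks to one small storage
interface, so a SQL backend could later be swapped for a GAE datastore
backend without touching the rest:

```python
# Hypothetical sketch: none of these names come from the actual PyPI code.
from abc import ABC, abstractmethod


class PackageStore(ABC):
    """Storage interface the rest of the application would talk to."""

    @abstractmethod
    def add_release(self, name, version, summary):
        ...

    @abstractmethod
    def list_releases(self, name):
        ...


class InMemoryStore(PackageStore):
    """Stand-in backend; a SQL or App Engine datastore backend would
    implement the same two methods."""

    def __init__(self):
        self._releases = {}

    def add_release(self, name, version, summary):
        # Store the summary under package name, then version.
        self._releases.setdefault(name, {})[version] = summary

    def list_releases(self, name):
        # Versions sorted as plain strings; unknown packages give [].
        return sorted(self._releases.get(name, {}))


store = InMemoryStore()
store.add_release("example-pkg", "0.1", "first release")
store.add_release("example-pkg", "0.2", "second release")
print(store.list_releases("example-pkg"))
```

How cleanly the existing code would split along such a boundary is an
open question; the sketch only shows the shape of the approach.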

> Besides a commitment to using the code (which I think is really important to
> motivate people), a scrubbed dump of the database would be really helpful
> for development.  I know we've passed around complete dumps to people, but
> it contains private information so we can't put it up publicly which creates
> a speed bump for developers.

Private information could easily be removed from those dumps.

But I don't think that would be so helpful, since you already have all the
.sql scripts to create your own DB. We could, however, add a script that
creates some sample data on top of those scripts.
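
A sample-data script could be as small as the sketch below. It is only an
illustration: the table and column names are made up and do not match the
real PyPI schema, and it uses an in-memory SQLite database in place of the
real one so that it runs on its own.

```python
# Hypothetical sketch: the table and column names are invented for
# illustration and do not match the real PyPI schema.
import sqlite3


def load_sample_data(conn, n_packages=3):
    """Create a toy schema and fill it with fake, non-private rows."""
    conn.execute(
        "CREATE TABLE packages (name TEXT PRIMARY KEY, owner_email TEXT)"
    )
    rows = [
        ("sample-pkg-%d" % i, "user%d@example.com" % i)
        for i in range(n_packages)
    ]
    conn.executemany("INSERT INTO packages VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_sample_data(conn)
names = [r[0] for r in conn.execute("SELECT name FROM packages ORDER BY name")]
print(count, names)
```

Because every row is generated, nothing in the result is private, so it
could be published freely.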

> Linkage...
> A buzz post where I asked about it:
> http://www.google.com/buzz/ianbicking/BRWDjsMCGWQ/I-like-the-original-proposal-move-PyPI-stuff-into
> A PyPI *mirror* written for App Engine:
> http://pypi.appspot.com/
> A PyPI implementation in Django (one is a fork of the other?),
> database-backed (would take some work to get it on App Engine):
> http://pypi.python.org/pypi/djangopypi/
> http://github.com/benliles/chishop
> --
> Ian Bicking  |  http://blog.ianbicking.org
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig

Tarek Ziadé | http://ziade.org
