[Catalog-sig] PyPI down again...

Mathieu Leduc-Hamel marrakis at gmail.com
Mon Jun 14 12:35:08 CEST 2010

To continue the discussion about a rewrite or a cleanup of the PyPI
codebase: I'm from the Montreal-Python user group, and I'd say that yes, at
first the current PyPI codebase seems very unclear and difficult to follow.

But it's not an impossible mission and we are currently in the process of:

- Adding functional tests. The test coverage is now around 40%.
- Once we reach more complete coverage, we want to replace the psycopg
API with SQLAlchemy.
- Replacing much of the manual manipulation of the metadata with a more
robust and straightforward approach (distutils2 might be the option there).
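
To give an idea of what I mean by a functional test, here is a minimal
sketch. It is not taken from the PyPI codebase: simple_index_app, its
package data, and the call_app helper are all hypothetical, and I'm assuming
the code under test can be driven as a WSGI callable.

```python
import unittest
from wsgiref.util import setup_testing_defaults


def simple_index_app(environ, start_response):
    """Hypothetical stand-in for a PyPI-like WSGI entry point."""
    # Made-up package data, for illustration only.
    packages = {"example-package": ["1.0", "1.1"]}
    name = environ["PATH_INFO"].strip("/").split("/")[-1]
    if name not in packages:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    body = "\n".join(packages[name]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call_app(app, path):
    """Drive a WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


class SimpleIndexTest(unittest.TestCase):
    def test_known_package(self):
        status, body = call_app(simple_index_app, "/simple/example-package/")
        self.assertEqual(status, "200 OK")
        self.assertIn(b"1.1", body)

    def test_unknown_package(self):
        status, _ = call_app(simple_index_app, "/simple/missing/")
        self.assertTrue(status.startswith("404"))
```

Tests written at this level exercise request handling from the outside, so
they should keep passing while the database layer underneath is swapped out.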

At first I was thinking about rewriting everything using the chishop project
(an implementation of PyPI using Django). But keeping control of the source
code and not depending on any framework is maybe a better idea.

Moreover, despite the frequent outages, PyPI is working today, so just
modernizing the codebase seems to be the best idea.

By the way, after a code review by Tarek, a very useful thing might be to
find a better way to handle and integrate contributions coming from the
community. Right now Tarek is responsible for making the link between our
effort and the work of Martin, but we don't have any official public mirror
of the source code or any roadmap.

On Mon, Jun 14, 2010 at 11:27 AM, M.-A. Lemburg <mal at egenix.com> wrote:

> Tarek Ziadé wrote:
> > On Sun, Jun 13, 2010 at 3:11 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> > ...
> >>
> >> We've had some private discussions about this, so I'm just
> >> going to summarize...
> >>
> >> The idea here is not to override the mirror PEP ideas,
> >> but to use the existing PyPI installation and put the
> >> content needed for the most widely distributed package tool
> >> (currently setuptools and zc.buildout) on a content
> >> delivery network (CDN) in order to have it highly available
> >> on a managed edge network.
> >
> > I think it overlaps a bit the PEP goal, which is to set up a network of
> > mirrors, and have them listed in the PyPI DNS so clients can switch from
> > one mirror to another (and even do geoloc!).
> >
> > Right now we already have "unofficial mirrors" and the idea of the PEP
> > would be to list them officially at PyPI and to have them collect the
> > stats so we can count download hits.
>
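
The client-side switching described above could look roughly like the sketch
below. This is only an illustration under my own assumptions:
fetch_from_mirrors and the mirror host names in the usage note are
hypothetical, and the PEP's DNS-based listing of mirrors is not modelled
here, the list is simply passed in.

```python
from urllib.error import URLError
from urllib.request import urlopen


def fetch_from_mirrors(path, mirrors, opener=urlopen, timeout=5):
    """Try each mirror in order; return (mirror, data) from the first that answers.

    `opener` defaults to urllib's urlopen and can be replaced in tests.
    """
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            with opener(url, timeout=timeout) as response:
                return base, response.read()
        except (URLError, OSError) as exc:
            # Remember the failure and fall through to the next mirror.
            errors.append((base, exc))
    raise RuntimeError("all mirrors failed: %r" % (errors,))
```

A tool would call it with something like
fetch_from_mirrors("/simple/foo/", ["http://mirror-a.example", "http://mirror-b.example"])
and only fail when every listed mirror is unreachable.
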
> Note that the CDN does not mirror the content of PyPI, it
> just takes care of delivering the requested data to the
> various edge servers and caching it there for a while.
> This is a different concept than that of a full mirror that
> doesn't work like a cache, but instead provides a fully
> functional standalone server.
>
> I still think that the concept of being able to mirror PyPI
> servers is a useful one.
>
> >> Amazon Cloudfront is such a CDN and has Python interfaces,
> >> hence the idea to use Cloudfront.
> >>
> >> I asked for volunteers, because I didn't know enough about
> >> Amazon Cloudfront to write up a proposal and don't have
> >> the cycles available to implement such a setup myself.
> >>
> >> In the meantime, I've done some research and now know
> >> enough to write a proposal for the PSF board to consider.
> >> If the board thinks it's a good idea, we'll need to
> >> pursue finding volunteers to implement it.
> >
> > Well maybe this is the best path to follow right now, as it will be done
> > faster, without having to interact with many people to set it up, so it's
> > a quick win.
> >
> > But it will probably kill the mirroring protocol idea from the PEP in
> > the process, which I think is superior in the long term since it provides
> > a standardized ground for the community to set up mirrors independently
> > from pypi.python.org.
>
> We'll have to see.
>
> Note that the CDN will only deal with the static data on PyPI,
> not the RPC or the web GUI access.
>
> Since static data is all that setuptools et al. currently use
> for fetching the data, we'll see an improved uptime for easy_install
> and esp. zc.buildout which by nature of their concepts rely on having
> a high availability of the PyPI static data resources.
>
> If, in the future, package tools start to rely on RPC for
> fetching data, the situation will shift towards needing full
> functional mirrors again.
>
> OTOH, we could also provide a snapshot copy of the database
> data in form of a SQLite database on the CDN for those tools
> to download and use locally... there are lots of things
> package tools could do :-)
>
> --
> Marc-Andre Lemburg
> eGenix.com
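
As an illustration of the SQLite snapshot idea in the quoted message, here is
a rough sketch of how a tool might query such a file locally. Everything in
it is an assumption on my part: no snapshot format has been defined, so the
releases table, its columns, and the sample rows are hypothetical, and
build_snapshot only stands in for the file a tool would download.

```python
import os
import sqlite3
import tempfile


def build_snapshot(path, releases):
    """Write a tiny snapshot database (stand-in for a downloaded file)."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS releases (name TEXT, version TEXT, url TEXT)"
        )
        conn.executemany("INSERT INTO releases VALUES (?, ?, ?)", releases)
    conn.close()


def find_releases(path, name):
    """Return (version, url) rows for a package, without any network access."""
    conn = sqlite3.connect(path)
    try:
        # Plain text ordering; real version sorting would need more care.
        return conn.execute(
            "SELECT version, url FROM releases WHERE name = ? ORDER BY version",
            (name,),
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = os.path.join(tmp, "snapshot.db")
        build_snapshot(
            snapshot,
            [
                ("example-package", "1.0", "http://example.invalid/example-package-1.0.tar.gz"),
                ("example-package", "1.1", "http://example.invalid/example-package-1.1.tar.gz"),
            ],
        )
        for version, url in find_releases(snapshot, "example-package"):
            print(version, url)
```

The appeal is that lookups keep working from the local file even while the
main server is unreachable, at the cost of the data being only as fresh as
the last downloaded snapshot.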