[Distutils] Python people want CPAN and how the latter came about

"Martin v. Löwis" martin at v.loewis.de
Thu Dec 24 12:00:05 CET 2009

> Some reasons to have PyPI host packages have already been mentioned in
> this thread: it makes mirroring easier, and it makes it easier for
> individuals to build new services (web sites primarily) that present new
> interfaces to the Python package collection.  Mirroring for its own sake
> is of some use, but being able to grab the entire Python package repository
> easily from a single source is valuable for the second goal, that of
> furnishing the foundation ("shoulders of giants" and all that) for those
> with vision (and round tuits) to take the next step.

That is already possible today, even without everybody uploading all
files. It isn't easy *per se*, since it takes a lot of code. However,
that code has already been written, and using it is fairly easy.

> If I wanted to host a site that (e.g.) indexed Python modules from PyPI
> by module (not package) name, and extracted and provided the
> documentation in HTML format, from what I've been reading I'd have to
> build a scraper or XMLRPC tool to walk PyPI, and then for each package,
> download it from another site (that may not have the uptime or
> scalability of PyPI), a nontrivial burden on aspiring visionaries that
> just want to build an addition and then go have a beer and discuss
> further improvements.

Not at all. You would just pick one of the ten or so packages that
already do precisely that, and use it.

> (As a point of practical interest, what _would_ be the most efficient
> way to download the entire set of Python modules listed on PyPI? A
> search comes up with z3c.pypimirror,
> http://pypi.python.org/pypi/z3c.pypimirror; is this the standard tool?)

There are a number of other mirroring tools, such as EggBasket and
collective.eggproxy. For mirroring the whole index, pypimirror is
probably the best starting point.
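
To give an idea of what such a tool does when it walks the index, here
is a minimal sketch. It is not the code of any of the packages named
above. It assumes the index's XML-RPC interface offers list_packages,
package_releases and release_urls; the function name iter_release_files
and the endpoint URL in the usage note are illustrative assumptions, and
it is written against Python 3's xmlrpc.client (xmlrpclib on Python 2).

```python
import xmlrpc.client


def iter_release_files(client, package_names=None):
    """Yield (package, version, url) for each file the index lists.

    `client` is any object exposing list_packages(), package_releases(name)
    and release_urls(name, version), for example an XML-RPC ServerProxy.
    The function name and signature are hypothetical, for illustration only.
    """
    if package_names is None:
        # Assumed XML-RPC method: returns the names of all registered packages.
        package_names = client.list_packages()
    for name in package_names:
        # Assumed XML-RPC method: returns the version strings of one package.
        for version in client.package_releases(name):
            # Assumed XML-RPC method: returns one dict per listed file,
            # each carrying at least a "url" key.
            for info in client.release_urls(name, version):
                yield name, version, info["url"]
```

Usage would look roughly like
client = xmlrpc.client.ServerProxy("http://pypi.python.org/pypi")
followed by a loop over iter_release_files(client), downloading each
URL. As far as I know, release_urls only reports files uploaded to the
index itself, so packages hosted elsewhere still need the link-following
logic that the existing mirroring tools already implement.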

