[Distutils] Python people want CPAN and how the latter came about

Sridhar Ratnakumar sridharr at activestate.com
Wed Dec 23 22:19:07 CET 2009


On 12/23/2009 12:18 PM, "Martin v. Löwis" wrote:
>> One solution I can think of is this: make PyPI only do the job of PAUSE
>> as it does for CPAN; and implement a CPAN-like simple directory
>> structure to store packages; make PyPI use that as the package data
>> store
> I don't know what PAUSE is, but I think there is what you want at

PAUSE is the web management interface that allows one to upload/manage 
packages in CPAN. http://pause.perl.org/ -- similar to PyPI (minus the 
storage part).

>
> http://pypi.python.org/packages/
>
>> deprecate the XmlRpc interface and rely on simple index files -
>> such as http://www.cpan.org/modules/01modules.mtime.html - instead
>> (consequently rethink PEP 381). This is partially implemented for PyPM
>> at ActiveState and I'd be willing to contribute my time towards doing
>> this for PyPI (if at all there is an interest).
> I don't think I understand what it is that you want, so I don't know
> whether I'm interested.

What /packages/source/ lacks is:

1/ Missing packages (eg: Twisted is not there); which is why 
easy_install/pip had to resort to scraping project webpages to guess 
download links. In CPAN, almost all module authors upload their 
sources via PAUSE.

2/ No metadata: When only source tarballs are stored 
[pypi.python.org/packages/source/P/Pylons/], what is a reliable way to 
a) get the source for the latest version, b) get the source for a 
particular version? In CPAN [cpan.org/modules/by-module/AppConfig/ABW/], 
each tarball has a .meta file describing the module metadata (similar to 
PKG-INFO). I don't want XmlRpc, but just files/directories (note the 
simplicity in Steffen's post).
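
To illustrate what (a) and (b) could look like once per-tarball metadata 
files exist, here is a minimal sketch. It assumes each metadata file 
uses PKG-INFO-style "Key: value" headers; the function names are made up 
for this example, and the version ordering is deliberately naive (it 
compares numeric components only, so pre-release tags such as "rc1" are 
not ordered correctly).

```python
import re
from email.parser import Parser


def parse_meta(text):
    """Parse PKG-INFO-style headers into a small dict (hypothetical helper)."""
    msg = Parser().parsestr(text)
    return {"name": msg["Name"], "version": msg["Version"]}


def version_key(version):
    # Naive ordering: numeric components only. An assumption for this
    # sketch; real version schemes need more careful comparison.
    return tuple(int(part) for part in re.findall(r"\d+", version))


def latest(metas):
    """(a) Pick the entry with the highest version."""
    return max(metas, key=lambda m: version_key(m["version"]))


def find_version(metas, version):
    """(b) Pick the entry for one particular version, or None."""
    return next((m for m in metas if m["version"] == version), None)
```

With only the metadata files in hand, a tool never has to open a 
tarball or guess a version from its filename.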

The former is more of a community issue. Often Python package authors 
are not using `sdist upload` (whereas this seems to be the convention in 
the Perl world).

The latter is what is most relevant to PyPM (or any third-party 
service/tool).

Fixing 1 and 2 together would make it possible for anyone willing to 
write third-party PyPI functionality to simply rsync the entire PyPI 
store and begin implementing the desired features, like the *.cpan.org 
sites (eg: test.pypi.org, quality.pypi.org, etc.)

What this means is that PyPI has to serve the purpose of being a central 
package repository (like CPAN) by a) disallowing mere listings (without 
sources) and requiring sources to be stored on the server, b) storing 
the metadata along with the sources (so anyone processing it wouldn't 
have to extract the source and rely on a PKG-INFO file - which may or 
may not exist).
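
For reference, the extraction step that every consumer has to repeat 
today looks roughly like the sketch below. It uses only the standard 
tarfile module and assumes the usual sdist layout with PKG-INFO directly 
under the single top-level directory; the function names and the 
".pkg-info" sidecar suffix are made up for this example and are not an 
existing PyPI or PyPM convention.

```python
import tarfile


def extract_pkg_info(tarball_path):
    """Return the text of <top-dir>/PKG-INFO from an sdist, or None."""
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            parts = member.name.split("/")
            # Only accept PKG-INFO directly under the top-level directory.
            if len(parts) == 2 and parts[1] == "PKG-INFO" and member.isfile():
                return tar.extractfile(member).read().decode("utf-8", "replace")
    return None  # the tarball ships no PKG-INFO


def write_sidecar(tarball_path):
    """Store the metadata next to the tarball (hypothetical naming)."""
    text = extract_pkg_info(tarball_path)
    if text is None:
        return None
    sidecar_path = tarball_path + ".pkg-info"
    with open(sidecar_path, "w", encoding="utf-8") as f:
        f.write(text)
    return sidecar_path
```

If the server did this once at upload time, mirrors and tools could read 
the sidecar files directly.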

Tools would consequently use the new /packages/source/ store (with full 
metadata and all registered package sources) without having to resort 
to webpage-scraping hacks.

-srid

PS: Our internal mirror is similar to /packages/source/ except that it 
(a) also contains external packages (eg: Twisted), and (b) pre-extracts 
PKG-INFO and requires.txt out of the source tarballs. Every day, it uses 
PyPI's XmlRpc interface to re-download (using easy_install's scraping 
logic) the recently released packages.

