[Catalog-sig] How about a dedicated web service mirror?

"Martin v. Löwis" martin at v.loewis.de
Tue Apr 19 08:09:48 CEST 2011


> Thanks for this link but I'm not sure about what the entries mean.
> Using some entries from SqlAlchemy as an example:
> 
> SQLAlchemy,SQLAlchemy-0.6beta2.tar.gz,z3c.pypimirror/1.0.15.1,3
> SQLAlchemy,SQLAlchemy-0.6beta3.tar.gz,setuptools/0.6c11,1
> SQLAlchemy,SQLAlchemy-0.6beta3.tar.gz,setuptools/0.6c9,1
> SQLAlchemy,SQLAlchemy-0.6beta3.tar.gz,z3c.pypimirror/1.0.15.1,3
> 
> I guess the first column is the package identifier, the second is the
> file + ???, and the fourth column the download count? Am I close?

All correct; the third one is the user-agent that performed the download.
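
To illustrate the four-column layout (project, file name, user-agent, download count), here is a minimal Python sketch that sums the counts per file across user-agents. The sample rows are the ones quoted above. The function name is hypothetical, and the code assumes every non-empty row has exactly four comma-separated fields.

```python
import csv
from collections import Counter
from io import StringIO

# Rows copied from the quoted example: project, file, user-agent, count.
SAMPLE = """\
SQLAlchemy,SQLAlchemy-0.6beta2.tar.gz,z3c.pypimirror/1.0.15.1,3
SQLAlchemy,SQLAlchemy-0.6beta3.tar.gz,setuptools/0.6c11,1
SQLAlchemy,SQLAlchemy-0.6beta3.tar.gz,setuptools/0.6c9,1
SQLAlchemy,SQLAlchemy-0.6beta3.tar.gz,z3c.pypimirror/1.0.15.1,3
"""


def downloads_per_file(text):
    """Return a Counter mapping (project, filename) to total downloads."""
    totals = Counter()
    for row in csv.reader(StringIO(text)):
        if not row:  # skip blank lines
            continue
        project, filename, user_agent, count = row
        totals[(project, filename)] += int(count)
    return totals


if __name__ == "__main__":
    for (project, filename), total in sorted(downloads_per_file(SAMPLE).items()):
        print(project, filename, total)
```

Summing over the user-agent column gives one total per file. Grouping by user-agent instead would separate mirror traffic from installer traffic.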

> Is there any formal documentation for this portion of PyPI?

Yes, see http://www.python.org/dev/peps/pep-0381/#statistics-page

> I know
> this is easier on PyPI's server but I'm hesitant to interact with an
> undocumented data source. If documentation doesn't exist for it, I'm
> happy enough to write the formal documentation once I understand it.

It's certainly an official interface, and this dataset is actually
also intended for applications like yours (the local-stats pages
are only intended for the mirroring infrastructure itself).

> Also, other tools are using the XMLRPC API to make large requests
> against PyPI. http://pypi.python.org/pypi/vanity and
> http://pypi.appspot.com come to mind.

Such applications are encouraged to use batched XML-RPC. As long
as they aren't causing load problems on the server, I'm fine with
the tools doing what they do - it's just that batching would
actually get them the results faster.
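
As a rough sketch of what batching can look like from the client side, the Python standard library's xmlrpc.client.MultiCall queues several calls and sends them as a single system.multicall request. The example assumes the server supports system.multicall and exposes a release_urls(project, version) method. The helper name and the endpoint URL in the usage comment are illustrative assumptions, not something stated in this thread.

```python
import xmlrpc.client


def batched_release_urls(server, project, versions):
    """Fetch release_urls for several versions in one XML-RPC round trip.

    `server` is an xmlrpc.client.ServerProxy (or any object that offers
    a compatible system.multicall method).
    """
    versions = list(versions)
    multicall = xmlrpc.client.MultiCall(server)
    for version in versions:
        # Queues the call locally; nothing is sent yet.
        multicall.release_urls(project, version)
    # A single request carries every queued call; results come back in order.
    return dict(zip(versions, multicall()))


# Hypothetical usage (endpoint URL is an assumption):
#   server = xmlrpc.client.ServerProxy("http://pypi.python.org/pypi")
#   urls = batched_release_urls(server, "SQLAlchemy", ["0.6beta2", "0.6beta3"])
```

The gain comes from paying the HTTP round-trip cost once instead of once per call.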

However, pypi.appspot.com does *not* make heavy requests through
XML-RPC. It is the b.pypi.python.org mirror, and most certainly uses
the journal to request only recent changes.
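
For readers unfamiliar with journal-based polling, the following Python sketch shows the general idea: ask the server only for entries newer than the last sync, then refresh just the projects that appear. It assumes an XML-RPC changelog(since) method that takes a UTC timestamp in seconds and returns entries whose first field is the project name. The helper name is hypothetical, and this is not a description of how pypi.appspot.com is actually implemented.

```python
def projects_changed_since(server, since):
    """Return the set of project names with journal entries after `since`.

    `server` is an xmlrpc.client.ServerProxy-like object; `since` is a
    UTC timestamp in seconds (assumed interface).
    """
    changed = set()
    for entry in server.changelog(since):
        # Assumed entry layout: (name, version, timestamp, action, ...).
        changed.add(entry[0])
    return changed


# Hypothetical usage:
#   import xmlrpc.client
#   server = xmlrpc.client.ServerProxy("http://pypi.python.org/pypi")
#   for name in sorted(projects_changed_since(server, last_sync)):
#       ...  # re-fetch only this project
```

A mirror that works this way makes requests proportional to recent activity rather than to the total size of the index.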

Regards,
Martin

