[Catalog-sig] How about a dedicated web service mirror?
"Martin v. Löwis"
martin at v.loewis.de
Tue Apr 19 08:09:48 CEST 2011
> Thanks for this link but I'm not sure about what the entries mean.
> Using some entries from SqlAlchemy as an example:
> I guess the first column is the package identifier, the second is the
> file + ???, and the fourth column the download count? Am I close?
All correct; the third one is the user-agent that performed the download.
> Is there any formal documentation for this portion of PyPI?
Yes, see http://www.python.org/dev/peps/pep-0381/#statistics-page
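
As a rough sketch of reading one of these files: PEP 381 describes the statistics as CSV readable by Python's csv module, with a header line and the four fields package, filename, useragent and count. The function name below is made up for illustration, and it assumes the file is bzip2-compressed and UTF-8 encoded; adjust if the file you fetch differs.

```python
import bz2
import csv
import io
from collections import Counter


def downloads_per_package(raw_bytes):
    """Sum the download counts per package from one statistics file.

    Assumption: raw_bytes is the bzip2-compressed CSV content, with
    the columns package, filename, useragent, count.
    """
    text = bz2.decompress(raw_bytes).decode("utf-8")
    totals = Counter()
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 4:
            continue  # ignore blank or malformed lines
        package, filename, useragent, count = row
        if not count.isdigit():
            continue  # skips the header line, whose last field is not a number
        totals[package] += int(count)
    return totals
```

Counting per user-agent instead only requires keying the Counter on the third field rather than the first.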
> I know
> this is easier on PyPI's server but I'm hesitant to interact with an
> undocumented data source. If documentation doesn't exist for it, I'm
> happy enough to write the formal documentation once I understand it.
It's certainly an official interface, and this dataset is actually
also intended for applications like yours (the local-stats pages
are only intended for the mirroring infrastructure itself).
> Also, other tools are using the XMLRPC API to make large requests
> against PyPI. http://pypi.python.org/pypi/vanity and
> http://pypi.appspot.com come to mind.
Such applications are encouraged to use batched XML-RPC. As long
as they aren't causing load problems on the server, I'm fine with
the tools doing what they do - it's just that batching would
actually get them the results faster.
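
To illustrate what batching means here, a minimal sketch using MultiCall from Python 3's xmlrpc.client (xmlrpclib in Python 2), which packs many calls into one system.multicall request. The helper name is hypothetical, and the sketch assumes the server exposes system.multicall and a release_urls(name, version) method.

```python
import xmlrpc.client


def batched_release_urls(server, releases):
    """Fetch release_urls for many (name, version) pairs in one round trip.

    server is an xmlrpc.client.ServerProxy (or anything with the same
    interface); releases is an iterable of (name, version) tuples.
    """
    batch = xmlrpc.client.MultiCall(server)
    for name, version in releases:
        # Nothing is sent yet; the call is only queued.
        batch.release_urls(name, version)
    # Calling the batch sends a single system.multicall request and
    # yields the results in the order the calls were queued.
    return list(batch())
```

With a proxy such as xmlrpc.client.ServerProxy("http://pypi.python.org/pypi") (the endpoint URL is an assumption), a hundred lookups then cost one HTTP request instead of a hundred.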
However, pypi.appspot.com does *not* make heavy requests through
XML-RPC. It is the mirror b.pypi.python.org, and most certainly uses
the journal to request only recent changes.
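
A sketch of that journal-based approach, assuming the XML-RPC method changelog(since), which takes a UTC timestamp and returns entries whose first three fields are name, version and timestamp. The helper name is made up, and this is not necessarily how pypi.appspot.com does it.

```python
def changed_packages(server, last_sync):
    """Return (names, newest) for journal entries since last_sync.

    names is the set of package names that changed; newest is the
    timestamp to pass as last_sync on the next poll.
    """
    entries = server.changelog(last_sync)
    names = {entry[0] for entry in entries}
    # If nothing changed, keep the old timestamp for the next poll.
    newest = max((entry[2] for entry in entries), default=last_sync)
    return names, newest
```

Only the packages in names then need to be fetched again, rather than the whole index.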