[Catalog-sig] Thoughts on more detailed stats

Tue Mar 22 10:59:02 CET 2011

Hey,

I find the current download counts quite artificial, because some
build systems out there fetch releases all day long as part of their
work. There are local mirrors of course, but I am pretty sure projects
like zc.buildout are downloaded most of the time by build scripts. And
setuptools is downloaded mostly as a dependency of other projects.

Those are valid stats of course, but I was wondering if we could
provide more details about why a package was downloaded, for example
whether we can distinguish automated downloads from other downloads.

One way I was thinking of would be to tell PyPI at download time
whether the download was done to fetch a dependency or was a primary
download (a manual download or "pip install xxx").

Another way would be to ask Continuous Integration systems to use a
specific user agent marker.
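
To make both ideas a bit more concrete, here is a rough client-side
sketch. Everything in it is an assumption for the sake of discussion:
the "X-Download-Reason" header, the "(ci)" user agent marker, the
"mytool/0.1" tool name and the example URL are all made up, and
nothing like this exists in PyPI today. The request is only built,
never sent.

```python
from urllib.request import Request

# Hypothetical header name, not something PyPI understands today.
REASON_HEADER = "X-Download-Reason"


def build_request(url, tool, reason=None, ci=False):
    """Build (but do not send) a download request carrying extra hints.

    reason: hypothetical hint, e.g. "primary" or "dependency".
    ci:     True when the caller is a Continuous Integration system.
    """
    agent = tool
    if ci:
        agent += " (ci)"  # hypothetical user agent marker
    headers = {"User-Agent": agent}
    if reason is not None:
        headers[REASON_HEADER] = reason
    return Request(url, headers=headers)


# A build system fetching a dependency (made-up URL and tool name).
req = build_request(
    "https://pypi.example/packages/foo-1.0.tar.gz",
    tool="mytool/0.1",
    reason="dependency",
    ci=True,
)
sent = {name.lower(): value for name, value in req.header_items()}
```

Installers would set the reason themselves, and CI systems would only
have to flip one switch, so end users would not need to do anything.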

In the UI we could then split the download hits into:

1/ downloads by end users to install the project
2/ downloads by build tools
3/ "indirect" downloads as dependencies
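
On the server side, the split above could be computed with something
like the sketch below. It assumes each hit is recorded with its user
agent string and an optional reason hint; the "(ci)" marker, the
"dependency" value and the sample hits are hypothetical and only
there to illustrate the three buckets.

```python
from collections import Counter


def classify(user_agent, reason=None):
    """Put one download hit into one of the three buckets."""
    if "(ci)" in user_agent:      # hypothetical CI marker
        return "build tool"
    if reason == "dependency":    # hypothetical reason hint
        return "dependency"
    # No hint at all: count it as a direct download by an end user.
    return "end user"


# Made-up sample hits: (user agent, reason hint or None).
hits = [
    ("mytool/0.1", "primary"),
    ("mytool/0.1", "dependency"),
    ("mytool/0.1 (ci)", "primary"),
    ("browser", None),
]

counts = Counter(classify(agent, reason) for agent, reason in hits)
```

Hits without any hint fall into the end user bucket here, which means
the numbers only get more accurate as more tools send the hints.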

This is still a bit vague in my head, but I think it would be valuable
for people to have such details.

Cheers
Tarek

-- 
Tarek Ziadé | http://ziade.org