[Catalog-sig] Thoughts on more detailed stats

Jim Fulton jim at zope.com
Tue Mar 22 11:33:27 CET 2011

On Tue, Mar 22, 2011 at 5:59 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> Hey,
> I find the actual download hits to be quite artificial because there
> are some build systems out there that are fetching releases all day
> long for their work. There are local mirrors of course,

Not just local mirrors, but source releases that include things, download
caches, etc.

> but I am
> pretty sure projects like zc.buildout are downloaded most of the time
> by build scripts. And setuptools is downloaded mostly as a dependency
> of other projects.
> Those are valid stats of course, but I was wondering if we could
> provide more details on why the package was downloaded, e.g. if we're
> able to distinguish automated downloads from other downloads.
> One way I was thinking of was to tell PyPI at download time if the
> download was done as a dependency fetch or was a primary download
> (manual download or "pip install xxx").

I don't know why downloading something as part of a buildout would be any
different from doing a "pip install".  I almost never download anything except
with buildout.

> Another way would be to ask Continuous Integration systems to use a
> specific user agent marker.
> In the UI we could then make the distinction in the download hits between:
> 1/ downloads by the end users to install the project
> 2/ downloads by build tools
> 3/ "indirect" downloads as dependencies
> This is still a bit vague in my head, but I think it would be valuable
> for people to have such details.
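
To make the three categories in the quoted proposal concrete, here is a rough sketch of how download hits might be bucketed. It is only an illustration under stated assumptions: the marker strings in BUILD_TOOL_MARKERS and the is_dependency flag are hypothetical, since no such user-agent convention or download-time signal is defined in this thread.

```python
from collections import Counter

# Hypothetical substrings that a CI system might add to its User-Agent.
# None of these are agreed conventions; they only illustrate the idea.
BUILD_TOOL_MARKERS = ("ci-bot", "buildbot")


def classify_download(user_agent, is_dependency=False):
    """Return one of 'build-tool', 'indirect', or 'end-user'.

    `is_dependency` stands in for the proposed (hypothetical) signal that
    a client would send when fetching a package as a dependency.
    """
    agent = (user_agent or "").lower()
    if any(marker in agent for marker in BUILD_TOOL_MARKERS):
        return "build-tool"
    if is_dependency:
        return "indirect"
    return "end-user"


def summarize(hits):
    """Count download hits per category.

    `hits` is an iterable of (user_agent, is_dependency) pairs.
    """
    return Counter(classify_download(ua, dep) for ua, dep in hits)
```

A build-tool marker takes precedence over the dependency flag here, which is one possible choice; whether that ordering is right depends on which questions the statistics are meant to answer.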

I think it would help to ask what the goals of the statistics are.
The statistics are presumably used to answer some questions. What are
those questions?


Jim Fulton
