[Catalog-sig] Thoughts on more detailed stats

Jim Fulton jim at zope.com
Tue Mar 22 11:33:27 CET 2011


On Tue, Mar 22, 2011 at 5:59 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> Hey,
>
> I find the current download hits to be quite artificial because there
> are some build systems out there that are fetching releases all day
> long for their work. There are local mirrors of course,

Not just local mirrors, but source releases that include things, download
caches, etc...

> but I am
> pretty sure projects like zc.buildout are downloaded most of the time
> by build scripts. And setuptools is downloaded mostly as a dependency
> of other projects.
>
> Those are valid stats of course, but I was wondering if we could
> provide more detail on why the package was downloaded, e.g. if we're
> able to distinguish automated downloads from other downloads.
>
> One way I was thinking of was to tell PyPI at download time if the
> download was a dependency fetch or a primary download
> (manual download or "pip install xxx")

I don't know why downloading something as part of a buildout would be any
different than doing a "pip install".  I almost never download anything except
with buildout.


> Another way would be to ask Continuous Integration systems to use a
> specific user agent marker.
>
> In the UI we could then make the distinction in the download hits between:
>
> 1/ downloads by the end users to install the project
> 2/ downloads by build tools.
> 3/ "indirect" downloads as dependencies
>
> This is still a bit vague in my head, but I think it would be valuable
> for people to have such details
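
As a rough sketch of the user agent idea quoted above, the three
categories could be tallied from a list of User-Agent strings like this.
The marker strings ("ci-build", "dependency") and the function names are
assumptions made up for illustration; they are not something PyPI or any
installer defines.

```python
from collections import Counter

# Hypothetical markers that tools would have to agree to send.
CI_MARKER = "ci-build"            # assumed marker for build/CI systems
DEPENDENCY_MARKER = "dependency"  # assumed marker for indirect fetches


def classify_download(user_agent):
    """Map a User-Agent string to one of the three proposed categories."""
    ua = user_agent.lower()
    if CI_MARKER in ua:
        return "build-tool"
    if DEPENDENCY_MARKER in ua:
        return "dependency"
    # Anything without a marker is counted as an end-user download.
    return "end-user"


def tally(user_agents):
    """Count downloads per category for an iterable of User-Agent strings."""
    return Counter(classify_download(ua) for ua in user_agents)
```

Anything that does not send a marker falls into the end-user bucket, so
the numbers would only be as good as the cooperation of the tools.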

I think it would help to ask what the goals of the statistics are.
The statistics are presumably used to answer some questions. What are
those questions?

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton

