[Catalog-sig] Thoughts on more detailed stats

Fred Drake fdrake at acm.org
Sun Mar 27 17:01:40 CEST 2011

2011/3/27 Alexis Métaireau <alexis at notmyidea.org>:
> Having a user agent defined in the clients connecting PyPI could also allow
> to make statistics on the usage of such tools (xx% of all the downloads on
> pypi.python.org are made by buildout, by the distutils2 index crawler, by
> pip etc.)

I don't object to this, but I'm not sure it tells you anything useful.
It is better than every download showing up under the default user
agent of a module from the Python standard library, which clearly
doesn't tell us much.
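
To make that concrete, here is a rough sketch of how downloads might be
tallied per tool from the User-Agent values in a log.  The only string
here that I'm sure of is the "Python-urllib" prefix the standard
library sends by default; the other prefixes and the function names
(classify, download_shares) are assumptions for illustration, not what
any of these tools is known to send.

```python
from collections import Counter

# Hypothetical User-Agent prefixes.  Only "Python-urllib" is the real
# default sent by the standard library; the rest are assumed.
KNOWN_TOOLS = {
    "Python-urllib": "unidentified (stdlib urllib)",
    "pip": "pip",
    "zc.buildout": "zc.buildout",
    "distutils2": "distutils2",
}


def classify(user_agent):
    """Map a raw User-Agent string to a tool name, or "other"."""
    for prefix, tool in KNOWN_TOOLS.items():
        if user_agent.startswith(prefix):
            return tool
    return "other"


def download_shares(user_agents):
    """Return the percentage of downloads attributed to each tool."""
    counts = Counter(classify(ua) for ua in user_agents)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tool: 100.0 * n / total for tool, n in counts.items()}
```

Anything that doesn't set its own user agent lands in the
"unidentified" bucket, which is exactly the case that tells us little.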

> I'm +1 on having CI tools use specific HTTP headers in order to avoid
> counting that information as "user downloads". We can probably store this
> information in a different place and display clearly what the number of
> downloads for CI tools is on pypi.py.org

I'd be surprised if many CI tools did a lot of downloading themselves;
the build tools they run are typically responsible for that.  I'd
expect the build tools to show up a lot.
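
For what it's worth, a client that wanted to identify both itself and
the CI system driving it might build its requests along these lines.
This is only a sketch: the X-CI-Tool header name, the tool and CI
names, and the URL are all made up for illustration and are not
anything PyPI defines.

```python
import urllib.request


def build_request(url, tool, version, ci_name=None):
    """Build (but do not send) a request that identifies the client."""
    headers = {"User-Agent": "%s/%s" % (tool, version)}
    if ci_name is not None:
        # Hypothetical header marking a CI-driven download.
        headers["X-CI-Tool"] = ci_name
    return urllib.request.Request(url, headers=headers)


# Example values are placeholders, not real tools or endpoints.
req = build_request(
    "https://pypi.python.org/simple/example/",
    "example-build-tool",
    "0.1",
    ci_name="example-ci",
)
```

Note that the build tool is still the one making the request here; the
CI system only shows up if the build tool is told about it.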

More importantly, I'm not sure what you mean by "user downloads".  If
I cause my build tool to download a package from PyPI, whether once or
a thousand times, that still seems like a user download to me.
(zc.buildout, at least, supports an effective caching strategy, so I
doubt the package download numbers would change all that much.)

If I want to try out a package for some purpose, I'm going to add it
to the build of the project I expect it to be useful for; if it proves
insufficiently useful, I'll remove it.

My point is that if you don't include the downloads from zc.buildout,
or whatever tool someone is using, you're likely to miss them
completely, because what's *not* happening is a browser-based
download.  I can't even remember the last time I did that for
something available via PyPI; it's been many years.


Fred L. Drake, Jr.    <fdrake at acm.org>
"A storm broke loose in my mind."  --Albert Einstein
