
At 10:46 AM 4/9/2007 -0400, A.M. Kuchling wrote:
> On Sat, Apr 07, 2007 at 12:44:56PM -0400, Phillip J. Eby wrote:
>> I don't know whether this will actually solve any problems the cheeseshop itself is having; it may be that ill-behaved web spiders are at fault, or something else altogether.
> In this recent case, two different spiders were crawling the wiki very quickly, the machine's load average was in the 70s, and the out-of-memory killer was killing off PostgreSQL processes.
> I don't think the load caused by people running easy_install is especially high -- it's certainly not a source of problems -- but making static pages would still be good, since it would make mirroring the package archive more useful. Right now people could mirror http://cheeseshop.python.org/packages/, but there's nothing there for easy_install or for human readers; it's just a tree of package directories.
Hm. Well, actually, if that directory structure were something I could code to, easy_install could sure as heck be *made* to use it. The only things easy_install couldn't get from it are external links to downloads, SVN versions, etc. Notice, for example, that if you use "easy_install -f http://cheeseshop.python.org/packages/source/s/simple_json/ simple_json", easy_install won't look at the main package index, but will just download directly. So an automated form of that calculation could easily be added to easy_install.

What I had in mind for an easy_install mirror, however, was a script that would just create a /packagename/index.html file with links gathered from all versions of a package on the original Cheeseshop, and with packagename generated as a setuptools "safe name" in lower case. This pattern would allow easy_install to find every possible relevant link in just one (static) web hit.
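
To make the two ideas above concrete, here is a rough Python sketch, not a finished tool. The helper names (safe_name, source_dir_url, write_index) are made up for illustration. The "safe name" rule is assumed to be "replace runs of anything other than letters, digits and '.' with '-'", and the source/<first letter>/<name>/ layout is inferred only from the simple_json URL above. Gathering the links from the original Cheeseshop is left out; the caller passes them in.

```python
import re
from html import escape
from pathlib import Path

# Base URL taken from the message above.
PACKAGES_ROOT = "http://cheeseshop.python.org/packages"


def safe_name(name):
    """Assumed "safe name" rule: collapse runs of characters other than
    letters, digits and '.' into a single '-'."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


def source_dir_url(name):
    """Compute the per-package source directory URL.

    The source/<first letter>/<name>/ layout is an assumption based on
    the single simple_json example, not a documented guarantee.
    """
    return "%s/source/%s/%s/" % (PACKAGES_ROOT, name[0], name)


def write_index(mirror_root, name, links):
    """Write <mirror_root>/<lowercased safe name>/index.html with one
    anchor per link, so a client needs only one static page per package."""
    page_dir = Path(mirror_root) / safe_name(name).lower()
    page_dir.mkdir(parents=True, exist_ok=True)
    anchors = "\n".join(
        '<a href="%s">%s</a><br/>' % (escape(url, quote=True), escape(url))
        for url in links
    )
    index = page_dir / "index.html"
    index.write_text(
        "<html><body>\n%s\n</body></html>\n" % anchors, encoding="utf-8"
    )
    return index
```

A mirror script would call write_index once per package, passing every download link found across all of that package's releases, and then serve the resulting directory tree as plain static files.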