[Catalog-sig] [Distutils] Why so many zc.buildout versions?

Phillip J. Eby pje at telecommunity.com
Tue Jul 10 17:56:42 CEST 2007


At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote:
>Currently, the only way to support multiple versions with the
>cheeseshop is to unhide past releases.  This has a fairly severe
>effect on performance.  As the example below shows, setuptools will
>fetch the package page and then fetch the pages for each release.
>That's a lot of requests.

This could potentially be fixed in setuptools, so that it only looks 
at release pages that match its requirements, in highest-to-lowest 
version order, stopping as soon as a suitable match is found.  That 
would eliminate the current issue -- but only for new versions of 
setuptools.  So I do like your idea better, since it can be made to 
work for already-deployed clients as well.
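
To make the setuptools-side idea concrete, here is a minimal sketch of "highest-to-lowest, stop at the first match". It is not setuptools code: `find_release`, `parse`, `is_acceptable` and `fetch_page` are hypothetical names, and versions are assumed to be plain dotted integers, which real version strings often are not.

```python
def parse(version):
    """Turn a dotted version such as '1.10' into a comparable tuple.

    Assumption: versions contain only integers separated by dots.
    """
    return tuple(int(part) for part in version.split("."))


def find_release(releases, is_acceptable, fetch_page):
    """Return (version, links) for the newest acceptable release, or None.

    releases      -- iterable of version strings listed on the package page
    is_acceptable -- hypothetical predicate standing in for the requirement
    fetch_page    -- hypothetical callable that requests one release page
                     and returns the download links found on it
    """
    for version in sorted(releases, key=parse, reverse=True):
        if not is_acceptable(version):
            continue  # skipped without any HTTP request
        links = fetch_page(version)
        if links:
            return version, links  # stop here; older pages are never fetched
    return None
```

The point of the sketch is only the request count: one release page is fetched per candidate that passes the requirement check, and the loop ends at the first page that yields links.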


>I think it is time to take a step back and think a lot harder about
>how we'd want to structure an index to support setuptools.

+1, as long as somebody's willing to build and host the thing.
Please see my earlier comments on Catalog-SIG about this.


>IMO, a setuptools-aware index would have a single page for each package:
>
>- The single page would be published in a case-insensitive way. It
>would be nice to find a way to avoid this, or maybe we should use a
>windows-based web server. :)  It would also be served very cheaply,
>for example statically.

Apache's CheckSpelling directive does case-insensitivity and 
approximate matching.  Combine that with making the directories be 
based on "safe_name" values to begin with, and you should be all set.
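
As a rough illustration of safe_name-based directories, the sketch below re-implements the normalization as I understand it (runs of characters other than letters, digits and "." collapse to a single "-"). The `index_dir` helper and the choice to lowercase on disk are assumptions for the example, not something the index does today.

```python
import re


def safe_name(name):
    """Collapse each run of characters outside [A-Za-z0-9.] into one '-'.

    Written to mirror pkg_resources.safe_name as I understand it;
    treat it as an approximation, not the canonical implementation.
    """
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


def index_dir(name):
    """Hypothetical directory name for a project in a static index.

    Lowercasing is an assumption made here so that lookups do not
    depend on the case the client happened to use.
    """
    return safe_name(name).lower()
```

With directories named this way, case differences and punctuation variants of a project name land on the same path, which leaves much less for CheckSpelling to correct.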


>- The single page would list links for all available distributions,
>which should include all distributions published.  It would also list
>any other URLs that should be scanned for releases, when releases
>aren't all uploaded to PyPI.

The piece you're missing here is direct links to other downloads, 
such as "#egg=project-dev" subversion links.  However, if you 
extracted these from all of the relevant PyPI HTML pages, you could 
certainly include them on the single page as well.
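
Pulling those links out of existing pages could look roughly like the sketch below. It uses only the standard library; `EggLinkParser` and `extract_egg_links` are hypothetical names, and the only rule applied is "keep any anchor whose URL fragment starts with egg=".

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class EggLinkParser(HTMLParser):
    """Collect hrefs whose fragment looks like '#egg=project-version'."""

    def __init__(self):
        super().__init__()
        self.egg_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # href may be missing or empty; only keep '#egg=...' style links
        if href and urlparse(href).fragment.startswith("egg="):
            self.egg_links.append(href)


def extract_egg_links(html):
    """Return every '#egg=' link found in one HTML page, in page order."""
    parser = EggLinkParser()
    parser.feed(html)
    return parser.egg_links
```

Running this over each release page and merging the results would give the extra links to put on the single per-package page.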


>In addition, the root page with a trailing / would be empty and very
>cheap.

As long as the individual package directories are safe_name based, 
this would work.


>There are a lot of ways we could achieve this pretty cheaply while
>keeping the existing system pretty much as it is.

Of course, there are still other reasons to want to improve the 
Cheeseshop's performance, such as search engines and other bots.


>For example, the current effort to bake static pages could bake these
>pages instead.  We could make the new index available at a different
>URL for people to play with while we worked the kinks out of the
>process.

...and then use a User-Agent rewrite rule to redirect setuptools 
clients to the static piece, as soon as we're satisfied that it works.
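
The redirect decision itself is tiny. Here it is sketched in Python rather than as an actual rewrite rule; `static_index_url` is a hypothetical helper, the static root URL is a placeholder, and matching on a "setuptools/" token in the User-Agent header is an assumption about what the clients send.

```python
def static_index_url(user_agent, path,
                     static_root="http://static.example.invalid/simple"):
    """Return the static URL to redirect to, or None to serve normally.

    user_agent  -- value of the request's User-Agent header (may be None)
    path        -- requested path, e.g. '/zc.buildout/'
    static_root -- placeholder location of the baked static index
    """
    # Assumption: setuptools clients identify themselves with this token.
    if "setuptools/" in (user_agent or ""):
        return static_root.rstrip("/") + "/" + path.lstrip("/")
    return None  # browsers, bots and other clients keep the dynamic pages
```

Because the switch keys on the request header, already-deployed clients would be moved over without needing an upgrade.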


