[Catalog-sig] start on static generation, and caching - apache config.

Phillip J. Eby pje at telecommunity.com
Wed Jul 11 21:03:12 CEST 2007

At 02:25 PM 7/11/2007 -0400, Benji York wrote:
>Martin v. Löwis wrote:
> > Benji York schrieb:
> >> Is your position that PyPI isn't down/very slow on occasion or that when
> >> it is no one complains?
> >
> > Both. I believe it shouldn't be down
>The cheeseshop has provided its own proof that that belief is mistaken
>by being down as I began composing this message. <wink>
> > Jim Fulton complained that it took 0.3s to
> > get a single package's page, which I cannot classify as "very slow".
>During a single run setuptools or zc.buildout may make hundreds of
>requests to the cheeseshop taking a total time in the minutes.  That's
>not fast enough.  I can't see a technical reason why these requests
>couldn't be handled much faster than 3 a second.

An interesting thought for future optimization...  an XML-RPC catalog 
server designed for this use case could in fact do all the 
computation server-side, resolving dependencies and evaluating 
version constraints.  Heck, in theory, it could cache packages' 
external links, and simply hand back to the caller a complete list of 
candidate URLs to choose for downloading.  That way, most activities 
would take only one server round-trip to complete, if the client sent 
a list of everything it expected to need, and the server included 
everything that it expected the client to want due to those 
things' dependencies.
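
As a rough illustration of the single round-trip idea, here is a minimal sketch of the server-side part. Everything in it is an assumption made for the example: the in-memory CATALOG, the example.invalid URLs, the resolve() name, and the simplification of version constraints to a "minimum version" tuple. It is not how PyPI or setuptools actually work.

```python
# Hypothetical in-memory catalog: package name -> known releases.
# Each release lists its own requirements as (name, minimum_version) pairs.
CATALOG = {
    "app": [
        {"version": (1, 0),
         "url": "https://example.invalid/app-1.0.tar.gz",
         "requires": [("lib", (2, 0))]},
    ],
    "lib": [
        {"version": (1, 5),
         "url": "https://example.invalid/lib-1.5.tar.gz",
         "requires": []},
        {"version": (2, 1),
         "url": "https://example.invalid/lib-2.1.tar.gz",
         "requires": [("util", (0, 1))]},
    ],
    "util": [
        {"version": (0, 3),
         "url": "https://example.invalid/util-0.3.tar.gz",
         "requires": []},
    ],
}


def resolve(requests):
    """Return {name: [candidate URLs]} for the requested packages and
    everything they may pull in, computed entirely on the server.

    `requests` is a list of (name, minimum_version) pairs. For brevity,
    only the first constraint seen for a given name is honoured.
    """
    result = {}
    pending = list(requests)
    while pending:
        name, min_version = pending.pop()
        if name in result:
            continue  # already answered; also guards against cycles
        candidates = [release for release in CATALOG.get(name, [])
                      if release["version"] >= min_version]
        # Newest candidates first, so the client can take the first URL.
        candidates.sort(key=lambda release: release["version"], reverse=True)
        result[name] = [release["url"] for release in candidates]
        for release in candidates:
            pending.extend(release["requires"])
    return result
```

A client asking for "app" would get candidate URLs for "app", "lib" and "util" back in one reply. A function like this could presumably be exposed over XML-RPC with the standard library's xmlrpc.server module, though that part is not shown here.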

The main obstacle to implementing such a service today is that it 
would have no way of knowing what dependencies to look for, without 
sniffing the contents of .egg files.  But, as long as a superset of 
possible dependencies was listed in PKG-INFO, the server could make 
intelligent guesses about what other packages are likely to be 
needed, and return their version/download info as well.  Returning 
information for packages that turn out not to be needed is likely to 
be far less expensive than having to make round-trip requests.
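
To make the PKG-INFO idea a little more concrete, the sketch below reads a superset of possible dependencies from "Requires" headers (the field defined by PEP 314) and returns download info for all of them speculatively. The METADATA and DOWNLOADS tables, the function names, and the example.invalid URLs are all hypothetical, and the values are treated as bare package names, ignoring any version predicates a real "Requires" field might carry.

```python
from email import message_from_string

# Hypothetical PKG-INFO for one package; "optional_extra" stands in for a
# dependency that is only needed in some configurations.
PKG_INFO = """\
Metadata-Version: 1.1
Name: app
Version: 1.0
Requires: lib
Requires: optional_extra
"""

# Hypothetical server-side tables.
METADATA = {"app": PKG_INFO}
DOWNLOADS = {
    "app": ["https://example.invalid/app-1.0.tar.gz"],
    "lib": ["https://example.invalid/lib-2.1.tar.gz"],
    "optional_extra": ["https://example.invalid/optional_extra-0.9.tar.gz"],
}


def possible_dependencies(pkg_info_text):
    """Return every name listed in a 'Requires' header of the PKG-INFO text."""
    headers = message_from_string(pkg_info_text)  # PKG-INFO is RFC 822 style
    return [value.strip() for value in headers.get_all("Requires", [])]


def speculative_response(name):
    """Return download info for `name` plus everything it *might* need."""
    names = [name] + possible_dependencies(METADATA.get(name, ""))
    return {n: DOWNLOADS.get(n, []) for n in names}
```

Here a request for "app" also returns the entry for "optional_extra", even if the client never ends up installing it. That wasted entry is the cost being traded against an extra round-trip.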

An alternative to providing that information from metadata, of 
course, would be for the client to include a "referrer" header of 
sorts, saying why it is asking for a package.  The server could then 
simply "learn" the relevant associations.
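
One way the "learning" might look, as a minimal sketch: the server counts how often a package is requested on behalf of another, and treats frequent pairs as probable dependencies. The AssociationLearner class, its method names, and the min_count threshold are all invented for this example; no such header or service exists in the software discussed here.

```python
from collections import Counter, defaultdict


class AssociationLearner:
    """Counts (referrer -> requested package) pairs seen by the server."""

    def __init__(self):
        self._seen = defaultdict(Counter)

    def record(self, requested, referrer=None):
        """Note one request; `referrer` is the package that caused it."""
        if referrer:
            self._seen[referrer][requested] += 1

    def likely_dependencies(self, package, min_count=2):
        """Return packages requested because of `package` at least
        `min_count` times, in alphabetical order."""
        counts = self._seen.get(package, Counter())
        return sorted(name for name, count in counts.items()
                      if count >= min_count)


learner = AssociationLearner()
learner.record("app")                  # top-level request, no referrer
learner.record("lib", referrer="app")  # two clients needed lib for app
learner.record("lib", referrer="app")
learner.record("util", referrer="app") # seen once: below the threshold
```

With these requests recorded, learner.likely_dependencies("app") includes "lib" but not yet "util". The threshold is there so that a single odd request does not cause the server to send extra data to everyone.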
