[Catalog-sig] start on static generation, and caching - apache config.

Phillip J. Eby pje at telecommunity.com
Sun Jul 8 19:27:56 CEST 2007

At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote:

>On Jul 7, 2007, at 12:43 PM, Martin v. Löwis wrote:
> > I'm quite skeptical on caching in general (even about the static page
> > generation). It *should* be possible to make it fast enough so that
> > it doesn't need caching.
>Sure, with more hardware than we want to afford.
> > I consider caching a work-around, not a
> > solution - and one with severe drawbacks.
>The pages we're talking about are static.  They change at well-known
>times. IMO, It's crazy to serve static content dynamically when it's
>easy to serve it statically.

If they're effectively static, why can't Apache cache 
them?  Shouldn't we be able to simply add Last-Modified/If-Modified-Since 
support to the PyPI output, and enable Apache's disk caching for 
non-logged-in users?

That is, as long as there is a quick last-modified-time query for a 
package, we can use it to process the If-Modified-Since header.  The 
modification time could even be memcached, so as not to need a 
database hit 99% of the time.
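
A minimal sketch of that check, in Python.  The names get_last_modified, 
conditional_response, query_db and render are hypothetical (this is not 
PyPI's actual code), and a plain dict stands in for a memcached client:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def get_last_modified(package, cache, query_db):
    """Return the package's modification time, hitting the DB only on a cache miss."""
    stamp = cache.get(package)
    if stamp is None:
        stamp = query_db(package)   # hypothetical cheap "last modified" query
        cache[package] = stamp
    return stamp


def conditional_response(last_modified, if_modified_since, render):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since.

    last_modified must be a UTC-aware datetime; render is only called
    when the client's copy is stale or missing.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None            # unparseable header: treat as absent
        if since is not None and since.tzinfo is not None and last_modified <= since:
            return 304, headers, b""   # no SQL, no page generation

    return 200, headers, render()
```

For this to stay correct, whatever code handles a package update would 
also have to refresh (or delete) that package's cache entry.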

While that's not necessarily as fast as static page generation, it's 
a lot less complex to get right, and it saves the main piece of CPU 
load: i.e., doing SQL queries and actually generating the page.

Pages that pertain to more than one package might be a bit more 
complex to handle this way, but if I understand correctly it's mainly the 
package-specific pages we're concerned with here, correct?  Even so, 
it's possible to have any update also bump a global "something's 
changed" time, and use that time as the Last-Modified of those pages.
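
The global timestamp idea could look roughly like this.  Again a sketch 
under assumptions: GLOBAL_KEY, record_update and page_last_modified are 
made-up names, and the dict stands in for memcached or a database table:

```python
from datetime import datetime, timezone

GLOBAL_KEY = "__any_package__"   # hypothetical key for the site-wide timestamp


def record_update(cache, package, when):
    """Store a package's new modification time and bump the global one."""
    cache[package] = when
    current = cache.get(GLOBAL_KEY)
    if current is None or when > current:
        cache[GLOBAL_KEY] = when


def page_last_modified(cache, package=None):
    """Per-package pages use the package time; multi-package pages use the global time."""
    return cache.get(GLOBAL_KEY if package is None else package)
```

The cost is that multi-package pages are invalidated by every update, 
even one that doesn't affect them, but they are never served stale.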
