I wanted to push this into the tracker, but I'm a bit confused about how to do that with trac, so I'm resending it to the mailing list.

This patch implements HTML caching at the page level, with an optional maximum cache size per page and an optional timeout.

Even when the timeout is very small, this makes sure that all blocked, waiting clients get the same copy of the HTML, allowing the server to scale.

This only works for dynamic data that is the same for all users, so it's good for the homepage etc...

I have used it online for months and have had no problems at all.

Index: Nevow/nevow/util.py
===================================================================
--- Nevow/nevow/util.py	(revision 3434)
+++ Nevow/nevow/util.py	(working copy)
@@ -133,6 +133,7 @@
     from twisted.python.failure import Failure
     from twisted.trial.unittest import deferredError
     from twisted.python import log
+    from twisted.internet import reactor
 
     try:
         # work with twisted before retrial
Index: Nevow/nevow/rend.py
===================================================================
--- Nevow/nevow/rend.py	(revision 3434)
+++ Nevow/nevow/rend.py	(working copy)
@@ -491,6 +491,56 @@
         self.children[name] = child
 
 
+class PageCache(object):
+    def __init__(self):
+        self.__db = {}
+    def cacheIDX(self, ctx):
+        return str(url.URL.fromContext(ctx))
+    def __storeCache(self, cacheIDX, c):
+        self.__db[cacheIDX] = c
+    def __deleteCache(self, cacheIDX):
+        del self.__db[cacheIDX]
+    def __deleteCacheData(self, cacheIDX, page):
+        size = self.__db[cacheIDX][1]
+        assert len(self.__db[cacheIDX][0]) == size
+        page.subCacheSize(size)
+        self.__deleteCache(cacheIDX)
+    def __lookupCache(self, cacheIDX):
+        return self.__db.get(cacheIDX)
+    def getCache(self, ctx):
+        cacheIDX = self.cacheIDX(ctx)
+        c = self.__lookupCache(cacheIDX)
+
+        if c is None:
+            self.__storeCache(cacheIDX, [util.Deferred()])
+            return
+
+        if isinstance(c[0], util.Deferred):
+            d = util.Deferred()
+            c.append(d)
+            return d
+
+        return c[0]
+    def cacheRendered(self, ctx, data, page):
+        cacheIDX = self.cacheIDX(ctx)
+        defer_list = self.__lookupCache(cacheIDX)
+        assert(isinstance(defer_list[0], util.Deferred))
+        size = len(data)
+        if page.canCache(size):
+            # overwrite the deferred with the data
+            timer = None
+            if page.lifetime > 0:
+                timer = util.reactor.callLater(page.lifetime,
+                                               self.__deleteCacheData, cacheIDX, page)
+            page.addCacheSize(size)
+            self.__storeCache(cacheIDX, (data, size, timer, ))
+        else:
+            self.__deleteCache(cacheIDX)
+        for d in defer_list:
+            d.callback(data)
+
+_CACHE = PageCache()
+
 class Page(Fragment, ConfigurableFactory, ChildLookupMixin):
     """A page is the main Nevow resource and renders a document loaded
     via the document factory (docFactory).
@@ -504,8 +554,27 @@
 
     afterRender = None
     addSlash = None
+    cache = False
+    lifetime = 0
+    max_cache_size = None
+    __cache_size = 0
+
     flattenFactory = lambda self, *args: flat.flattenFactory(*args)
 
+    def hasCache(self, ctx):
+        if not self.cache:
+            return
+        return _CACHE.getCache(ctx)
+    def addCacheSize(self, size):
+        assert self.canCache(size)
+        self.__cache_size += size
+    def subCacheSize(self, size):
+        self.__cache_size -= size
+        assert self.__cache_size >= 0
+    def canCache(self, size):
+        return self.max_cache_size is None or \
+               self.__cache_size + size <= self.max_cache_size
+
     def renderHTTP(self, ctx):
         if self.beforeRender is not None:
             return util.maybeDeferred(self.beforeRender,ctx).addCallback(
@@ -530,11 +599,20 @@
             if self.afterRender is not None:
                 return util.maybeDeferred(self.afterRender,ctx)
 
-        if self.buffered:
+        c = self.hasCache(ctx)
+        if c is not None:
+            assert self.afterRender is None
+            finishRequest()
+            return c
+
+        if self.buffered or self.cache:
             io = StringIO()
             writer = io.write
             def finisher(result):
-                request.write(io.getvalue())
+                c = io.getvalue()
+                if self.cache:
+                    _CACHE.cacheRendered(ctx, c, self)
+                request.write(c)
                 return util.maybeDeferred(finishRequest).addCallback(lambda r: result)
         else:
             writer = request.write
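The three states a cache entry goes through in the patch (missing, render in progress, rendered) may be easier to follow in isolation. What follows is only a minimal sketch of that idea using the standard library, not the patch itself. `PendingRender` and `CoalescingCache` are hypothetical names, `PendingRender` is a simplified stand-in for a Twisted Deferred, and expiry is checked lazily against a clock instead of being scheduled with `reactor.callLater` as the patch does. The per-page size limit is left out.

```python
import time


class PendingRender:
    """Hypothetical stand-in for a Deferred: collects callbacks until fired."""

    def __init__(self):
        self.callbacks = []

    def add_callback(self, fn):
        self.callbacks.append(fn)

    def fire(self, data):
        for fn in self.callbacks:
            fn(data)


class CoalescingCache:
    """Per-key cache in which concurrent misses share a single render."""

    def __init__(self, clock=time.monotonic):
        self._db = {}  # key -> PendingRender, or (data, expires_at)
        self._clock = clock

    def get(self, key):
        """Return ('render', None), ('wait', pending) or ('hit', data)."""
        entry = self._db.get(key)
        if isinstance(entry, tuple):
            data, expires_at = entry
            if expires_at is None or self._clock() < expires_at:
                return "hit", data
            del self._db[key]  # expired: treat it as a miss
            entry = None
        if entry is None:
            # First requester: mark the key as being rendered.
            self._db[key] = PendingRender()
            return "render", None
        # A render is already in progress: the caller should wait on it.
        return "wait", entry

    def store(self, key, data, lifetime=0):
        """Publish the rendered data and hand every waiter the same copy.

        Assumes get(key) returned 'render' earlier, so a PendingRender
        is currently stored under the key. lifetime=0 means no expiry.
        """
        pending = self._db[key]
        expires_at = self._clock() + lifetime if lifetime > 0 else None
        self._db[key] = (data, expires_at)
        pending.fire(data)


cache = CoalescingCache()
received = []

first_state, _ = cache.get("/")          # first client has to render
second_state, pending = cache.get("/")   # second client arrives meanwhile
pending.add_callback(received.append)

cache.store("/", "<html>home</html>", lifetime=5)
third_state, third_data = cache.get("/")  # later client is served from cache
```

Here the second client never triggers a render of its own; it receives the copy produced for the first one, which is the scaling property described above.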
On Tue, Dec 13, 2005 at 11:43:02PM +0100, Andrea Arcangeli wrote:
> I wanted to push this into the tracker, but I'm a bit confused about how to do that with trac, so I'm resending it to the mailing list.
>
> This patch implements HTML caching at the page level, with an optional maximum cache size per page and an optional timeout.
>
> Even when the timeout is very small, this makes sure that all blocked, waiting clients get the same copy of the HTML, allowing the server to scale.
>
> This only works for dynamic data that is the same for all users, so it's good for the homepage etc...
>
> I have used it online for months and have had no problems at all.

This patch is indeed robust and works as expected. But page caching in this way should not be done at the Nevow level, because it's slower than it could be. Unfortunately Nevow overrides part of the twisted.web request API and thus makes this distinction a bit blurry.

The right strategy would be to write something similar for twisted.web2 using filters maybe. This would make the implementation significantly easier to maintain and probably shorter.

Nevow would be the place for fragment caches (for which there is an already implemented API, as you may recall).

--
Valentino Volonghi aka Dialtone
Now Running MacOSX 10.4
Blog: http://vvolonghi.blogspot.com
http://weever.berlios.de
On Wed, Dec 14, 2005 at 12:48:00AM +0100, Valentino Volonghi wrote:
> The right strategy would be to write something similar for twisted.web2 using filters maybe. This would make the implementation significantly easier to maintain and probably shorter.

So, should we defer it until Nevow switches over to web2?

> Nevow would be the place for fragment caches (for which there is an already implemented API, as you may recall).

Yep, at some point I started using them, but they complicated things too much and I removed them. The benefit wasn't big enough. Those caches require explicit coding, while the PageCache is totally transparent: it only requires a two-line patch to be enabled, and the patched code remains backwards compatible when run on a pristine Nevow tree. So to me the PageCache is more important, and it does 99% of the work.
On Wed, Dec 14, 2005 at 02:47:12AM +0100, Andrea Arcangeli wrote:
> On Wed, Dec 14, 2005 at 12:48:00AM +0100, Valentino Volonghi wrote:
> > The right strategy would be to write something similar for twisted.web2 using filters maybe. This would make the implementation significantly easier to maintain and probably shorter.
>
> So, should we defer it until Nevow switches over to web2?
>
> > Nevow would be the place for fragment caches (for which there is an already implemented API, as you may recall).
>
> Yep, at some point I started using them, but they complicated things too much and I removed them. The benefit wasn't big enough. Those caches require explicit coding, while the PageCache is totally transparent: it only requires a two-line patch to be enabled, and the patched code remains backwards compatible when run on a pristine Nevow tree. So to me the PageCache is more important, and it does 99% of the work.

Can we focus on the API of the app? That's the most important thing after all; it's the only one that must not change.

My current API is this:

    class forever_cached_page_class(rend.Page):
        cache = True

    class cached_page_class(forever_cached_page_class):
        lifetime = LIFETIME

    class cached_page_class(rend.Page):
        cache = True
        lifetime = LIFETIME
        max_cache_size = 20*1024*1024

(LIFETIME in seconds)

If this API is good, then I suggest adding my patch now, even if it's not the long-term implementation. Once Nevow switches over to web2, it will transparently support the same API with web2 support instead of the PageCache inside Nevow.

The PageCache class is absolutely invisible to the web application code, and as such it doesn't need to be a long-term implementation. The only thing we have to focus on before including that code is the API provided to the Nevow application.

Perhaps lifetime should be renamed to cache_lifetime, or do you have other suggestions?

Thanks.
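
To make the proposed attributes concrete, here is a small sketch of how they would combine through ordinary class inheritance. It runs without Nevow: `Page` below is a hypothetical stand-in for `rend.Page`, the snake_case helper names and the `LIFETIME` value are assumptions made for the example, and only the size check mirrors the `canCache` logic from the patch.

```python
LIFETIME = 60  # seconds; illustrative value, not taken from the thread


class Page:
    """Hypothetical stand-in for rend.Page with the proposed attributes."""

    cache = False          # pages opt in by setting this to True
    lifetime = 0           # 0 means the cached copy never expires
    max_cache_size = None  # None means no size limit

    def __init__(self):
        self._cache_size = 0  # bytes currently accounted to this page

    def can_cache(self, size):
        # Same condition as canCache in the patch: unlimited, or within budget.
        return (self.max_cache_size is None
                or self._cache_size + size <= self.max_cache_size)

    def add_cache_size(self, size):
        assert self.can_cache(size)
        self._cache_size += size


class ForeverCachedPage(Page):
    cache = True


class CachedPage(ForeverCachedPage):
    lifetime = LIFETIME


class BoundedCachedPage(Page):
    cache = True
    lifetime = LIFETIME
    max_cache_size = 20 * 1024 * 1024


page = BoundedCachedPage()
page.add_cache_size(15 * 1024 * 1024)
fits_small = page.can_cache(5 * 1024 * 1024)   # exactly fills the budget
fits_large = page.can_cache(6 * 1024 * 1024)   # would exceed the budget
```

Because the settings are plain class attributes, a page class on an unpatched tree would simply carry three unused attributes, which is consistent with the backwards compatibility mentioned earlier in the thread.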

participants (2)
- Andrea Arcangeli
- Valentino Volonghi aka Dialtone