On Sep 28, 2004, at 5:53 PM, Selwyn McCracken wrote:
I am having trouble modifying the Twisted-based RSS aggregator from the Python Cookbook so that feedparser can make use of the update-related arguments 'etag' and 'modified' to save bandwidth. (see http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277099)
I realise that the problem is deferred-related, but I can't seem to resolve it, even after reading the deferred documentation.
Not particularly deferred-related, more t.w.client-related. I assume what's happening is that feedparser.parse() can take either a URL or a file-like object. If it takes a URL, it uses its internal HTTP getting method, which is synchronous. Twisted's HTTP client is asynchronous, so you want to use that.

So what you need to know is how to send the etag/modified information to Twisted's HTTP client. You want something like:

    def getPage(self, data, args):
        # args is the rss feed link
        return client.getPage(args, timeout=TIMEOUT,
                              headers={'If-None-Match': '"xyzzy"',
                                       'If-Modified-Since': 'Sun, 09 Sep 2001 01:46:40 GMT'})

However, client.getPage doesn't leave you with any way to get at the response headers (so you can save the etag and last-modified values for the next request), so you'll need to use HTTPClientFactory directly (cribbing from the code in client.getPage). Basically, after the deferred fires, factory.response_headers will have the data you want, so you just need to keep a reference to factory around.

James
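
A rough sketch of the HTTPClientFactory approach described above might look like the following. It is not taken from the recipe or from Twisted itself: the names `conditional_headers`, `saved_validators`, and `get_page_with_headers` are made up for illustration, and it assumes a Twisted version that still ships `twisted.web.client.HTTPClientFactory` (later releases deprecated it) and that `factory.response_headers` maps lower-cased header names to lists of values. Depending on the Twisted and Python versions in use, the URL and header values may need to be bytes. Only plain HTTP is covered, and a 304 Not Modified reply is not handled here.

```python
from urllib.parse import urlparse


def conditional_headers(etag=None, modified=None):
    """Build request headers for a conditional GET from saved validators."""
    headers = {}
    if etag:
        headers['If-None-Match'] = etag
    if modified:
        headers['If-Modified-Since'] = modified
    return headers


def saved_validators(response_headers):
    """Pull (etag, last-modified) out of factory.response_headers.

    Assumption: keys are lower-cased header names and values are lists
    of strings; plain string values are accepted as well.
    """
    def first(name):
        value = response_headers.get(name)
        if isinstance(value, (list, tuple)):
            return value[0] if value else None
        return value
    return first('etag'), first('last-modified')


def get_page_with_headers(url, etag=None, modified=None, timeout=30):
    """Like client.getPage, but fires with (body, etag, last_modified)."""
    # Imported here so the two helpers above work without Twisted installed.
    from twisted.internet import reactor
    from twisted.web import client

    parsed = urlparse(url)
    factory = client.HTTPClientFactory(
        url, timeout=timeout, headers=conditional_headers(etag, modified))
    # Plain HTTP only; an https URL would need reactor.connectSSL instead.
    reactor.connectTCP(parsed.hostname, parsed.port or 80, factory)

    def attach_validators(body):
        # This closure is what keeps a reference to the factory around
        # until the deferred fires.
        return (body,) + saved_validators(factory.response_headers)

    return factory.deferred.addCallback(attach_validators)
```

The callback receiving `(body, etag, last_modified)` can store the last two values per feed and pass them back in as `etag` and `modified` on the next poll; the body itself can then be handed to feedparser.parse() instead of the URL.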