Re: [Twisted-Python] Re-working a synchronous iterator to use Twisted
Following the massive interest in my earlier postings on this thread, I'm following up to myself again :-) Here's what I was trying to do:
In case it wasn't clear before, you're pulling "results" (e.g., from a search engine) in from the web. Each results page comes with an indicator to tell you whether there are more results. I wanted to write a function (see processResults below) that, when called, would call the process function below on each result, all done asynchronously.
I posted some cumbersome code to roughly do that. I've since been thinking about this on and off, with help from Esteve Fernandez, and we've made the code quite a bit simpler. I think there's a general pattern here that's worth thinking about.
Roughly: the above need is the Twisted analogue of using iterators in regular synchronous programming. By that I mean that the normal pattern of Twisted usage is: a single event is anticipated (by the programmer), it occurs once, and its result is passed down a call/errback chain. That's roughly like a single function call in synchronous code. But what if you are expecting a sequence of external events to occur, and you want to asynchronously pass their results in turn down a call/errback chain? In synchronous code that need can be filled with a simple iterator. But doing this asynchronously (when the fetch of the next batch of results might take a while) doesn't seem to fit easily into the single-shot asynchronous Twisted paradigm.
I thought about modifying defer.py to allow a callback chain to be called multiple times (and to have the "normal" single-shot chain be a special case). But that was clearly going to get messy. BTW, I find defer.py really elegant.
After more thinking about how to make my previously posted code simpler, Esteve and I came up with what you'll find at http://python.pastebin.com/f7df56752 (code) and http://python.pastebin.com/f1e582264 (simple tests).
The idea is that you provide a result fetcher function to the TwIterator class. This function will be called repeatedly, as needed, to get more results. It returns a deferred whose callback it should call with three things: a list of the next results (which may be empty), a bool to indicate whether to re-call the function, and a dict of args to pass to it next time.
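For comparison, here's a minimal sketch of the synchronous case mentioned above, assuming a fetch function that hands back the same three things directly rather than via a deferred. The names iter_results, fake_fetch and PAGES are made up for illustration; none of this is taken from the pastebin code.

```python
def iter_results(fetch, **kwargs):
    """Yield every result, re-calling fetch for as long as it reports more."""
    more = True
    while more:
        # fetch returns (list of results, bool "call me again?", dict of next args)
        results, more, kwargs = fetch(**kwargs)
        for result in results:
            yield result


# Hypothetical stand-in for a paged web search. The middle page is empty,
# which the contract allows.
PAGES = [["a", "b"], [], ["c"]]


def fake_fetch(page=0):
    results = PAGES[page]
    more = page + 1 < len(PAGES)
    return results, more, {"page": page + 1}


if __name__ == "__main__":
    for result in iter_results(fake_fetch):
        print(result)
```

The trouble with doing it this way against a real web service is that every call to fetch blocks until the next page arrives, which is exactly what you can't afford inside the reactor.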
The TwIterator class provides you with a list() method that you can use almost like an iterator:

    @inlineCallbacks
    def printer(results):
        for x in results:
            print (yield x)

    fetcher.list().addCallback(printer)

This is in some sense like a general asynchronous iterator for Twisted. The printer function receives an iterator, each element of which is a deferred, and when that deferred fires it produces the next result.
The test code gives 4 simple example result-fetching functions, and calls them all asynchronously. If you run it you'll see the results coming out in a somewhat random order.
I won't go into more detail, given that no-one responded to the first two postings. It's still possible that I'm trying to solve a problem that can already be solved by some standard Twisted module. I don't know enough about Twisted to know for sure.
Terry
participants (1)
-
Terry Jones