[Python-Dev] microthreading vs. async io
dustin at v.igoro.us
Thu Feb 15 17:36:21 CET 2007
On Thu, Feb 15, 2007 at 04:51:30PM +0100, Joachim König-Baltes wrote:
> The style used in asyncore, inheriting from a class and calling return
> in a method and being called later at a different location (different
> method) just interrupts the sequential flow of operations and makes it
> harder to understand. The same is true for all other strategies using
> callbacks or similar mechanisms.
>
> All this can be achieved with a multilevel yield() that is hidden in a
> function call. So the task does a small step down (wait) in order to
> jump up (yield) to the scheduler without disturbing the eye of the
> beholder.
I agree -- I find that writing continuations or using asyncore's
structure makes spaghetti out of functionality that requires multiple
blocking operations inside looping or conditional statements. The best
example, for me, was writing a complex site-specific web spider that had
to fetch 5-10 pages in a certain sequence, where each step in that
sequence depended on the results of the previous fetches. I wrote it in
Twisted, but the proliferation of nested callback functions and chained
deferreds made my head explode while trying to debug it. With a decent
microthreading library, that could look like:
    def fetchSequence(...):
        fetcher = Fetcher()
        yield fetcher.fetchHomepage()
        firstData = yield fetcher.fetchPage('http://...')
        if someCondition(firstData):
            while True:
                secondData = yield fetcher.fetchPage('http://...')
                # ...
                if someOtherCondition(secondData): break
        else:
            # ...
which is *much* easier to read and debug. (FWIW, after I put my head
back together, I rewrote the app with threads, and it now looks like the
above, without the yields. Problem is, throttling on fetches means 99%
of my threads are blocked on sleep() at any given time, which is just
silly).
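To make the throttling point concrete, here is a minimal sketch of how generator-based microthreads could share a single OS thread while they wait. Sleep, run_all and throttled_fetcher are hypothetical names made up for this illustration, not part of Twisted or any other library, and the "fetch" is only a stand-in that records the URL. It is written in current Python 3.

```python
import heapq
import itertools
import time


class Sleep:
    """Hypothetical request object: 'wake me after `delay` seconds'."""

    def __init__(self, delay):
        self.delay = delay


def run_all(tasks, clock=time.monotonic, wait=time.sleep):
    """Run generator-based microthreads to completion in one OS thread."""
    counter = itertools.count()  # tie-breaker so the heap never compares generators
    now = clock()
    ready = [(now, next(counter), task) for task in tasks]
    heapq.heapify(ready)
    while ready:
        wake_at, _, task = heapq.heappop(ready)
        delay = wake_at - clock()
        if delay > 0:
            wait(delay)  # the only real sleep: nothing is runnable before wake_at
        try:
            request = next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # task finished, drop it
        pause = request.delay if isinstance(request, Sleep) else 0
        heapq.heappush(ready, (clock() + pause, next(counter), task))


def throttled_fetcher(name, urls, delay, log):
    """Stand-in for a spider: 'fetch' each URL, then throttle."""
    for url in urls:
        log.append((name, url))  # a real fetch would go here
        yield Sleep(delay)
```

With this shape, a thousand throttled tasks cost a thousand small generator objects and one sleeping thread, rather than a thousand threads parked in sleep().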
All that said, I continue to contend that the microthreading and async
IO operations are separate. The above could be implemented relatively
easily in Twisted with a variant of the microthreading module Phillip
posted earlier. It could also be implemented atop a bare-bones
microthreading module with Fetcher using asyncore on the backend, or
even scheduling urllib.urlopen() calls into OS threads. Presumably, it
could run in NanoThreads and Kamaelia too, among others.
What I want is a consistent syntax for microthreaded code, so that I
could write my function once and run it in *all* of those circumstances.
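A rough sketch of what "write once, run anywhere" might look like: the task below only yields request objects, and two different drivers decide how to satisfy them. Call, fetch_sequence, run_inline, run_in_pool and fake_fetch are all hypothetical names invented for this example. It also relies on generators returning values, which is a Python 3 feature that was not available in 2007. Each driver here runs a single task and simply waits for every result; a real scheduler would interleave many tasks.

```python
from concurrent.futures import ThreadPoolExecutor


class Call:
    """Hypothetical request: 'run func(*args) and send me the result'."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def fetch_sequence(fetch):
    """Backend-agnostic task: it only yields Call requests."""
    first = yield Call(fetch, 'page-1')
    second = yield Call(fetch, 'page-2?after=' + first)
    return [first, second]


def run_inline(task):
    """Driver 1: perform each request synchronously in the calling thread."""
    result = None
    try:
        while True:
            request = task.send(result)  # send(None) starts the generator
            result = request.func(*request.args)
    except StopIteration as stop:
        return stop.value  # the generator's return value


def run_in_pool(task, pool):
    """Driver 2: hand each request to an OS thread pool."""
    result = None
    try:
        while True:
            request = task.send(result)
            result = pool.submit(request.func, *request.args).result()
    except StopIteration as stop:
        return stop.value


def fake_fetch(url):
    """Stand-in for a blocking call such as a URL fetch."""
    return url.upper()
```

The same fetch_sequence generator runs unchanged under both drivers:

```python
print(run_inline(fetch_sequence(fake_fetch)))
with ThreadPoolExecutor(max_workers=2) as pool:
    print(run_in_pool(fetch_sequence(fake_fetch), pool))
```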
Dustin
P.S. For the record -- I've written lots of other apps in Twisted with
great success; this one just wasn't a good fit.