Re: [Python-ideas] asyncore: included batteries don't fit

Hi python-ideas,

I'm jumping in to this thread on behalf of Tornado. I think there are actually two separate issues here and it's important to keep them distinct: at a low level, there is a need for a standardized event loop, while at a higher level there is a question of what asynchronous code should look like. This thread so far has been more about the latter, but the need for standardization is more acute for the core event loop.

I've written a bridge between Tornado and Twisted so libraries written for both event loops can coexist, but obviously that wouldn't scale if there were a proliferation of event loop implementations out there. I'd be in favor of a simple event loop interface in the standard library, with reference implementation(s) (select, epoll, kqueue, iocp) and some means of configuring the global (or thread-local) singleton. My preference is to keep the interface fairly low-level and close to the underlying mechanisms (i.e. like IReactorFDSet instead of IReactor{TCP,UDP,SSL,etc}), so that different interfaces like Tornado's IOStream or Twisted's protocols can be built on top of it.

As for the higher-level question of what asynchronous code should look like, there's a lot more room for spirited debate, and I don't think there's enough consensus to declare a One True Way. Personally, I'm -1 on greenlets as a general solution (what if you have to call MySQLdb or getaddrinfo?), although they can be useful in particular cases to convert well-behaved synchronous code into async (as in Motor: http://emptysquare.net/blog/introducing-motor-an-asynchronous-mongodb-driver...). I like Futures, though, and I find that they work well in asynchronous code. The use of the result() method to encapsulate both successful responses and exceptions is especially nice with generator coroutines.

FWIW, here's the interface I'm moving towards for async code. From the caller's perspective, asynchronous functions return a Future (the future has to be constructed by hand since there is no Executor involved), and also take an optional callback argument (mainly for consistency with currently-prevailing patterns for async code; if the callback is given it is simply added to the Future with add_done_callback). In Tornado the Future is created by a decorator and hidden from the asynchronous function (it just sees the callback), although this relies on some Tornado-specific magic for exception handling. In a coroutine, the decorator recognizes Futures and resumes execution when the future is done. With these decorators asynchronous code looks almost like synchronous code, except for the "yield" keyword before each asynchronous call.

-Ben
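(For illustration, a minimal sketch of the calling convention described above; fetch_async is a hypothetical function following it:)

    fut = fetch_async('http://example.com/')            # returns a Future
    fut.add_done_callback(lambda f: print(f.result()))  # result, or re-raises

    fetch_async('http://example.com/', callback=print)  # same call, callback style;
                                                        # the callback is just added
                                                        # to the Future internally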

On Sun, Oct 7, 2012 at 6:41 PM, Ben Darnell <ben@bendarnell.com> wrote:
Hi python-ideas,
I'm jumping in to this thread on behalf of Tornado.
Welcome!
Yes, yes. I tried to bring up this distinction. I'm glad I didn't completely fail.
As long as it's not so low-level that other people shy away from it. I also have a feeling that one way or another this will require cooperation between the Twisted and Tornado developers in order to come up with a compromise that both are willing to conform to in a meaningful way. (Unfortunately I don't know how to define "meaningful way" more precisely here. I guess the idea is that almost all things *using* an event loop use the standardized abstract API without caring whether underneath it's Tornado, Twisted, or some simpler thing in the stdlib.)
Agreed on both counts.
Yay!
Ditto for NDB (though there's a decorator that often takes care of the future construction).
That's interesting. I haven't found the need for this yet. Is it really so common that you can't write this as a Future() constructor plus a call to add_done_callback()? Or is there some subtle semantic difference?
In Tornado the Future is created by a decorator and hidden from the asynchronous function (it just sees the callback),
Hm, interesting. NDB goes the other way, the callbacks are mostly used to make Futures work, and most code (including large swaths of internal code) uses Futures. I think NDB is similar to monocle here. In NDB, you can do

    f = <some function returning a Future>
    r = yield f

where "yield f" is mostly equivalent to f.result(), except it gives better opportunity for concurrency.
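(A minimal sketch of the driving loop this implies -- names are hypothetical, assuming a Future with add_done_callback:)

    def run(gen, value=None, exc=None):
        try:
            if exc is not None:
                future = gen.throw(exc)   # raise the failure inside the coroutine
            else:
                future = gen.send(value)  # resume "r = yield f" with f's result
        except StopIteration:
            return                        # coroutine finished
        def on_done(f):
            try:
                result = f.result()
            except Exception as e:
                run(gen, exc=e)
            else:
                run(gen, value=result)
        future.add_done_callback(on_done)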
Yes! Same here. I am currently trying to understand if using "yield from" (and returning a value from a generator) will simplify things. For example maybe the need for a special decorator might go away. But I keep getting headaches -- perhaps there's a Monad involved. :-) -- --Guido van Rossum (python.org/~guido)

On Sun, Oct 7, 2012 at 7:01 PM, Guido van Rossum <guido@python.org> wrote:
As long as it's not so low-level that other people shy away from it.
That depends on the target audience. The low-level IOLoop and Reactor are pretty similar -- you can implement one in terms of the other -- but as you move up the stack cross-compatibility becomes harder. For example, if I wanted to implement tornado's IOStreams in twisted, I wouldn't start with the analogous class in twisted (Protocol?), I'd go down to the Reactor and build from there, so putting something like IOStream or Protocol in asyncore2 wouldn't do much to unify the two worlds. (It would help people build async stuff with the stdlib alone, but at that point it becomes more like a peer or competitor to tornado and twisted instead of a bridge between them.)
I'd phrase the goal as being able to run both Tornado and Twisted in the same thread without any piece of code needing to know about both systems. I think that's achievable as far as core functionality goes. I expect both sides have some lesser-used functionality that might not make it into the stdlib version, but as long as it's possible to plug in a "real" IOLoop or Reactor when needed it should be OK.
It's a Future constructor, a (conditional) add_done_callback, plus the calls to set_result or set_exception and the with statement for error handling. In full:

    def future_wrap(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            future = Future()
            if kwargs.get('callback') is not None:
                future.add_done_callback(kwargs.pop('callback'))
            kwargs['callback'] = future.set_result
            def handle_error(typ, value, tb):
                future.set_exception(value)
                return True
            with ExceptionStackContext(handle_error):
                f(*args, **kwargs)
            return future
        return wrapper
Yes, tornado's gen.engine does the same thing here. However, the stakes are higher than "better opportunity for concurrency" - in an event loop if you call future.result() without yielding, you'll deadlock if that Future's task needs to run on the same event loop.
I think if you build generator handling directly into the event loop and use "yield from" for calls from one async function to another then you can get by without any decorators. But I'm not sure if you can do that and maintain any compatibility with existing non-generator async code. I think the ability to return from a generator is actually a bigger deal than "yield from" (and I only learned about it from another python-ideas thread today). The only reason a generator decorated with @tornado.gen.engine needs a callback passed in to it is to act as a pseudo-return, and a real return would prevent the common mistake of running the callback then falling through to the rest of the function. For concreteness, here's a crude sketch of what the APIs I'm talking about would look like in use (in a hypothetical future version of tornado).

    @future_wrap
    @gen.engine
    def async_http_client(url, callback):
        parsed_url = urlparse.urlsplit(url)
        # works the same whether the future comes from a thread pool or @future_wrap
        addrinfo = yield g_thread_pool.submit(socket.getaddrinfo,
                                              parsed_url.hostname, parsed_url.port)
        stream = IOStream(socket.socket())
        yield stream.connect((addrinfo[0][-1]))
        stream.write('GET %s HTTP/1.0' % parsed_url.path)
        header_data = yield stream.read_until('\r\n\r\n')
        headers = parse_headers(header_data)
        body_data = yield stream.read_bytes(int(headers['Content-Length']))
        stream.close()
        callback(body_data)

    # another function to demonstrate composability
    @future_wrap
    @gen.engine
    def fetch_some_urls(url1, url2, url3, callback):
        body1 = yield async_http_client(url1)
        # yield a list of futures for concurrency
        future2 = yield async_http_client(url2)
        future3 = yield async_http_client(url3)
        body2, body3 = yield [future2, future3]
        callback((body1, body2, body3))

One hole in this design is how to deal with callbacks that are run multiple times. For example, the IOStream read methods take both a regular callback and an optional streaming_callback (which is called with each chunk of data as it arrives). I think this needs to be modeled as something like an iterator of Futures, but I haven't worked out the details yet. -Ben
-- --Guido van Rossum (python.org/~guido)

On Sun, Oct 7, 2012 at 9:44 PM, Ben Darnell <ben@bendarnell.com> wrote:
Sure. And of course we can't expect Twisted and Tornado to just merge projects. They each have different strengths and weaknesses and they each have strong opinions on how things should be done. I do get your point that none of that is incompatible with a shared reactor specification.
Sounds good. I think a reactor is always going to be an extension of the shared spec. [...]
Hmm... I *think* it automatically adds a special keyword 'callback' to the *call* site so that you can do things like

    fut = some_wrapped_func(blah, callback=my_callback)

and then instead of using yield to wait for the callback, put the continuation of your code in the my_callback() function. But it also seems like it passes callback=future.set_result as the callback to the wrapped function, which looks to me like that function was apparently written before Futures were widely used. This seems pretty impure to me and I'd like to propose a "future" where such functions are either given the Future where the result is expected, or (more commonly) create the Future themselves. Unless I'm totally missing the programming model here. PS. I'd like to learn more about ExceptionStackContext() -- I've struggled somewhat with getting decent tracebacks in NDB.
That would depend on the semantics of the event loop implementation. In NDB's event loop, such a .result() call would just recursively enter the event loop, and you'd only deadlock if you actually have two pieces of code waiting for each other's completion. [...]
Ah, so you didn't come up with the clever hack of raising an exception to signify the return value. In NDB, you raise StopIteration (though it is given the alias 'Return' for clarity) with an argument, and the wrapper code that is responsible for the Future takes the value from the StopIteration exception and passes it to the Future's set_result().
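(Sketched, the wrapper side of that convention looks roughly like this -- names illustrative, resumption on yielded Futures elided:)

    class Return(StopIteration):
        """raise Return(value) inside a tasklet to 'return' value."""

    def finish(gen, future, value=None):
        try:
            gen.send(value)               # (waiting on yielded Futures elided)
        except StopIteration as e:
            # "raise Return(x)" -- or PEP 380's "return x" -- lands here;
            # the argument becomes the Future's result
            future.set_result(e.args[0] if e.args else None)
        except Exception as e:
            future.set_exception(e)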
And you need the thread pool because there's no async version of getaddrinfo(), right?
Why no yield in front of the write() call?
This second one is nearly identical to the way it's done in NDB. However I think you have a typo -- I doubt that there should be yields on the lines creating future2 and future3.
Ah. Yes, that's a completely different kind of thing, and probably needs to be handled in a totally different way. I think it probably needs to be modeled more like an infinite loop where at the blocking point (e.g. a low-level read() or accept() call) you yield a Future. Although I can see that this doesn't work well with the IOLoop's concept of file descriptor (or other event source) registration. -- --Guido van Rossum (python.org/~guido)

On Mon, Oct 8, 2012 at 8:30 AM, Guido van Rossum <guido@python.org> wrote:
Yes. Note that if you're passing in a callback you're probably going to just ignore the return value. The callback argument and the future return value are essentially two alternative interfaces; it probably doesn't make sense to use both at once (but as a library author it's useful to provide both).
Yes, it's impure and based on pre-Future patterns. The caller's callback argument and the inner function's callback are not really related any more (they were the same in pre-Future async code of course). They should probably have different names, although if the inner function's return value were passed via exception (StopIteration or return) the inner callback argument can just go away.
StackContext doesn't quite give you better tracebacks, although I think it could be adapted to do that. ExceptionStackContext is essentially a try/except block that follows you around across asynchronous operations - on entry it sets a thread-local state, and all the tornado asynchronous functions know to save this state when they are passed a callback, and restore it when they execute it. This has proven to be extremely helpful in ensuring that all exceptions get caught by something that knows how to do the appropriate cleanup (i.e. an asynchronous web page serves an error instead of just spinning forever), although it has turned out to be a little more intrusive and magical than I had originally anticipated. https://github.com/facebook/tornado/blob/master/tornado/stack_context.py
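(Very roughly -- this is not Tornado's actual implementation, just the shape of the idea: capture a thread-local handler when a callback is registered, restore it when the callback runs:)

    import threading

    _state = threading.local()

    def wrap(callback):
        handler = getattr(_state, 'handler', None)   # capture at registration
        def wrapped(*args, **kwargs):
            saved = getattr(_state, 'handler', None)
            _state.handler = handler                 # restore at execution
            try:
                callback(*args, **kwargs)
            except Exception as e:
                # handler returns True when it has dealt with the exception
                if handler is None or not handler(type(e), e, e.__traceback__):
                    raise
            finally:
                _state.handler = saved
        return wrapped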
Hmm, I think I'd rather deadlock. :) If the event loop is reentrant then the application code has to be coded defensively as if it were preemptively multithreaded, which introduces the possibility of deadlock or (probably) more subtle/less frequent errors. Reentrancy has been a significant problem in my experience, so I've been moving towards a policy where methods in Tornado that take a callback never run it immediately; callbacks are always scheduled on the next iteration of the IOLoop with IOLoop.add_callback.
I think I may have thought about "raise Return(x)" and dismissed it as too weird. But then, I'm abnormally comfortable with asynchronous code that passes callbacks around.
Right.
Because we don't need to wait for the write to complete before we continue to the next statement. write() doesn't return anything; it just succeeds or fails, and if it fails the next read_until will fail too. (although in this case it wouldn't hurt to have the yield either)
Right.
It works just fine at the IOLoop level: you call IOLoop.add_handler(fd, func, READ), and you'll get read events whenever there's new data until you call remove_handler(fd) (or update_handler). If you're passing callbacks around explicitly it's pretty straightforward (as much as anything ever is in that style) to allow for those callbacks to be run more than once. The problem is that generators more or less require that each callback be run exactly once. That's a generally desirable property, but the mismatch between the two layers can be difficult to deal with. -Ben
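(Concretely, with the IOLoop API of the time -- the example.com plumbing is illustrative:)

    import socket
    from tornado.ioloop import IOLoop

    sock = socket.socket()
    sock.connect(('example.com', 80))
    sock.send(b'GET / HTTP/1.0\r\n\r\n')
    sock.setblocking(False)
    chunks = []

    def on_readable(fd, events):
        # level-triggered: fires again and again while unread data remains
        data = sock.recv(4096)
        if data:
            chunks.append(data)          # a callback that runs many times
        else:
            io_loop.remove_handler(fd)   # peer closed; stop watching
            io_loop.stop()

    io_loop = IOLoop.instance()
    io_loop.add_handler(sock.fileno(), on_readable, IOLoop.READ)
    io_loop.start()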

Ben Darnell wrote:
This is something that generator-based coroutines using yield-from ought to handle a lot more cleanly. You should be able to just use an ordinary try-except block in your generator code and have it do the right thing. I hope that the new async core will be designed so that generator-based coroutines can be plugged into it directly and efficiently, without the need for a lot of decorators, callbacks, Futures, etc. in between. -- Greg

On Tue, Oct 9, 2012 at 2:11 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Indeed, in NDB this works great. However tracebacks don't work so great: If you don't catch the exception right away, it takes work to make the tracebacks look right when you catch it a few generator calls down on the (conceptual) stack. I fixed this to some extent in NDB, by passing the traceback explicitly along when setting an exception on a Future; before I did this, tracebacks looked awful. But there are still quite a few situations in NDB where an uncaught exception prints a baffling traceback, showing lots of frames from the event loop and other async machinery but not the user code that was actually waiting for anything. I have to study Tornado's StackContext to see if there are ideas there for improving this.
That has been my hope too. But so far when thinking about this recently I have found the goal elusive -- somehow it seems there *has* to be a distinction between an operation you just *yield* (this would be waiting for a specific low-level I/O operation) and something you use with yield-from, which returns a value through StopIteration. I keep getting a headache when I think about this, so there must be a Monad in there somewhere... :-( Perhaps you can clear things up by showing some detailed (but still simple enough) example code to handle e.g. a simple web client? -- --Guido van Rossum (python.org/~guido)

Guido van Rossum wrote:
Was this before or after the recent change that was supposed to improve tracebacks from yield-from chains? If there's still a problem after that, maybe exception handling in yield-from requires some more work.
But so far when thinking about this recently I have found the goal elusive --
You might like to take a look at this, where I develop a series of examples culminating in a simple multi-threaded server: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/yf_current/Exa... Code here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/yf_current/Exa...
It may be worth noting that nothing in my server example uses 'yield' to send or receive values -- yield is only used without argument as a suspension point. But the functions containing the yields *are* called with yield-from and may return values via StopIteration. So I think there are (at least) two distinct ways of using generators, but the distinction isn't quite the one you're making. Rather, we have "coroutines" (don't yield values, do return values) and "iterators" (do yield values, don't return values). Moreover, it's *only* the "coroutine" variety that we need to cater for when designing an async event system. Does that help to alleviate any of your monad-induced headaches? -- Greg
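(A toy version of that "coroutine" discipline -- bare yields as suspension points, values returned only through yield-from / StopIteration:)

    from collections import deque

    ready = deque()

    def schedule(gen):
        ready.append(gen)

    def run():
        while ready:
            gen = ready.popleft()
            try:
                next(gen)            # resume until the next bare yield
            except StopIteration:
                continue             # coroutine finished
            ready.append(gen)        # still alive: round-robin reschedule

    def subtask():
        yield                        # suspension point; no value in or out
        return 'result'              # comes back through yield-from (3.3+)

    def task(name):
        value = yield from subtask()
        print(name, 'got', value)

    schedule(task('a'))
    schedule(task('b'))
    run()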

On Tue, Oct 9, 2012 at 5:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Thanks for this link, it was very helpful to see it all come together from scratch. And I think the most compelling thing about it is something that I hadn't picked up on when I looked at "yield from" before, that it naturally preserves the call stack for exception handling. That's a big deal, and may be worth the requirement of 3.3+ since the tricks we've used to get better exception handling in earlier pythons have been pretty ugly. On the other hand, it does mean starting from scratch with a new asynchronous world that's not directly compatible with the existing Twisted or Tornado ecosystems. -Ben

On Tue, Oct 9, 2012 at 5:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Sadly it was with Python 2.5/2.7...
Definitely very enlightening. Though I think you should not use 'thread' since that term is already reserved for OS threads as supported by the threading module. In NDB I chose to use 'tasklet' -- while that also has other meanings, its meaning isn't fixed in core Python. You could also use task, which also doesn't have a core Python meaning. Just don't call it "process", never mind that Erlang uses this (a number of other languages rooted in old traditions do too, I believe). Also I think you can now revisit it and rewrite the code to use Python 3.3.
Code here:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/yf_current/Exa...
It does bother me somehow that you're not using .send() and yield arguments at all. I notice that you have a lot of three-line code blocks like this:

    block_for_reading(sock)
    yield
    data = sock.recv(1024)

The general form seems to be:

    arrange for a callback when some operation can be done without blocking
    yield
    do the operation

This seems to be begging to be collapsed into a single line, e.g.

    data = yield sock.recv_async(1024)

(I would also prefer to see the socket wrapped in an object that makes it hard to accidentally block.)
Yeah, but see my remark above...
But surely there's still a place for send() and other PEP 342 features?
Not entirely, no. I now have a fair amount of experience writing an async system and helping users make sense of its error messages, and there are some practical considerations. E.g. my users sometimes want to treat something as a coroutine but they don't have any yields in it (perhaps they are writing skeleton code and plan to fill in the I/O later). Example:

    def caller():
        data = yield from reader()

    def reader():
        return 'dummy'
        yield

works, but if you drop the yield it doesn't work. With a decorator I know how to make it work either way. -- --Guido van Rossum (python.org/~guido)
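(A sketch of such a decorator -- Future and drive() stand in for the Future class and the resuming loop shown earlier:)

    import types

    def tasklet(func):
        def wrapper(*args, **kwargs):
            fut = Future()
            result = func(*args, **kwargs)
            if isinstance(result, types.GeneratorType):
                drive(result, fut)       # resume on yields; StopIteration -> result
            else:
                fut.set_result(result)   # no yields at all: already complete
            return fut
        return wrapper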

On 10/11/2012 2:45 PM, Guido van Rossum wrote:
On Tue, Oct 9, 2012 at 5:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I read through this also and agree that using 'thread' for 'task', 'tasklet', 'microthread', or whatever is distracting. Part of the point, to me, is that the code does *not* use (OS) threads and the thread module. Tim Peters intended iterators, including generators, to be an alternative to what he viewed as 'inside-out' callback code. The idea was that pausing where appropriate allowed code that belongs together to be kept together. I find generator-based event loops to be somewhat easier to understand than callback-based loops. I certainly was more comfortable with Greg's example than what I have read about twisted. So I would like to see a generator-based system in the stdlib. -- Terry Jan Reedy

Guido van Rossum wrote:
Both good ideas. I'll see about publishing an updated version.
I wouldn't say I have a "lot". In the spamserver, there are really only three -- one for accepting a connection, one for reading from a socket, and one for writing to a socket. These are primitive operations that would be provided by an async socket library. Generally, all the yields would be hidden inside primitives like this. Normally, user code would never need to use 'yield', only 'yield from'. This probably didn't come through as clearly as it might have in my tutorial. Part of the reason is that at the time I wrote it, I was having to manually expand yield-froms into for-loops, so I was reluctant to use any more of them than I needed to. Also, yield-from was a new and unfamiliar concept, and I didn't want to scare people by overusing it. These considerations led me to push some of the yields slightly further up the layer stack than they could be.
I'm not sure how you're imagining that would work, but whatever it is, it's wrong -- that just doesn't make sense. What *would* make sense is

    data = yield from sock.recv_async(1024)

with sock.recv_async() being a primitive that encapsulates the block/yield/process triplet.
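(i.e., under Greg's scheme, with block_for_reading from his tutorial:)

    def recv_async(sock, size):
        block_for_reading(sock)   # tell the scheduler what we're waiting for
        yield                     # suspend until the socket is readable
        return sock.recv(size)    # delivered to the caller via yield-from

    # used as:  data = yield from recv_async(sock, 1024)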
(I would also prefer to see the socket wrapped in an object that makes it hard to accidentally block.)
It would be straightforward to make the primitives be methods of a socket wrapper object. I only used functions in the tutorial in the interests of keeping the amount of machinery to a bare minimum.
But surely there's still a place for send() and other PEP 342 features?
In the wider world of generator usage, yes. If you have a generator that it makes sense to send() things into, for example, and you want to factor part of it out into another function, the fact that yield-from passes through sent values is useful. But we're talking about a very specialised use of generators here, and so far I haven't thought of a use for sent or yielded values in this context that can't be done in a more straightforward way by other means. Keep in mind that a value yielded by a generator being used as part of a coroutine is *not* seen by code calling it with yield-from. Rather, it comes out in the inner loop of the scheduler, from the next() call being used to resume the coroutine. Likewise, any send() call would have to be made by the scheduler, not the yield-from caller. So, the send/yield channel is exclusively for communication with the *scheduler* and nothing else. Under the old way of doing generator-based coroutines, this channel was used to simulate a call stack by yielding 'call' and 'return' instructions that the scheduler interpreted. But all that is now taken care of by the yield-from mechanism, and there is nothing left for the send/yield channel to do.
If you're talking about a decorator that turns a function into a generator, I can't see anything particularly headachish about that. If you mean something else, you'll have to elaborate. -- Greg

Guido van Rossum wrote:
I just tried an experiment using Python 3.3. I modified the parse_request() function of my spamserver example to raise an exception that isn't caught anywhere:

    def parse_request(line):
        tokens = line.split()
        print(tokens)
        if tokens and tokens[0] == b"EGGS":
            raise ValueError("Server is allergic to eggs")
        ...

The resulting traceback looks like this. The last two lines show very clearly whereabouts the exception occurred in user code. So it all seems to work quite happily.

    Traceback (most recent call last):
      File "spamserver.py", line 73, in <module>
        run2()
      File "/Local/Projects/D/Python/YieldFrom/3.3/Examples/Scheduler/scheduler.py", line 109, in run2
        run()
      File "/Local/Projects/D/Python/YieldFrom/3.3/Examples/Scheduler/scheduler.py", line 53, in run
        next(g)
      File "spamserver.py", line 50, in handler
        n = parse_request(line)
      File "spamserver.py", line 61, in parse_request
        raise ValueError("Server is allergic to eggs")
    ValueError: Server is allergic to eggs

-- Greg

On Mon, Oct 8, 2012 at 10:12 PM, Ben Darnell <ben@bendarnell.com> wrote:
Definitely sounds like something that could be simplified if you didn't have backward compatibility baggage...
Heh. I'll try to mine it for gems.
The latter is a good tactic and I'm also using it. (Except for some reason we had to add the concept of "immediate callbacks" to our Future class, and those are run inside the set_result() call. But most callbacks don't use that feature.) I don't have a choice about making the event loop reentrant -- App Engine's underlying RPC multiplexing implementation *is* reentrant, and there is a large set of "classic" APIs that I cannot stop the user from calling that reenter it. But even if my hand wasn't forced, I'm not sure if I would make your choice. In NDB, there is a full complement of synchronous APIs that exactly matches the async APIs, and users are free to use the synchronous APIs in parts of their code where they don't need concurrency. Hence, every synchronous API just calls its async sibling and immediately waits for its result, which implicitly invokes the event loop. Of course, I have it easy -- multiple incoming requests are dispatched to separate threads by the App Engine runtime, so I don't have to worry about multiplexing at that level at all -- just end user code that is essentially single-threaded unless they go out of their way. I did end up debugging one user's problem where they were making a synchronous call inside an async handler, and -- very rarely! -- the recursive event loop calls kept stacking up until they hit a StackOverflowError. So I would agree that async code shouldn't make synchronous API calls; but I haven't heard yet from anyone who was otherwise hurt by the recursive event loop invocations -- in particular, nobody has requested locks. Still, this sounds like an important issue to revisit when discussing a standard reactor API as part of Lourens's PEP offensive.
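(The NDB pattern in miniature -- fetch/fetch_async are hypothetical names; get_result() is the blocking accessor that spins the event loop:)

    def fetch(key):
        # synchronous sibling: call the async version and wait, which
        # implicitly (re)enters the event loop
        return fetch_async(key).get_result()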
As I thought about the issue of how to spell "return a value" and looked at various approaches, I decided I definitely didn't like what monocle does: they let you say "yield X" where X is a non-Future value; and I saw some other solution (Twisted? Phillip Eby?) that simply called a function named something like returnValue(X). But I also wanted it to look like a control statement that ends a block (so auto-indenting editors would auto-dedent the next line), and that means there are only four choices: continue, break, raise or return. Three of those are useless... So the only choice really was which exception to raise. Fortunately I had the advantage of knowing that PEP 380 was going to implement "return X" from a generator as "raise StopIteration(X)" so I decided to be compatible with that.
I guess you have a certain kind of buffering built in to your stream? So if you make two write() calls without waiting in quick succession, does the system collapse these into one, or does it end up making two system calls, or what? In NDB, there's a similar issue with multiple RPCs that can be batched. I ended up writing an abstraction that automatically combines these; the call isn't actually made until there are no other runnable tasks. I've had to explain this a few times to users who try to get away with overlapping CPU work and I/O, but otherwise it's worked quite well.
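(A sketch of the auto-batching idea -- names hypothetical, and the combined call shown synchronously for brevity:)

    class AutoBatcher:
        def __init__(self, batch_rpc):
            self.batch_rpc = batch_rpc   # one RPC taking a list of args
            self.pending = []            # (arg, Future) pairs

        def add(self, arg):
            fut = Future()
            if not self.pending:
                event_loop.call_when_idle(self.flush)  # hypothetical idle hook
            self.pending.append((arg, fut))
            return fut

        def flush(self):
            # only reached when there are no other runnable tasks
            batch, self.pending = self.pending, []
            results = self.batch_rpc([arg for arg, _ in batch])
            for (_, fut), result in zip(batch, results):
                fut.set_result(result)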
Okay, I see that these are useful. However they feel like two very different classes of callbacks -- one that is called when a *specific* piece of I/O that was previously requested is done; another that will be called *whenever* a certain condition becomes true on a certain channel. The former would correspond to e.g. completion of the headers of an incoming HTTP request; the latter might correspond to a "listening" socket receiving another connection. -- --Guido van Rossum (python.org/~guido)

On Thu, Oct 11, 2012 at 3:28 PM, Guido van Rossum <guido@python.org> wrote:
Probably, although I still feel like callback-passing has its place. For example, I think the Tornado chat demo (https://github.com/facebook/tornado/blob/master/demos/chat/chatdemo.py) would be less clear with coroutines and Futures than it is now (although it would fit better into Greg's schedule/unschedule style). That doesn't mean that every method has to take a callback, but I'd be reluctant to get rid of them until we have more experience with the generator/future-focused style.
Tornado has a synchronous HTTPClient that does the same thing, although each fetch creates and runs its own IOLoop rather than spinning the top-level IOLoop. (This means it doesn't really make sense to run it when there is a top-level IOLoop; it's provided as a convenience for scripts and multi-threaded apps who want an HTTPRequest interface consistent with the async version).
I think that's because you don't have file descriptor support. In a (level-triggered) event loop if you don't drain the socket before reentering the loop then your read handler will be called again, which generally makes a mess. I suppose with coroutines you'd want edge-triggered instead of level-triggered though, which might make this problem go away.
Yes, IOStream does buffering for you. Each IOStream.write() call will generally result in a syscall, but once the outgoing socket buffer is full subsequent writes will be buffered in the IOStream and written when the IOLoop says the socket is writable. (the callback argument to write() can be used for flow control in this case) I used to defer the syscall until the IOLoop was idle to batch things up, but it turns out to be more efficient in practice to just write things out each time and let the higher level do its own buffering when appropriate. -Ben

On Thu, Oct 11, 2012 at 5:41 PM, Ben Darnell <ben@bendarnell.com> wrote:
Hmm... That's an interesting challenge. I can't quite say I understand that whole program yet, but I'd like to give it a try. I think it can be made clearer than Tornado with Futures and coroutines -- it all depends on how you define your primitives.
Totally understood. Though the nice thing of Futures is that you can tie callbacks to them *or* use them in coroutines.
I see. Yet another possible design choice.
Ah, good terminology. Coroutines definitely like being edge-triggered.
Makes sense. I think different people might want to implement slightly different IOStream-like abstractions; this would be a good test of the infrastructure. You should be able to craft one from scratch out of sockets and Futures, but there should be one or two standard ones as well, and they should all happily mix and match using the same reactor. -- --Guido van Rossum (python.org/~guido)

I'm not quite sure why Deferreds + @inlineCallbacks is more complicated than Futures + coroutines. They seem, at least from a high level perspective, quite similar. You mention that you can both attach callbacks and use them in coroutines: deferreds do pretty much exactly the same thing (that is, at least there's something to translate your coroutine into a sequence of callbacks/errbacks). If the arcane part of deferreds is from people writing ridiculous errback/callback chains, then I understand. Unfortunately people will write terrible code. cheers lvh
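(For comparison, the Twisted spelling of the coroutine style -- agent_get is a stand-in for any Deferred-returning call:)

    from twisted.internet import defer

    @defer.inlineCallbacks
    def fetch(url):
        body = yield agent_get(url)   # yield a Deferred, resume with its result
        defer.returnValue(body)       # pre-PEP-380 spelling of "return body"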

On Sun, Oct 7, 2012 at 9:01 PM, Guido van Rossum <guido@python.org> wrote:
Perhaps this is obvious to others, but (as hinted at above) there seem to be two primary issues with event handlers:

1) event handlers for the machine-program interface (ex. network I/O)
2) event handlers for the program-user interface (ex. mouse I/O)

While similar, my gut tells me they have to be handled in completely different ways in order to preserve order (i.e. sanity). This issue, for me, has come up with wanting to make a p2p network application with VPython. MarkJ

On Mon, Oct 8, 2012 at 12:20 PM, Mark Adam <dreamingforward@gmail.com> wrote:
Interesting. I agree that these are different in nature, but I think it would still be useful to have a single event loop ("reactor") that can multiplex them together. I think where the paths diverge is when it comes to the signature of the callback; for GUI events there is a certain standard structure that must be passed to the callback and which isn't readily available when you *specify* the callback. OTOH for your typical socket event the callback can just call the appropriate method on the socket once it knows the socket is ready. But still, in many cases I would like to see these all serialized in the same thread and multiplexed according to some kind of assigned or implied priorities, and IIRC, GUI events often are "collapsed" (e.g. multiple redraw events for the same window, or multiple mouse motion events). I also imagine the typical GUI event loop has hooks for integrating file descriptor polling, or perhaps it gives you a file descriptor to add to your select/poll/etc. map. Also, doesn't the Windows IOCP unify the two? -- --Guido van Rossum (python.org/~guido)

Mark Adam wrote:
They can't be *completely* different, because deep down there has to be a single event loop that can handle all kinds of asynchronous events. Upper layers can provide different APIs for them, but there has to be some commonality in the lowest layers. -- Greg

On Mon, Oct 8, 2012 at 10:56 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
There doesn't *have* to be - you could run a network event loop in one thread and a GUI event loop in another and pass control back and forth via methods like IOLoop.add_callback or Reactor.callFromThread. However, Twisted has Reactor implementations that are integrated with several different GUI toolkits' event loops, and while I haven't worked with such a beast my gut instinct is that in most cases a single shared event loop is the way to go. -Ben

On Tue, Oct 9, 2012 at 1:53 AM, Ben Darnell <ben@bendarnell.com> wrote:
No, this won't work. The key FAIL in that sentence is "...and pass control", because the O.S. has to be in charge of things that happen in user space. And everything in Python happens in user space. (hence my suggestion of creating a Python O.S.). MarkJ

On Wed, Oct 10, 2012 at 9:56 AM, Mark Adam <dreamingforward@gmail.com> wrote:
Letting the OS/GUI library have control of the UI thread is exactly the point I was making. Perhaps "pass control" was a little vague, but what I meant is that you'd have two threads, one for UI and one for networking. When you need to start a network operation from the UI thread you'd use IOLoop.add_callback() to pass a function to the network thread, and then when the network operation completes you'd use the analogous function from the UI library to send the response back and update the interface from the UI thread. -Ben
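(A sketch of that arrangement with Tkinter standing in for the UI toolkit; since Tkinter lacks a thread-safe "call soon" hook, the UI side polls a queue:)

    import threading, queue
    import tkinter as tk
    from tornado.ioloop import IOLoop

    root = tk.Tk()
    label = tk.Label(root, text='idle')
    label.pack()
    ui_queue = queue.Queue()              # network thread -> UI thread

    def poll_ui_queue():
        try:
            while True:
                ui_queue.get_nowait()()   # run pending UI updates
        except queue.Empty:
            pass
        root.after(50, poll_ui_queue)

    io_loop = IOLoop.instance()
    threading.Thread(target=io_loop.start, daemon=True).start()

    def start_fetch():                    # runs on the network thread
        result = 'fetched!'               # (real async work elided)
        ui_queue.put(lambda: label.config(text=result))

    # UI thread -> network thread: IOLoop.add_callback is thread-safe
    tk.Button(root, text='Fetch',
              command=lambda: io_loop.add_callback(start_fetch)).pack()

    poll_ui_queue()
    root.mainloop()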

Hi Ben,

On 08.10.2012 03:41, Ben Darnell wrote:
Python's standard library doesn't contain an interface to I/O Completion Ports. I think a common event loop system is a good reason to add IOCP if somebody is up for the challenge. Would you prefer an IOCP wrapper in the stdlib or your own version? Twisted has its own Cython based wrapper; some other libraries use a libevent-based solution. Christian

On Mon, 8 Oct 2012 13:04:00 -0400 Mike Graham <mikegraham@gmail.com> wrote:
Except that it's not exactly an equivalent, it's a whole different programming model ;) (but I understand what you mean: it allows to do non-blocking I/O on an arbitrary number of objects in parallel) Regards Antoine. -- Software development and contracting: http://pro.pitrou.net

On Mon, Oct 8, 2012 at 11:36 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Now I know what it is, I think that (a) the abstract reactor design should support IOCP, and (b) the stdlib should have IOCP enabled by default on Windows. -- --Guido van Rossum (python.org/~guido)

On 08.10.2012 20:40, Guido van Rossum wrote:
I've created a ticket for the topic: http://bugs.python.org/issue16175 Christian

On 08.10.2012 17:35, Guido van Rossum wrote:
I/O Completion Ports, http://en.wikipedia.org/wiki/IOCP It's a Windows (and apparently also Solaris) API for async IO that can handle multiple threads. Christian

On Mon, Oct 08, 2012 at 05:13:03PM -0700, Christian Heimes wrote:
And AIX, too. For every OS IOCP implementation, there's a corresponding Snakebite box :-)
API for async IO that can handle multiple threads.
I find it helps to think of it in terms of a half-sync/half-async pattern. The half-async part handles the I/O; the OS wakes up one of your "I/O" threads upon incoming I/O. The job of such threads is really just to pull/push the bytes from/to kernel/user space as quickly as it can. (Since Vista, Windows has provided a corresponding thread pool API that gels really well with IOCP. Windows will optimally manage threads based on incoming I/O; spawning/destroying threads as per necessary. You can even indicate to Windows whether your threads will be "compute" or I/O bound, which it uses to optimize its scheduling algorithm.) The half-sync part is the event-loop part of your app, which simply churns away on the data prepared for it by the async threads. What would be neat is if the half-async path could be run outside the GIL. They would need to be able to allocate memory that could then be "owned" by the GIL-holding half-sync part. You could leverage this with kqueue and epoll; have similar threads set up to simply process I/O independent of the GIL, using the same facilities that would be used by IOCP-processing threads. Then the "asyncore" event-loop simply becomes the half-sync part of the pattern, enumerating over all the I/O requests queued up for it by all the GIL-independent half-async threads. Trent.
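(Python can't run the half-async part outside the GIL, but the shape of the pattern, in Python terms:)

    import queue, threading

    completed = queue.Queue()             # async half -> sync half

    def io_thread(sock):
        # half-async: just move the bytes, then hand them off
        while True:
            data = sock.recv(4096)        # stands in for an IOCP completion
            if not data:
                break
            completed.put((sock, data))

    def event_loop(handle):
        # half-sync: churn through the data the I/O threads prepared
        while True:
            sock, data = completed.get()
            handle(sock, data)            # application logic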

On Wed, 10 Oct 2012 20:55:23 -0400 Trent Nelson <trent@snakebite.org> wrote:
Would you really win anything by doing I/O in separate threads, while doing normal request processing in the main thread? That said, the idea of a common API architected around async I/O, rather than non-blocking I/O, sounds interesting at least theoretically. Maybe all those outdated Snakebite Operating Systems are useful for something after all. ;-P cheers Antoine. -- Software development and contracting: http://pro.pitrou.net

On Thu, Oct 11, 2012 at 07:40:43AM -0700, Antoine Pitrou wrote:
If the I/O threads can run independent of the GIL, yes, definitely. The whole premise of IOCP is that the kernel takes care of waking one of your I/O handlers when data is ready. IOCP allows that to happen completely independent of your application's event loop. It really is the best way to do I/O. The Windows NT design team got it right from the start. The AIX and Solaris implementations are semantically equivalent to Windows, without the benefit of automatic thread pool management (and a few other optimisations). On Linux and BSD, you could get similar functionality by spawning I/O threads that could also run independent of the GIL. They would differ from the IOCP worker threads in the sense that they all have their own little event loops around epoll/kqueue+timeout. i.e. they have to continually ask "is there anything to do with this set of fds", then process the results, then manage set synchronisation. IOCP threads, on the other hand, wait for completion of something that has already been requested. The thread body implementation is significantly simpler, and no synchronisation primitives are needed.
That said, the idea of a common API architected around async I/O, rather than non-blocking I/O, sounds interesting at least theoretically.
It's the best way to do it. There should really be a libevent-type library (libiocp?) that leverages IOCP where possible, and fakes it when not, using a half-sync/half-async pattern with threads and epoll or kqueue on Linux and FreeBSD, falling back to processes and poll on everything else (NetBSD, OpenBSD and HP-UX (the former two not having robust-enough pthread implementations, the latter not having anything better than select or poll)). However, given that the best IOCP implementations are a) Windows by a huge margin, and then b) Solaris and AIX in equal, distant second place, I can't see that happening any time soon. (Trying to use IOCP in the reactor fashion described above for epoll and kqueue is far more limiting than having an IOCP-oriented API and faking it for platforms where native support isn't available.)
Maybe all those outdated Snakebite Operating Systems are useful for something after all. ;-P
All the operating systems are the latest version available! In addition, there's also a Solaris 9 and HP-UX 11iv2 box. The hardware, on the other hand... not so new in some cases. Trent.

On 08/10/2012 03:41 Ben Darnell wrote:
The caller of such a potentially blocking function could:

* spawn a new thread for the call
* call the function inside the thread and collect return value or exception
* register the thread (id) to inform the event loop (scheduler) it's waiting for its completion
* yield (aka "switch" in greenlet) to the event loop / scheduler
* upon continuation either continue with the result or reraise the exception that happened in the thread

Unfortunately on Unix systems select/poll/kqueue cannot specify threads as event resources, so an additional pipe descriptor would be needed for the scheduler to detect thread completions without blocking (threads would write to the pipe upon completion), not elegant but doable. Joachim
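(The pipe trick in miniature -- a Unix-only sketch:)

    import os, select, socket, threading

    r_fd, w_fd = os.pipe()
    results = {}

    def call_in_thread(key, func, *args):
        def worker():
            try:
                results[key] = ('ok', func(*args))
            except Exception as e:
                results[key] = ('err', e)
            os.write(w_fd, b'x')          # wake the select()-based scheduler
        threading.Thread(target=worker).start()

    call_in_thread('dns', socket.getaddrinfo, 'python.org', 80)

    # in the event loop: watch r_fd alongside the normal descriptors
    readable, _, _ = select.select([r_fd], [], [])
    if r_fd in readable:
        os.read(r_fd, 1)                  # drain one wakeup byte
        print(results.pop('dns'))         # resume the waiting task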

On Mon, Oct 8, 2012 at 6:34 AM, Joachim König <him@online.de> wrote:
Ben just posted an example of how to do exactly that for getaddrinfo().
However it must be done, this seems a useful thing to solve once and for all in a standard reactor specification and stdlib implementation. (Ditto for signal handlers BTW.) -- --Guido van Rossum (python.org/~guido)

On Sun, Oct 7, 2012 at 6:41 PM, Ben Darnell <ben@bendarnell.com> wrote:
Hi python-ideas,
I'm jumping in to this thread on behalf of Tornado.
Welcome!
Yes, yes. I tried to bring up thing distinction. I'm glad I didn't completely fail.
As long as it's not so low-level that other people shy away from it. I also have a feeling that one way or another this will require cooperation between the Twisted and Tornado developers in order to come up with a compromise that both are willing to conform to in a meaningful way. (Unfortunately I don't know how to define "meaningful way" more precisely here. I guess the idea is that almost all things *using* an event loop use the standardized abstract API without caring whether underneath it's Tornado, Twisted, or some simpler thing in the stdlib.
Agreed on both counts.
Yay!
Ditto for NDB (though there's a decorator that often takes care of the future construction).
That's interesting. I haven't found the need for this yet. Is it really so common that you can't write this as a Future() constructor plus a call to add_done_callback()? Or is there some subtle semantic difference?
In Tornado the Future is created by a decorator and hidden from the asynchronous function (it just sees the callback),
Hm, interesting. NDB goes the other way, the callbacks are mostly used to make Futures work, and most code (including large swaths of internal code) uses Futures. I think NDB is similar to monocle here. In NDB, you can do f = <some function returning a Future> r = yield f where "yield f" is mostly equivalent to f.result(), except it gives better opportunity for concurrency.
Yes! Same here. I am currently trying to understand if using "yield from" (and returning a value from a generator) will simplify things. For example maybe the need for a special decorator might go away. But I keep getting headaches -- perhaps there's a Monad involved. :-) -- --Guido van Rossum (python.org/~guido)

On Sun, Oct 7, 2012 at 7:01 PM, Guido van Rossum <guido@python.org> wrote:
As long as it's not so low-level that other people shy away from it.
That depends on the target audience. The low-level IOLoop and Reactor are pretty similar -- you can implement one in terms of the other -- but as you move up the stack cross-compatibility becomes harder. For example, if I wanted to implement tornado's IOStreams in twisted, I wouldn't start with the analogous class in twisted (Protocol?), I'd go down to the Reactor and build from there, so putting something IOStream or Protocol in asycore2 wouldn't do much to unify the two worlds. (it would help people build async stuff with the stdlib alone, but at that point it becomes more like a peer or competitor to tornado and twisted instead of a bridge between them)
I'd phrase the goal as being able to run both Tornado and Twisted in the same thread without any piece of code needing to know about both systems. I think that's achievable as far as core functionality goes. I expect both sides have some lesser-used functionality that might not make it into the stdlib version, but as long as it's possible to plug in a "real" IOLoop or Reactor when needed it should be OK.
It's a Future constructor, a (conditional) add_done_callback, plus the calls to set_result or set_exception and the with statement for error handling. In full: def future_wrap(f): @functools.wraps(f) def wrapper(*args, **kwargs): future = Future() if kwargs.get('callback') is not None: future.add_done_callback(kwargs.pop('callback')) kwargs['callback'] = future.set_result def handle_error(typ, value, tb): future.set_exception(value) return True with ExceptionStackContext(handle_error): f(*args, **kwargs) return future return wrapper
Yes, tornado's gen.engine does the same thing here. However, the stakes are higher than "better opportunity for concurrency" - in an event loop if you call future.result() without yielding, you'll deadlock if that Future's task needs to run on the same event loop.
I think if you build generator handling directly into the event loop and use "yield from" for calls from one async function to another then you can get by without any decorators. But I'm not sure if you can do that and maintain any compatibility with existing non-generator async code. I think the ability to return from a generator is actually a bigger deal than "yield from" (and I only learned about it from another python-ideas thread today). The only reason a generator decorated with @tornado.gen.engine needs a callback passed in to it is to act as a psuedo-return, and a real return would prevent the common mistake of running the callback then falling through to the rest of the function. For concreteness, here's a crude sketch of what the APIs I'm talking about would look like in use (in a hypothetical future version of tornado). @future_wrap @gen.engine def async_http_client(url, callback): parsed_url = urlparse.urlsplit(url) # works the same whether the future comes from a thread pool or @future_wrap addrinfo = yield g_thread_pool.submit(socket.getaddrinfo, parsed_url.hostname, parsed_url.port) stream = IOStream(socket.socket()) yield stream.connect((addrinfo[0][-1])) stream.write('GET %s HTTP/1.0' % parsed_url.path) header_data = yield stream.read_until('\r\n\r\n') headers = parse_headers(header_data) body_data = yield stream.read_bytes(int(headers['Content-Length'])) stream.close() callback(body_data) # another function to demonstrate composability @future_wrap @gen.engine def fetch_some_urls(url1, url2, url3, callback): body1 = yield async_http_client(url1) # yield a list of futures for concurrency future2 = yield async_http_client(url2) future3 = yield async_http_client(url3) body2, body3 = yield [future2, future3] callback((body1, body2, body3)) One hole in this design is how to deal with callbacks that are run multiple times. For example, the IOStream read methods take both a regular callback and an optional streaming_callback (which is called with each chunk of data as it arrives). I think this needs to be modeled as something like an iterator of Futures, but I haven't worked out the details yet. -Ben
-- --Guido van Rossum (python.org/~guido)

On Sun, Oct 7, 2012 at 9:44 PM, Ben Darnell <ben@bendarnell.com> wrote:
Sure. And of course we can't expect Twisted and Tornado to just merge projects. They each have different strengths and weaknesses and they each have strong opinions on how things should be done. I do get your point that none of that is incompatible with a shared reactor specification.
Sounds good. I think a reactor is always going to be an extension of the shared spec. [...]
Hmm... I *think* it automatically adds a special keyword 'callback' to the *call* site so that you can do things like fut = some_wrapped_func(blah, callback=my_callback) and then instead of using yield to wait for the callback, put the continuation of your code in the my_callback() function. But it also seems like it passes callback=future.set_result as the callback to the wrapped function, which looks to me like that function was apparently written before Futures were widely used. This seems pretty impure to me and I'd like to propose a "future" where such functions either be given the Future where the result is expected, or (more commonly) the function would create the Future itself. Unless I'm totally missing the programming model here. PS. I'd like to learn more about ExceptionStackContext() -- I've struggled somewhat with getting decent tracebacks in NDB.
That would depend on the semantics of the event loop implementation. In NDB's event loop, such a .result() call would just recursively enter the event loop, and you'd only deadlock if you actually have two pieces of code waiting for each other's completion. [...]
Ah, so you didn't come up with the clever hack of raising an exception to signify the return value. In NDB, you raise StopIteration (though it is given the alias 'Return' for clarity) with an argument, and the wrapper code that is responsible for the Future takes the value from the StopIteration exception and passes it to the Future's set_result().
And you need the thread pool because there's no async version of getaddrinfo(), right?
Why no yield in front of the write() call?
This second one is nearly identical to the way we it's done in NDB. However I think you have a typo -- I doubt that there should be yields on the lines creating future2 and future3.
Ah. Yes, that's a completely different kind of thing, and probably needs to be handled in a totally different way. I think it probably needs to be modeled more like an infinite loop where at the blocking point (e.g. a low-level read() or accept() call) you yield a Future. Although I can see that this doesn't work well with the IOLoop's concept of file descriptor (or other event source) registration. -- --Guido van Rossum (python.org/~guido)

On Mon, Oct 8, 2012 at 8:30 AM, Guido van Rossum <guido@python.org> wrote:
Yes. Note that if you're passing in a callback you're probably going to just ignore the return value. The callback argument and the future return value are essentially two alternative interfaces; it probably doesn't make sense to use both at once (but as a library author it's useful to provide both).
Yes, it's impure and based on pre-Future patterns. The caller's callback argument and the inner function's callback not really related any more (they were the same in pre-Future async code of course). They should probably have different names, although if the inner function's return value were passed via exception (StopIteration or return) the inner callback argument can just go away.
StackContext doesn't quite give you better tracebacks, although I think it could be adapted to do that. ExceptionStackContext is essentially a try/except block that follows you around across asynchronous operations - on entry it sets a thread-local state, and all the tornado asynchronous functions know to save this state when they are passed a callback, and restore it when they execute it. This has proven to be extremely helpful in ensuring that all exceptions get caught by something that knows how to do the appropriate cleanup (i.e. an asynchronous web page serves an error instead of just spinning forever), although it has turned out to be a little more intrusive and magical than I had originally anticipated. https://github.com/facebook/tornado/blob/master/tornado/stack_context.py
Hmm, I think I'd rather deadlock. :) If the event loop is reentrant then the application code has be coded defensively as if it were preemptively multithreaded, which introduces the possibility of deadlock or (probably) more subtle/less frequent errors. Reentrancy has been a significant problem in my experience, so I've been moving towards a policy where methods in Tornado that take a callback never run it immediately; callbacks are always scheduled on the next iteration of the IOLoop with IOLoop.add_callback.
I think I may have thought about "raise Return(x)" and dismissed it as too weird. But then, I'm abnormally comfortable with asynchronous code that passes callbacks around.
Right.
Because we don't need to wait for the write to complete before we continue to the next statement. write() doesn't return anything; it just succeeds or fails, and if it fails the next read_until will fail too. (although in this case it wouldn't hurt to have the yield either)
Right.
It works just fine at the IOLoop level: you call IOLoop.add_handler(fd, func, READ), and you'll get read events whenever there's new data until you call remove_handler(fd) (or update_handler). If you're passing callbacks around explicitly it's pretty straightforward (as much as anything ever is in that style) to allow for those callbacks to be run more than once. The problem is that generators more or less require that each callback be run exactly once. That's a generally desirable property, but the mismatch between the two layers can be difficult to deal with. -Ben

Ben Darnell wrote:
This is something that generator-based coroutines using yield-from ought to handle a lot more cleanly. You should be able to just use an ordinary try-except block in your generator code and have it do the right thing. I hope that the new async core will be designed so that generator-based coroutines can be plugged into it directly and efficiently, without the need for a lot of decorators, callbacks, Futures, etc. in between. -- Greg

On Tue, Oct 9, 2012 at 2:11 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Indeed, in NDB this works great. However tracebacks don't work so great: If you don't catch the exception right away, it takes work to make the tracebacks look right when you catch it a few generator calls down on the (conceptual) stack. I fixed this to some extent in NDB, by passing the traceback explicitly along when setting an exception on a Future; before I did this, tracebacks looked awful. But there are still StackContextquite a few situations in NDB where an uncaught exception prints a baffling traceback, showing lots of frames from the event loop and other async machinery but not the user code that was actually waiting for anything. I have to study Tornado's to see if there are ideas there for improving this.
That has been my hope too. But so far when thinking about this recently I have found the goal elusive -- somehow it seems there *has* to be a distinction between an operation you just *yield* (this would be waiting for a specific low-level I/O operation) and something you use with yield-from, which returns a value through StopIteration. I keep getting a headache when I think about this, so there must be a Monad in there somewhere... :-( Perhaps you can clear things up by showing some detailed (but still simple enough) example code to handle e.g. a simple web client? -- --Guido van Rossum (python.org/~guido)

Guido van Rossum wrote:
Was this before or after the recent change that was supposed to improve tracebacks from yield-fram chains? If there's still a problem after that, maybe exception handling in yield-from requires some more work.
But so far when thinking about this recently I have found the goal elusive --
You might like to take a look at this, where I develop a series of examples culminating in a simple multi-threaded server: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/yf_current/Exa... Code here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/yf_current/Exa...
It may be worth noting that nothing in my server example uses 'yield' to send or receive values -- yield is only used without argument as a suspension point. But the functions containing the yields *are* called with yield-from and may return values via StopIteration. So I think there are (at least) two distinct ways of using generators, but the distinction isn't quite the one you're making. Rather, we have "coroutines" (don't yield values, do return values) and "iterators" (do yield values, don't return values). Moreover, it's *only* the "coroutine" variety that we need to cater for when designing an async event system. Does that help to alleviate any of your monad-induced headaches? -- Greg
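To make the distinction concrete, a sketch assuming Python 3.3 and the block_for_reading primitive from Greg's tutorial:

    # "Coroutine" style: bare yield as a suspension point, value
    # delivered via return (i.e. StopIteration).
    def read_data(sock):
        block_for_reading(sock)   # ask the scheduler to watch sock
        yield                     # suspend; the scheduler resumes us
        return sock.recv(1024)    # the caller gets this via yield-from

    # "Iterator" style: yields values, returns nothing.
    def stripped_lines(f):
        for line in f:
            yield line.rstrip()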

On Tue, Oct 9, 2012 at 5:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Thanks for this link, it was very helpful to see it all come together from scratch. And I think the most compelling thing about it is something that I hadn't picked up on when I looked at "yield from" before, that it naturally preserves the call stack for exception handling. That's a big deal, and may be worth the requirement of 3.3+ since the tricks we've used to get better exception handling in earlier pythons have been pretty ugly. On the other hand, it does mean starting from scratch with a new asynchronous world that's not directly compatible with the existing Twisted or Tornado ecosystems. -Ben

On Tue, Oct 9, 2012 at 5:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Sadly it was with Python 2.5/2.7...
Definitely very enlightening. Though I think you should not use 'thread' since that term is already reserved for OS threads as supported by the threading module. In NDB I chose to use 'tasklet' -- while that also has other meanings, its meaning isn't fixed in core Python. You could also use task, which also doesn't have a core Python meaning. Just don't call it "process", never mind that Erlang uses this (a number of other languages rooted in old traditions do too, I believe). Also I think you can now revisit it and rewrite the code to use Python 3.3.
Code here:
http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/yf_current/Exa...
It does bother me somehow that you're not using .send() and yield arguments at all. I notice that you have a lot of three-line code blocks like this:

    block_for_reading(sock)
    yield
    data = sock.recv(1024)

The general form seems to be:

    arrange for a callback when some operation can be done without blocking
    yield
    do the operation

This seems to be begging to be collapsed into a single line, e.g.

    data = yield sock.recv_async(1024)

(I would also prefer to see the socket wrapped in an object that makes it hard to accidentally block.)
Yeah, but see my remark above...
But surely there's still a place for send() and other PEP 342 features?
Not entirely, no. I now have a fair amount of experience writing an async system and helping users make sense of its error messages, and there are some practical considerations. E.g. my users sometimes want to treat something as a coroutine but they don't have any yields in it (perhaps they are writing skeleton code and plan to fill in the I/O later). Example:

    def caller():
        data = yield from reader()

    def reader():
        return 'dummy'
        yield

works, but if you drop the yield it doesn't work. With a decorator I know how to make it work either way. -- --Guido van Rossum (python.org/~guido)
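One way such a decorator might look -- a sketch, not NDB's actual code:

    import functools
    import inspect

    def coroutine(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if inspect.isgenerator(result):
                return result
            # Plain function: wrap its return value in a trivial
            # generator so callers can always use "yield from".
            def trivial():
                return result
                yield            # unreachable; makes this a generator
            return trivial()
        return wrapper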

On 10/11/2012 2:45 PM, Guido van Rossum wrote:
On Tue, Oct 9, 2012 at 5:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I read through this also and agree that using 'thread' for 'task', 'tasklet', 'microthread', or whatever is distracting. Part of the point, to me, is that the code does *not* use (OS) threads and the thread module. Tim Peters intended iterators, including generators, to be an alternative to what he viewed as 'inside-out' callback code. The idea was that pausing where appropriate allowed code that belongs together to be kept together. I find generator-based event loops to be somewhat easier to understand than callback-based loops. I certainly was more comfortable with Greg's example than what I have read about twisted. So I would like to see a generator-based system in the stdlib. -- Terry Jan Reedy

Guido van Rossum wrote:
Both good ideas. I'll see about publishing an updated version.
I wouldn't say I have a "lot". In the spamserver, there are really only three -- one for accepting a connection, one for reading from a socket, and one for writing to a socket. These are primitive operations that would be provided by an async socket library. Generally, all the yields would be hidden inside primitives like this. Normally, user code would never need to use 'yield', only 'yield from'. This probably didn't come through as clearly as it might have in my tutorial. Part of the reason is that at the time I wrote it, I was having to manually expand yield-froms into for-loops, so I was reluctant to use any more of them than I needed to. Also, yield-from was a new and unfamiliar concept, and I didn't want to scare people by overusing it. These considerations led me to push some of the yields slightly further up the layer stack than they needed to be.
I'm not sure how you're imagining that would work, but whatever it is, it's wrong -- that just doesn't make sense. What *would* make sense is

    data = yield from sock.recv_async(1024)

with sock.recv_async() being a primitive that encapsulates the block/yield/process triplet.
(I would also prefer to see the socket wrapped in an object that makes it hard to accidentally block.)
It would be straightforward to make the primitives be methods of a socket wrapper object. I only used functions in the tutorial in the interests of keeping the amount of machinery to a bare minimum.
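For instance, a sketch reusing the block_for_reading/block_for_writing primitives from the tutorial and Python 3.3's return-from-generator:

    class AsyncSocket:
        def __init__(self, sock):
            self.sock = sock

        def recv_async(self, nbytes):
            block_for_reading(self.sock)
            yield                           # suspension point
            return self.sock.recv(nbytes)

        def send_async(self, data):
            block_for_writing(self.sock)
            yield
            return self.sock.send(data)

    # Inside a coroutine: data = yield from conn.recv_async(1024)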
But surely there's still a place for send() and other PEP 342 features?
In the wider world of generator usage, yes. If you have a generator that it makes sense to send() things into, for example, and you want to factor part of it out into another function, the fact that yield-from passes through sent values is useful. But we're talking about a very specialised use of generators here, and so far I haven't thought of a use for sent or yielded values in this context that can't be done in a more straightforward way by other means. Keep in mind that a value yielded by a generator being used as part of a coroutine is *not* seen by code calling it with yield-from. Rather, it comes out in the inner loop of the scheduler, from the next() call being used to resume the coroutine. Likewise, any send() call would have to be made by the scheduler, not the yield-from caller. So, the send/yield channel is exclusively for communication with the *scheduler* and nothing else. Under the old way of doing generator-based coroutines, this channel was used to simulate a call stack by yielding 'call' and 'return' instructions that the scheduler interpreted. But all that is now taken care of by the yield-from mechanism, and there is nothing left for the send/yield channel to do.
If you're talking about a decorator that turns a function into a generator, I can't see anything particularly headachish about that. If you mean something else, you'll have to elaborate. -- Greg

Guido van Rossum wrote:
I just tried an experiment using Python 3.3. I modified the parse_request() function of my spamserver example to raise an exception that isn't caught anywhere:

    def parse_request(line):
        tokens = line.split()
        print(tokens)
        if tokens and tokens[0] == b"EGGS":
            raise ValueError("Server is allergic to eggs")
        ...

The resulting traceback looks like this. The last two lines show very clearly whereabouts the exception occurred in user code. So it all seems to work quite happily.

    Traceback (most recent call last):
      File "spamserver.py", line 73, in <module>
        run2()
      File "/Local/Projects/D/Python/YieldFrom/3.3/Examples/Scheduler/scheduler.py", line 109, in run2
        run()
      File "/Local/Projects/D/Python/YieldFrom/3.3/Examples/Scheduler/scheduler.py", line 53, in run
        next(g)
      File "spamserver.py", line 50, in handler
        n = parse_request(line)
      File "spamserver.py", line 61, in parse_request
        raise ValueError("Server is allergic to eggs")
    ValueError: Server is allergic to eggs

-- Greg

On Mon, Oct 8, 2012 at 10:12 PM, Ben Darnell <ben@bendarnell.com> wrote:
Definitely sounds like something that could be simplified if you didn't have backward compatibility baggage...
Heh. I'll try to mine it for gems.
The latter is a good tactic and I'm also using it. (Except for some reason we had to add the concept of "immediate callbacks" to our Future class, and those are run inside the set_result() call. But most callbacks don't use that feature.) I don't have a choice about making the event loop reentrant -- App Engine's underlying RPC multiplexing implementation *is* reentrant, and there is a large set of "classic" APIs that I cannot stop the user from calling that reenter it. But even if my hand wasn't forced, I'm not sure if I would make your choice. In NDB, there is a full complement of synchronous APIs that exactly matches the async APIs, and users are free to use the synchronous APIs in parts of their code where they don't need concurrency. Hence, every synchronous API just calls its async sibling and immediately waits for its result, which implicitly invokes the event loop. Of course, I have it easy -- multiple incoming requests are dispatched to separate threads by the App Engine runtime, so I don't have to worry about multiplexing at that level at all -- just end user code that is essentially single-threaded unless they go out of their way. I did end up debugging one user's problem where they were making a synchronous call inside an async handler, and -- very rarely! -- the recursive event loop calls kept stacking up until they hit a StackOverflowError. So I would agree that async code shouldn't make synchronous API calls; but I haven't heard yet from anyone who was otherwise hurt by the recursive event loop invocations -- in particular, nobody has requested locks. Still, this sounds like an important issue to revisit when discussing a standard reactor API as part of Laurens's PEP offensive.
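The NDB pattern in miniature, as a sketch (get_async and the blocking get_result() are assumed from the description above):

    def get(key):
        # The sync API is just the async sibling plus a blocking wait;
        # get_result() spins the event loop -- which is exactly what
        # makes the reentrancy possible in the first place.
        future = get_async(key)
        return future.get_result()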
As I thought about the issue of how to spell "return a value" and looked at various approaches, I decided I definitely didn't like what monocle does: they let you say "yield X" where X is a non-Future value; and I saw some other solution (Twisted? Phillip Eby?) that simply called a function named something like returnValue(X). But I also wanted it to look like a control statement that ends a block (so auto-indenting editors would auto-dedent the next line), and that means there are only four choices: continue, break, raise or return. Three of those are useless... So the only choice really was which exception to raise. Fortunately I had the advantage of knowing that PEP 380 was going to implement "return X" from a generator as "raise StopIteration(X)" so I decided to be compatible with that.
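The equivalence in question, in Python 3.3 terms (note that PEP 479 later disallowed the explicit raise form, from Python 3.7 on):

    def reader():
        yield
        return 'data'                  # PEP 380 sugar ...

    def reader2():
        yield
        raise StopIteration('data')    # ... for this (pre-3.7 only)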
I guess you have a certain kind of buffering built in to your stream? So if you make two write() calls without waiting in quick succession, does the system collapse these into one, or does it end up making two system calls, or what? In NDB, there's a similar issue with multiple RPCs that can be batched. I ended up writing an abstraction that automatically combines these; the call isn't actually made until there are no other runnable tasks. I've had to explain this a few times to users who try to get away with overlapping CPU work and I/O, but otherwise it's worked quite well.
Okay, I see that these are useful. However they feel like two very different classes of callbacks -- one that is called when a *specific* piece of I/O that was previously requested is done; another that will be called *whenever* a certain condition becomes true on a certain channel. The former would correspond to e.g. completion of the headers of an incoming HTTP request; the latter might correspond to a "listening" socket receiving another connection. -- --Guido van Rossum (python.org/~guido)

On Thu, Oct 11, 2012 at 3:28 PM, Guido van Rossum <guido@python.org> wrote:
Probably, although I still feel like callback-passing has its place. For example, I think the Tornado chat demo (https://github.com/facebook/tornado/blob/master/demos/chat/chatdemo.py) would be less clear with coroutines and Futures than it is now (although it would fit better into Greg's schedule/unschedule style). That doesn't mean that every method has to take a callback, but I'd be reluctant to get rid of them until we have more experience with the generator/future-focused style.
Tornado has a synchronous HTTPClient that does the same thing, although each fetch creates and runs its own IOLoop rather than spinning the top-level IOLoop. (This means it doesn't really make sense to run it when there is a top-level IOLoop; it's provided as a convenience for scripts and multi-threaded apps that want an HTTPRequest interface consistent with the async version).
I think that's because you don't have file descriptor support. In a (level-triggered) event loop if you don't drain the socket before reentering the loop then your read handler will be called again, which generally makes a mess. I suppose with coroutines you'd want edge-triggered instead of level-triggered though, which might make this problem go away.
Yes, IOStream does buffering for you. Each IOStream.write() call will generally result in a syscall, but once the outgoing socket buffer is full subsequent writes will be buffered in the IOStream and written when the IOLoop says the socket is writable. (the callback argument to write() can be used for flow control in this case) I used to defer the syscall until the IOLoop was idle to batch things up, but it turns out to be more efficient in practice to just write things out each time and let the higher level do its own buffering when appropriate. -Ben
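A rough sketch of that flow control with the Tornado-2-era API (the host/port and payload size are arbitrary):

    import socket
    from tornado import ioloop, iostream

    def send_payload(host, port, payload):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        stream = iostream.IOStream(s)

        def on_flushed():
            # Fires only once the IOLoop has drained IOStream's buffer
            # into the socket -- a natural flow-control signal.
            stream.close()

        stream.connect((host, port),
                       lambda: stream.write(payload, callback=on_flushed))

    send_payload("localhost", 8888, b"x" * (10 * 1024 * 1024))
    ioloop.IOLoop.instance().start()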

On Thu, Oct 11, 2012 at 5:41 PM, Ben Darnell <ben@bendarnell.com> wrote:
Hmm... That's an interesting challenge. I can't quite say I understand that whole program yet, but I'd like to give it a try. I think it can be made clearer than Tornado with Futures and coroutines -- it all depends on how you define your primitives.
Totally understood. Though the nice thing of Futures is that you can tie callbacks to them *or* use them in coroutines.
I see. Yet another possible design choice.
Ah, good terminology. Coroutines definitely like being edge-triggered.
Makes sense. I think different people might want to implement slightly different IOStream-like abstractions; this would be a good test of the infrastructure. You should be able to craft one from scratch out of sockets and Futures, but there should be one or two standard ones as well, and they should all happily mix and match using the same reactor. -- --Guido van Rossum (python.org/~guido)

I'm not quite sure why Deferreds + @inlineCallbacks is more complicated than Futures + coroutines. They seem, at least from a high level perspective, quite similar. You mention that you can both attach callbacks and use them in coroutines: deferreds do pretty much exactly the same thing (that is, at least there's something to translate your coroutine into a sequence of callbacks/errbacks). If the arcane part of deferreds is from people writing ridiculous errback/callback chains, then I understand. Unfortunately people will write terrible code. cheers lvh

On Sun, Oct 7, 2012 at 9:01 PM, Guido van Rossum <guido@python.org> wrote:
Perhaps this is obvious to others, but (as hinted at above) there seem to be two primary issues with event handlers: 1) event handlers for the machine-program interface (ex. network I/O) 2) event handlers for the program-user interface (ex. mouse I/O) While similar, my gut tells me they have to be handled in completely different ways in order to preserve order (i.e. sanity). This issue, for me, has come up with wanting to make a p2p network application with VPython. MarkJ

On Mon, Oct 8, 2012 at 12:20 PM, Mark Adam <dreamingforward@gmail.com> wrote:
Interesting. I agree that these are different in nature, but I think it would still be useful to have a single event loop ("reactor") that can multiplex them together. I think where the paths diverge is when it comes to the signature of the callback; for GUI events there is a certain standard structure that must be passed to the callback and which isn't readily available when you *specify* the callback. OTOH for your typical socket event the callback can just call the appropriate method on the socket once it knows the socket is ready. But still, in many cases I would like to see these all serialized in the same thread and multiplexed according to some kind of assigned or implied priorities, and IIRC, GUI events often are "collapsed" (e.g. multiple redraw events for the same window, or multiple mouse motion events). I also imagine the typical GUI event loop has hooks for integrating file descriptor polling, or perhaps it gives you a file descriptor to add to your select/poll/etc. map. Also, doesn't the Windows IOCP unify the two? -- --Guido van Rossum (python.org/~guido)

Mark Adam wrote:
They can't be *completely* different, because deep down there has to be a single event loop that can handle all kinds of asynchronous events. Upper layers can provide different APIs for them, but there has to be some commonality in the lowest layers. -- Greg

On Mon, Oct 8, 2012 at 10:56 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
There doesn't *have* to be - you could run a network event loop in one thread and a GUI event loop in another and pass control back and forth via methods like IOLoop.add_callback or Reactor.callFromThread. However, Twisted has Reactor implementations that are integrated with several different GUI toolkits' event loops, and while I haven't worked with such a beast my gut instinct is that in most cases a single shared event loop is the way to go. -Ben

On Tue, Oct 9, 2012 at 1:53 AM, Ben Darnell <ben@bendarnell.com> wrote:
No, this won't work. The key FAIL in that sentence is "...and pass control", because the O.S. has to be in charge of things that happen in user space. And everything in Python happens in user space. (hence my suggestion of creating a Python O.S.). MarkJ

On Wed, Oct 10, 2012 at 9:56 AM, Mark Adam <dreamingforward@gmail.com> wrote:
Letting the OS/GUI library have control of the UI thread is exactly the point I was making. Perhaps "pass control" was a little vague, but what I meant is that you'd have two threads, one for UI and one for networking. When you need to start a network operation from the UI thread you'd use IOLoop.add_callback() to pass a function to the network thread, and then when the network operation completes you'd use the analogous function from the UI library to send the response back and update the interface from the UI thread. -Ben
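Sketched with Tornado (start_fetch, ui_schedule and update_interface are hypothetical; add_callback is the one IOLoop method documented as thread-safe):

    import threading
    from tornado.ioloop import IOLoop

    io_loop = IOLoop.instance()
    threading.Thread(target=io_loop.start).start()   # network thread

    def on_button_click():
        # UI thread: hand the network operation to the IOLoop thread.
        io_loop.add_callback(lambda: start_fetch(on_response))

    def on_response(result):
        # Network thread: bounce back to the UI thread via the GUI
        # toolkit's own hook (e.g. Tkinter's after, GLib.idle_add).
        ui_schedule(lambda: update_interface(result))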

Hi Ben, On 08.10.2012 03:41, Ben Darnell wrote:
Python's standard library doesn't contain an interface to I/O Completion Ports. I think a common event loop system is a good reason to add IOCP if somebody is up for the challenge. Would you prefer an IOCP wrapper in the stdlib or your own version? Twisted has its own Cython based wrapper; some other libraries use a libevent-based solution. Christian

On Mon, 8 Oct 2012 13:04:00 -0400 Mike Graham <mikegraham@gmail.com> wrote:
Except that it's not exactly an equivalent, it's a whole different programming model ;) (but I understand what you mean: it allows to do non-blocking I/O on an arbitrary number of objects in parallel) Regards Antoine. -- Software development and contracting: http://pro.pitrou.net

On Mon, Oct 8, 2012 at 11:36 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Now that I know what it is, I think that (a) the abstract reactor design should support IOCP, and (b) the stdlib should have IOCP enabled by default on Windows. -- --Guido van Rossum (python.org/~guido)

On 08.10.2012 20:40, Guido van Rossum wrote:
I've created a ticket for the topic: http://bugs.python.org/issue16175 Christian

On 08.10.2012 17:35, Guido van Rossum wrote:
I/O Completion Ports, http://en.wikipedia.org/wiki/IOCP It's a Windows (and apparently also Solaris) API for async IO that can handle multiple threads. Christian

On Mon, Oct 08, 2012 at 05:13:03PM -0700, Christian Heimes wrote:
And AIX, too. For every OS IOCP implementation, there's a corresponding Snakebite box :-)
API for async IO that can handle multiple threads.
I find it helps to think of it in terms of a half-sync/half-async pattern. The half-async part handles the I/O; the OS wakes up one of your "I/O" threads upon incoming I/O. The job of such threads is really just to pull/push the bytes from/to kernel/user space as quickly as it can. (Since Vista, Windows has provided a corresponding thread pool API that gels really well with IOCP. Windows will optimally manage threads based on incoming I/O; spawning/destroying threads as per necessary. You can even indicate to Windows whether your threads will be "compute" or I/O bound, which it uses to optimize its scheduling algorithm.) The half-sync part is the event-loop part of your app, which simply churns away on the data prepared for it by the async threads. What would be neat is if the half-async path could be run outside the GIL. They would need to be able to allocate memory that could then be "owned" by the GIL-holding half-sync part. You could leverage this with kqueue and epoll; have similar threads set up to simply process I/O independent of the GIL, using the same facilities that would be used by IOCP-processing threads. Then the "asyncore" event-loop simply becomes the half-sync part of the pattern, enumerating over all the I/O requests queued up for it by all the GIL-independent half-async threads. Trent.
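A toy threads-and-a-queue rendering of the pattern (plain Python, not actual IOCP, which the stdlib doesn't expose; 'handle' stands in for the application's callback):

    import queue
    import threading

    completed = queue.Queue()        # handoff between the two halves

    def half_async(sock):
        # I/O thread: just shuttle bytes from kernel to user space
        # as quickly as possible.
        while True:
            data = sock.recv(4096)
            completed.put((sock, data))
            if not data:
                return

    def half_sync(handle):
        # Event-loop part: churn through the data the I/O threads
        # have prepared for it.
        while True:
            sock, data = completed.get()
            handle(sock, data)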

On Wed, 10 Oct 2012 20:55:23 -0400 Trent Nelson <trent@snakebite.org> wrote:
Would you really win anything by doing I/O in separate threads, while doing normal request processing in the main thread? That said, the idea of a common API architected around async I/O, rather than non-blocking I/O, sounds interesting at least theoretically. Maybe all those outdated Snakebite Operating Systems are useful for something after all. ;-P cheers Antoine. -- Software development and contracting: http://pro.pitrou.net

On Thu, Oct 11, 2012 at 07:40:43AM -0700, Antoine Pitrou wrote:
If the I/O threads can run independent of the GIL, yes, definitely. The whole premise of IOCP is that the kernel takes care of waking one of your I/O handlers when data is ready. IOCP allows that to happen completely independent of your application's event loop. It really is the best way to do I/O. The Windows NT design team got it right from the start. The AIX and Solaris implementations are semantically equivalent to Windows, without the benefit of automatic thread pool management (and a few other optimisations). On Linux and BSD, you could get similar functionality by spawning I/O threads that could also run independent of the GIL. They would differ from the IOCP worker threads in the sense that they all have their own little event loops around epoll/kqueue+timeout. i.e. they have to continually ask "is there anything to do with this set of fds", then process the results, then manage set synchronisation. IOCP threads, on the other hand, wait for completion of something that has already been requested. The thread body implementation is significantly simpler, and no synchronisation primitives are needed.
That said, the idea of a common API architected around async I/O, rather than non-blocking I/O, sounds interesting at least theoretically.
It's the best way to do it. There should really be a libevent-type library (libiocp?) that leverages IOCP where possible, and fakes it when not, using a half-sync/half-async pattern with threads and epoll or kqueue on Linux and FreeBSD, falling back to processes and poll on everything else (NetBSD, OpenBSD and HP-UX (the former two not having robust-enough pthread implementations, the latter not having anything better than select or poll)). However, given that the best IOCP implementations are a) Windows by a huge margin, and then b) Solaris and AIX in equal, distant second place, I can't see that happening any time soon. (Trying to use IOCP in the reactor fashion described above for epoll and kqueue is far more limiting than having an IOCP-oriented API and faking it for platforms where native support isn't available.)
Maybe all those outdated Snakebite Operating Systems are useful for something after all. ;-P
All the operating systems are the latest version available! In addition, there's also a Solaris 9 and HP-UX 11iv2 box. The hardware, on the other hand... not so new in some cases. Trent.

On 08/10/2012 03:41 Ben Darnell wrote:
The caller of such a potentially blocking function could:

* spawn a new thread for the call
* call the function inside the thread and collect return value or exception
* register the thread (id) to inform the event loop (scheduler) it's waiting for its completion
* yield (aka "switch" in greenlet) to the event loop / scheduler
* upon continuation either continue with the result or reraise the exception that happened in the thread

Unfortunately on Unix systems select/poll/kqueue cannot specify threads as event resources, so an additional pipe descriptor would be needed for the scheduler to detect thread completions without blocking (threads would write to the pipe upon completion), not elegant but doable. Joachim
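A sketch of that scheme (the pipe setup and scheduler integration are assumed; getaddrinfo stands in for any blocking call):

    import os
    import socket
    import threading

    def call_in_thread(func, args, wakeup_fd, outcome):
        def runner():
            try:
                outcome['value'] = func(*args)
            except Exception as e:
                outcome['error'] = e
            os.write(wakeup_fd, b'x')    # wake the select()-based loop
        threading.Thread(target=runner).start()

    # Usage: r, w = os.pipe(); register r for reading with the
    # scheduler, then e.g.
    #     outcome = {}
    #     call_in_thread(socket.getaddrinfo, ('python.org', 80), w, outcome)
    # When r becomes readable, 'outcome' holds the value or the error.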

On Mon, Oct 8, 2012 at 6:34 AM, Joachim König <him@online.de> wrote:
Ben just posted an example of how to do exactly that for getaddrinfo().
However it must be done, this seems a useful thing to solve once and for all in a standard reactor specification and stdlib implementation. (Ditto for signal handlers BTW.) -- --Guido van Rossum (python.org/~guido)
participants (12)
- Antoine Pitrou
- Ben Darnell
- Christian Heimes
- Greg Ewing
- Guido van Rossum
- Joachim König
- Laurens Van Houtven
- Mark Adam
- Mike Graham
- Paul Moore
- Terry Reedy
- Trent Nelson