The async API of the future: Some thoughts from an ignorant Tornado user

(This is a response to GVR's Google+ post asking for ideas; I apologize in advance if I come off as an ignorant programming newbie.)

I am the author of Gate One (https://github.com/liftoff/GateOne/), which makes extensive use of Tornado's asynchronous capabilities. It also uses multiprocessing and threading to a lesser extent. The biggest issue I've had trying to write asynchronous code for Gate One is complexity. Complexity creates problems with expressiveness, which results in code that, to me, feels un-Pythonic. For evidence of this I present the retrieve_log_playback() function: http://bit.ly/W532m6 (link goes to Github).

All the function does is generate and return (to the client browser) an HTML playback of their terminal session recording. To do it efficiently, without blocking the event loop or slowing down all other connected clients, required loads of complexity (or maybe I'm just ignorant of "a better way"--feel free to enlighten me). In an ideal world I could have just done something like this:

    import async  # The API of the future ;)
    # tws == instance of tornado.web.WebSocketHandler that holds the open connection
    async.async_call(retrieve_log_playback, settings, tws, mechanism=multiprocessing)

...but instead I had to create an entirely separate function to act as the multiprocessing.Process(), create a multiprocessing.Queue() to shuffle data back and forth, watch a special file descriptor for updates (so I can tell when the task is complete), and also create a closure because the connection instance (aka 'tws') isn't pickleable.

After reading through these threads I feel much of the discussion is over my head, but as someone who will ultimately become a *user* of the "async API of the future" I would like to share my thoughts. My opinion is that the goal of any async module that winds up in Python's standard library should be simplicity and portability.
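To make that boilerplate concrete, here is a minimal sketch of the pattern being described. All names (render_playback, _worker, async_render) are hypothetical stand-ins rather than Gate One's actual code, it assumes a Unix host (the "fork" start method), and for brevity it blocks on the queue where a real server would watch a file descriptor from the event loop instead:

```python
import multiprocessing

# Assumes a Unix host; "fork" avoids re-importing the module in the child.
ctx = multiprocessing.get_context("fork")

def render_playback(log):
    # Stand-in for the real CPU-bound work (building the HTML playback).
    return "<html>%s</html>" % log

def _worker(queue, log):
    # Must be a separate top-level function so multiprocessing can pickle it.
    queue.put(render_playback(log))

def async_render(log, on_complete):
    # The boilerplate: a Queue to shuffle data back, a Process to run the
    # work, and a callback closure because the connection isn't pickleable.
    queue = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(queue, log))
    proc.start()
    result = queue.get()  # a real server would watch an fd, not block here
    proc.join()
    on_complete(result)

results = []
async_render("session log", results.append)
print(results[0])  # -> <html>session log</html>
```

Even in this stripped-down form, the work function, the worker wrapper, and the result-plumbing are three separate pieces that the wished-for async.async_call() would collapse into one call.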
In terms of features, here's my 'async wishlist':

* I should not have to worry about what is and isn't pickleable when I decide that a task should be performed asynchronously.
* I should be able to choose the type of event loop/async mechanism that is appropriate for the task: for CPU-bound tasks I'll probably want to use multiprocessing; for IO-bound tasks I might want to use threading; for a multitude of tasks that "just need to be async" (by nature) I'll want to use an event loop.
* Any async module should support 'basics' like calling functions at an interval and calling functions after a timeout occurs (with the ability to cancel).
* Asynchronous tasks should be able to access the same namespace as everything else. Maybe wishful thinking.
* It should support publish/subscribe-style events (i.e. an event dispatcher). For example, the ability to watch a file descriptor or socket for changes in state and call a function when that happens. Preferably with the flexibility to define custom events (i.e. don't have it tied to kqueue/epoll-specific events).

Thanks for your consideration; and thanks for the awesome language.

--
Dan McDougall - Chief Executive Officer and Developer
Liftoff Software ✈ Your flight to the cloud is now boarding.
904-446-8323
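As a stdlib-only illustration of the timeout-with-cancel 'basic' from the wishlist above (call_later is a made-up name here, not anyone's proposed API; an interval would just be a timer that re-arms itself):

```python
import threading
import time

def call_later(delay, func, *args):
    # Call func(*args) after `delay` seconds; the returned handle
    # supports .cancel() any time before the timeout fires.
    timer = threading.Timer(delay, func, args)
    timer.start()
    return timer

fired = []
call_later(0.01, fired.append, "ran")
handle = call_later(60, fired.append, "cancelled, never runs")
handle.cancel()  # the ability to cancel, per the wishlist
time.sleep(0.1)
print(fired)  # -> ['ran']
```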

On Sun, Oct 14, 2012 at 12:27 AM, Daniel McDougall < daniel.mcdougall@liftoffsoftware.com> wrote:
> (This is a response to GVR's Google+ post asking for ideas; I apologize
> in advance if I come off as an ignorant programming newbie)

-- snip snip snip --

> import async  # The API of the future ;)

Is this a CPU-bound problem?

> My opinion is that the goal of any async module that winds up in
> Python's standard library should be simplicity and portability.
> * I should not have to worry about what is and isn't pickleable when I
> decide that a task should be performed asynchronously.

Certainly. My above question is important, because this should only matter for IPC.
> * I should be able to choose the type of event loop/async mechanism
> that is appropriate for the task: For CPU-bound tasks I'll probably
> want to use multiprocessing. For IO-bound tasks I might want to use
> threading. For a multitude of tasks that "just need to be async" (by
> nature) I'll want to use an event loop.

Ehhh, maybe. This sounds like it confounds the tools for different use cases. You can quite easily have threads and processes on top of an event loop; that works out particularly nicely for processes, because you still have to talk to your processes. Examples:

- twisted.internet.reactor.spawnProcess (local processes)
- twisted.internet.threads.deferToThread (local threads)
- ampoule (remote processes)

It's quite easy to do blocking IO in a thread with deferToThread; in fact, that's how twisted's adbapi, an async wrapper to dbapi, works.

> * Any async module should support 'basics' like calling functions at
> an interval and calling functions after a timeout occurs (with the
> ability to cancel).
> * Asynchronous tasks should be able to access the same namespace as
> everything else. Maybe wishful thinking.

With twisted, this is already the case; general caveats for shared mutable state across threads of course still apply. Fortunately, in most Twisted apps that's a tiny fraction of the total code, and they tend to be fractions that are well-isolated or at least easily isolatable.
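The threads-on-top-of-an-event-loop idea mentioned above, the one deferToThread embodies, can be sketched with just the stdlib. This is an illustration of the pattern, not Twisted's API; the names and the single-queue "loop" are my own simplifications:

```python
import queue
import threading

callbacks = queue.Queue()  # stand-in for the event loop's callback queue

def defer_to_thread(func, args, on_result):
    # Run a blocking function in a worker thread; when it finishes,
    # schedule on_result(result) back on the "event loop" thread so the
    # application only ever touches results from one thread.
    def worker():
        result = func(*args)
        callbacks.put(lambda: on_result(result))
    threading.Thread(target=worker).start()

def run_loop_once(timeout=1.0):
    # The event loop picks up the scheduled callback and runs it itself.
    callbacks.get(timeout=timeout)()

results = []
defer_to_thread(sum, ([1, 2, 3],), results.append)
run_loop_once()
print(results)  # -> [6]
```

The shared-mutable-state caveat shows up here too: only the callback, running on the loop thread, touches `results`; the worker thread never does.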
> * It should support publish/subscribe-style events (i.e. an event
> dispatcher). For example, the ability to watch a file descriptor or
> socket for changes in state and call a function when that happens.

Like connectionMade, connectionLost, dataReceived, etc.?
-- cheers lvh

On Sun, Oct 14, 2012 at 5:32 AM, Laurens Van Houtven <_@lvh.cc> wrote:
> Is this a CPU-bound problem?

It depends on the host. On embedded platforms (e.g. the BeagleBone) it is more IO-bound than CPU-bound (fast CPU but slow disk and slow memory). On regular x86 systems it is mostly CPU-bound.
As I understand it, twisted.internet.reactor.spawnProcess is all about spawning subprocesses, akin to subprocess.Popen(). Also, it requires writing a sophisticated ProcessProtocol. It seems to be completely unrelated and wickedly complicated: the complete opposite of what I would consider ideal for an asynchronous library, since it is anything but simple. I mean, I could write a separate program to generate HTML playback files from logs, spawn a subprocess in an asynchronous fashion, then watch it for completion, but I could do that with termio.Multiplex (see: https://github.com/liftoff/GateOne/blob/master/gateone/termio.py). deferToThread() does what one would expect, but in many situations I'd prefer something like deferToMultiprocessing().
Oh, there are a hundred different ways to fire and catch events. I'll let the low-level async experts decide which is best. Having said that, it would be nice if the interface didn't use such network-specific naming conventions. I would prefer something more generic. It is fine if it uses sockets and whatnot in the background.

--
Dan McDougall - Chief Executive Officer and Developer
Liftoff Software ✈ Your flight to the cloud is now boarding.

On Sun, Oct 14, 2012 at 6:03 PM, Daniel McDougall < daniel.mcdougall@liftoffsoftware.com> wrote:
> deferToThread() does what one would expect but in many situations I'd
> prefer something like deferToMultiprocessing().
Twisted sort of has that with ampoule. The main issue is that arbitrary object serialization is pretty much impossible. Within threads, you sidestep that issue completely; across processes, you have to deal with serialization, leading to the issues with pickle you've mentioned.

> I would prefer something more generic.
So maybe something like what is popular in JS, where you subscribe to events by some string identifier? I personally use and like AngularJS' $broadcast, $emit and $on -- quite nice, but dependent on a hierarchical structure that seems to be missing here.
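A string-keyed publish/subscribe dispatcher of the kind being discussed fits in a few lines; this is an illustration only (the class and method names are made up, not a proposed API), flat rather than hierarchical like Angular's:

```python
from collections import defaultdict

class EventDispatcher:
    # Minimal publish/subscribe hub keyed by event-name strings.
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        # Subscribe handler to the given event name.
        self._handlers[event].append(handler)

    def emit(self, event, *args):
        # Call every subscribed handler; copy the list so handlers
        # can subscribe/unsubscribe during dispatch.
        for handler in list(self._handlers[event]):
            handler(*args)

events = EventDispatcher()
seen = []
events.on("connection_made", seen.append)
events.emit("connection_made", "client-1")
print(seen)  # -> ['client-1']
```

Because events are plain strings, nothing ties this to networking or to kqueue/epoll; "connection_made" above could just as well be a custom application event.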
-- cheers lvh
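The serialization limit discussed above is easy to demonstrate: plain data crosses a process boundary fine, but an object tied to an OS resource (like the open WebSocket connection 'tws') refuses to pickle. A quick stdlib check:

```python
import pickle
import socket

# Plain data pickles fine, so it can be shipped to another process.
assert pickle.loads(pickle.dumps({"log": "session"})) == {"log": "session"}

# An object wrapping an OS resource does not.
sock = socket.socket()
try:
    pickle.dumps(sock)
except Exception as exc:
    # Raises TypeError on modern CPython.
    print("unpicklable:", type(exc).__name__)
sock.close()
```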


On Sat, Oct 13, 2012 at 3:27 PM, Daniel McDougall <daniel.mcdougall@liftoffsoftware.com> wrote:
What you've described is very similar to the concurrent.futures.Executor.submit() method. ProcessPoolExecutor still has multiprocessing's pickle-related limitations, but other than that you're free to create ProcessPoolExecutors and/or ThreadPoolExecutors and submit work to them. Your retrieve_log_playback function could become:

    import concurrent.futures
    import tornado.ioloop

    # create a global/singleton ProcessPoolExecutor
    executor = concurrent.futures.ProcessPoolExecutor()

    def retrieve_log_playback(settings, tws=None):
        # set up the settings dict just like the original
        io_loop = tornado.ioloop.IOLoop.instance()
        future = executor.submit(_retrieve_log_playback, settings)
        def send_message(future):
            tws.write_message(future.result())
        future.add_done_callback(
            lambda f: io_loop.add_callback(lambda: send_message(f)))

In Tornado 3.0 there will be some native support for Futures - the last line will probably become "io_loop.add_future(future, send_message)". In _retrieve_log_playback you no longer have a queue argument; instead you just return the result normally. It's also possible to do this using multiprocessing instead of concurrent.futures - see multiprocessing.Pool.apply_async.

-Ben
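Outside of Tornado, the Executor pattern described above can be exercised with the stdlib alone. A ThreadPoolExecutor is used here so the snippet runs anywhere; ProcessPoolExecutor is a drop-in replacement when the submitted function and its arguments are picklable (render_playback is a made-up stand-in for the real work):

```python
import concurrent.futures

def render_playback(log):
    # Stand-in for the blocking/CPU-bound work.
    return "<html>%s</html>" % log

executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
messages = []

future = executor.submit(render_playback, "session log")
# add_done_callback fires once the work completes; a real server would
# bounce this back onto the event loop rather than act on it directly.
future.add_done_callback(lambda f: messages.append(f.result()))

executor.shutdown(wait=True)  # wait for the work (and callback) to finish
print(messages)  # -> ['<html>session log</html>']
```

Note that the done-callback runs on the worker thread, which is exactly why Tornado's io_loop.add_future hand-off in the snippet above matters in a real application.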

participants (3):

- Ben Darnell
- Daniel McDougall
- Laurens Van Houtven