Async context managers and iterators with tulip

On Sat, Dec 22, 2012 at 4:17 PM, guido.van.rossum <python-checkins@python.org> wrote:
Actually, I just realised that the following can work if the async lock is defined appropriately:

    with yield from async_lock: ...

The secret is that async_lock would need to be a coroutine rather than a context manager. *Calling* the coroutine would acquire the lock (potentially registering a callback that is scheduled when the lock is released) and return a context manager that released the lock. The async_lock itself wouldn't be a context manager, so you'd get an immediate error if you left out the "yield from".

We'd be heading even further down the path of two-languages-for-the-price-of-one if we did that, though (by which I mean the fact that async code and synchronous code exist in parallel universes - one, more familiar one, where the ability to block is assumed, as is the fact that any operation may give concurrent code the chance to execute, and the universe of Twisted, tulip, et al, where possible suspension points are required to be explicitly marked in the function where they occur).

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
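A minimal sketch of the idea, under stated assumptions: the names `AsyncLock`, `_Releaser`, `run` and `worker` are hypothetical, the lock is acquired through a generator-based `acquire()` coroutine rather than by calling the lock object itself, and the toy round-robin scheduler polls instead of registering a release callback as a real event loop would. Python's grammar requires parentheses around the `yield from` expression in the `with` statement.

```python
from collections import deque


class _Releaser:
    """Context manager handed back once the lock has been acquired."""

    def __init__(self, lock):
        self._lock = lock

    def __enter__(self):
        return self._lock

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False  # never swallow exceptions


class AsyncLock:
    """Hypothetical lock whose acquire() is a generator-based coroutine."""

    def __init__(self):
        self.locked = False

    def acquire(self):
        while self.locked:
            yield  # suspension point: let the scheduler run other tasks
        self.locked = True
        return _Releaser(self)  # becomes the value of "yield from"

    def release(self):
        self.locked = False


def run(tasks):
    """Toy round-robin scheduler; stands in for a real event loop."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        queue.append(task)


def worker(name, lock, log):
    with (yield from lock.acquire()):
        log.append((name, "enter"))
        yield  # suspend while still holding the lock
        log.append((name, "exit"))


log = []
lock = AsyncLock()
run([worker("a", lock, log), worker("b", lock, log)])
```

Leaving out the `yield from` hands the `with` statement a bare generator object, which has no `__enter__`, so the mistake fails immediately instead of silently running unlocked.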

I can't quite tell by the wording if you consider two-languages-for-the-price-of-one a good thing or a bad thing; but I can tell you that at least in Twisted, explicit suspension points have been a definite boon :) While it may lead to issues in some things (e.g. new users using blocking urllib calls in a callback), I find the net result much easier to read and reason about. cheers, lvh

On Sat, Dec 22, 2012 at 10:26 PM, Laurens Van Houtven <_@lvh.cc> wrote:
On balance, I consider it better than offering only greenlet-style implicit switching (which is effectively equivalent to preemptive threading, since any function call or operator may suspend the task). I'm also a lot happier about it since realising that the model of emitting futures and using "yield from f" where synchronous code would use "f.result()" helps unify the two worlds. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
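To make the "yield from f" versus "f.result()" parallel concrete, here is a small sketch. The `Future` class below is a hypothetical stand-in written for illustration, not tulip's actual class; it only assumes that iterating a future suspends until a result is set.

```python
class Future:
    """Hypothetical minimal future usable from sync and coroutine code."""

    _PENDING = object()

    def __init__(self):
        self._value = Future._PENDING

    def done(self):
        return self._value is not Future._PENDING

    def set_result(self, value):
        self._value = value

    def result(self):
        # Synchronous spelling: only valid once the result is available.
        if not self.done():
            raise RuntimeError("result is not ready")
        return self._value

    def __iter__(self):
        # Coroutine spelling: "yield from f" suspends until done.
        while not self.done():
            yield self
        return self.result()


def double(f):
    value = yield from f  # where sync code would write f.result()
    return value * 2


f = Future()
coro = double(f)
pending = next(coro)  # coroutine is suspended, waiting on f
f.set_result(21)
try:
    next(coro)
except StopIteration as stop:
    doubled = stop.value  # return value of the coroutine
```

Both spellings read the same value from the same object; the only difference is whether the caller is allowed to be suspended while waiting.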

On Sat, Dec 22, 2012 at 4:57 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I wouldn't go so far as to call that unifying, but it definitely helps people transition. Still, from experience with introducing NDB's async in some internal App Engine software, it takes some getting used to even for the best of developers. But it is worth it. -- --Guido van Rossum (python.org/~guido)

On Sat, Dec 22, 2012 at 1:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Syntactically you'd have to say with (yield from async_lock): ....
Very nice.
It's inevitable that some patterns work well together while others don't. I see no big philosophical problem with this. Pragmatically, we'll have plenty of places where existing stdlib modules can't be used with tulip, and the tulip-compatible upgrade will have a different API. (The trickiest part will be that the classic code, e.g. urllib, must work in any thread and cannot rely on the existence of an event loop. *Maybe* you can get by with get_event_loop().run_until_complete(<future>) but that might still depend on the default event loop policy. Food for thought.) -- --Guido van Rossum (python.org/~guido)
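A rough sketch of the run_until_complete() bridge, written against asyncio (the standard-library module that tulip later became) and the later `async def` syntax. `fetch_async` and `fetch_sync` are hypothetical names. Creating a private loop per call is one assumed way to avoid depending on the default event loop policy or on a loop already existing in the calling thread.

```python
import asyncio


async def fetch_async(text):
    # Placeholder for real non-blocking I/O.
    await asyncio.sleep(0)
    return text.upper()


def fetch_sync(text):
    """Blocking wrapper that classic, thread-agnostic code could call."""
    loop = asyncio.new_event_loop()  # private loop, no global state needed
    try:
        return loop.run_until_complete(fetch_async(text))
    finally:
        loop.close()


result = fetch_sync("tulip")
```

This only works when no event loop is already running in the calling thread, which is exactly the situation classic blocking code is in.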

On Sat, Dec 22, 2012 at 10:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: [...]
The two languages/parallel universes (sync and async) is a big concern IMHO. I looked at a greenlet-based program that I'm writing and I'm using call stacks that are 10 deep or so. I would need to change all these layers from the scheduler down to use yield-from to make my program async.

The higher levels are typically application-specific and could decide to either be sync or async. For the lower levels (e.g. transports and protocols): those are typically library code and you'd need two versions. The latter can amount to quite a bit of duplication: there's a lot of protocol code currently in the standard library.

I wonder if the greenlet idea was thrown out too early. If I understand the discussion correctly, the #1 disadvantage that was identified is that calling code does not know if called code will switch or not. Therefore it doesn't know whether to lock, and where.

What about the following (straw man) approach to fix that issue using greenlets: functions can state if they are safe with regard to switching using a decorator. The default is off (non-safe). When at some point in the call graph you need to switch, you only do this if all frames starting from the current one up to the scheduler are async-safe. This should be achievable without any language changes.

Usually the upper layers in a concurrent program are connection handlers. These can be marked safe quite easily as they usually only use local state tied to the connection and are not called from other connections. Any code that they call would need to be explicitly marked async-safe, otherwise it could block.

I think the straw man above is identical to the current yield-from approach in safety because there is no automatic asynchronicity. However, this approach has the benefit that there can be one implementation of lower layers (protocols and transports) that supports both sync and async, and higher layers can use the natural calling syntax that they are currently used to.

Also, making a program async can be an incremental process, and you could use e.g. a sys.settrace() handler to identify spots where safe code calls into unsafe code.

Regards, Geert
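The frame check in the straw man could look roughly like this. It is a sketch only: `async_safe`, `scheduler_entry` and `can_switch` are hypothetical names, the stack walk relies on the CPython-specific `sys._getframe()`, and no greenlet switch is actually performed; the code just reports which decision would be taken.

```python
import sys

_SAFE_CODE = set()       # code objects of functions declared async-safe
_SCHEDULER_CODE = set()  # code objects of scheduler entry points


def async_safe(func):
    """Mark func as safe to be suspended at any call it makes."""
    _SAFE_CODE.add(func.__code__)
    return func


def scheduler_entry(func):
    """Mark func as the scheduler frame where the stack walk stops."""
    _SCHEDULER_CODE.add(func.__code__)
    return func


def can_switch():
    """True only if every frame up to the scheduler is async-safe."""
    frame = sys._getframe(1)  # start at the caller's frame
    while frame is not None:
        if frame.f_code in _SCHEDULER_CODE:
            return True
        if frame.f_code not in _SAFE_CODE:
            return False  # default is off (non-safe)
        frame = frame.f_back
    return False  # no scheduler on the stack at all


@async_safe
def read_data():
    # A real version would switch greenlets here, or block if not allowed.
    return "switch" if can_switch() else "block"


@async_safe
def safe_handler():
    return read_data()


def legacy_helper():  # not marked, so treated as unsafe
    return read_data()


@async_safe
def handler_using_legacy():
    return legacy_helper()


@scheduler_entry
def scheduler(handler):
    return handler()


via_safe = scheduler(safe_handler)
via_legacy = scheduler(handler_using_legacy)
```

The same library function, `read_data`, serves both cases: it would switch under a fully marked call chain and fall back to blocking as soon as one unmarked frame sits between it and the scheduler.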

On Sun, 23 Dec 2012 12:06:31 +0100 Geert Jansen <geertj@gmail.com> wrote:
Protocols written using a callback style (data_received(), etc.), as pointed out by Laurens, can be used with both blocking and non-blocking coding styles. Only the transports would need to be duplicated, but that's expected. Regards Antoine.
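As an illustration of that split, here is a sketch in which one callback-style protocol is fed by two different drivers. `LineProtocol`, `blocking_transport` and `async_transport` are hypothetical names; the async driver uses asyncio (what tulip later became) and a list of chunks in place of a real socket.

```python
import asyncio
import io


class LineProtocol:
    """Callback-style protocol: knows nothing about how bytes arrive."""

    def __init__(self):
        self.buffer = b""
        self.lines = []

    def data_received(self, data):
        self.buffer += data
        # Everything before the last newline is complete lines.
        *complete, self.buffer = self.buffer.split(b"\n")
        self.lines.extend(complete)


def blocking_transport(stream, protocol, chunk_size=4):
    """Blocking driver: read() may block the calling thread."""
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        protocol.data_received(data)


async def async_transport(chunks, protocol):
    """Non-blocking driver: yields to the event loop between chunks."""
    for data in chunks:
        await asyncio.sleep(0)  # stands in for awaiting socket readiness
        protocol.data_received(data)


sync_proto = LineProtocol()
blocking_transport(io.BytesIO(b"one\ntwo\nthree\n"), sync_proto)

async_proto = LineProtocol()
asyncio.run(async_transport([b"on", b"e\ntwo\nth", b"ree\n"], async_proto))
```

Only the two driver functions differ; the protocol class is shared unchanged, which is the duplication boundary described above.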

On Sun, Dec 23, 2012 at 9:06 PM, Geert Jansen <geertj@gmail.com> wrote:
Greenlets aren't going anywhere. The thing is that "asynchronous programming" is used to describe both an execution model that's limited by the number of concurrent I/O operations rather than the number of OS-level threads as well as a programming model based on cooperative (rather than preemptive) multi-threading.

Greenlets are designed to provide the scaling benefits of I/O-limited concurrency while continuing to use a preemptive multi-threading programming model where any operation is permitted to block the thread of execution (implicitly switching to another thread at the lowest layer). That's *wonderful* for getting the scaling benefits of the execution model without needing to rewrite a program to use a drastically different programming model.

PEP 3156, on the other hand, is about providing the cooperative multi-threading *programming* model. Greenlets can't do that, because they're not intended to. However, gevent/greenlets will still benefit from the explicit asynchronous APIs in the future, as those protocols and transports will be usable by the *networking* side of gevent. And that's a key part of the aim here - reducing the duplication of effort between gevent/Twisted/Tornado/et al by eventually allowing them to share more of the event-driven protocol stacks.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

participants (5): Antoine Pitrou, Geert Jansen, Guido van Rossum, Laurens Van Houtven, Nick Coghlan