PEP draft - Composable futures for reactive programming

Hi all,

I would very much appreciate your opinion on my proposal for improving the *concurrent.futures* package.

Compared to other languages such as Scala and C#, Python’s futures fall significantly behind in functionality, especially in the ability to chain computations and compose different futures without blocking and waiting for the result. New packages continue to emerge (*asyncio*) that provide their own futures implementations, making composition even more difficult.

The proposed improvement implements a Scala-like Future as a monadic construct. It allows performing multiple kinds of operations on a Future’s result without blocking, enabling reactive programming in Python. It implements the common pattern of separating the *Future* and *Promise* interfaces, making it very easy for third-party systems to use futures in their APIs.

Please have a look at this PEP draft <https://rawgithub.com/mikhtonyuk/rxpython/master/pep-0000.html> and the reference implementation <https://github.com/mikhtonyuk/rxpython> (as a separate library).

I’m very interested in:

- How PEPable is this?
- What are your thoughts on backward compatibility (the current implementation does not sacrifice any design points for it, but better compatibility can be achieved)?
- Thoughts on Future-based APIs in other packages?

Thanks,
Sergii

Hi! On Sat, Dec 21, 2013 at 03:14:05PM +0200, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
Please have a look at this PEP draft<https://rawgithub.com/mikhtonyuk/rxpython/master/pep-0000.html>,
It would be nice to have a text version of the PEP posted to the mailing list.
and reference implementation <https://github.com/mikhtonyuk/rxpython> (as separate library).
May I advise you to add it to the References section of the PEP? Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On 21 December 2013 23:14, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
This looks like a really interesting idea, and well worth pursuing as a PEP for 3.5. (The failure to make the Future/Promise split was actually noted as a flaw in the concurrent.futures design at the time, but wasn't seen as serious enough to prevent inclusion of the module. However, this draft PEP does a decent job of showing the kinds of operations that the current combined class design can make difficult.)
Yes, I think better compatibility is needed. One possible option would be to take the path of defining a "concurrent.futures.abc" module that provides this higher level API, and largely leave the existing concrete classes alone (aside from adjusting them to fit the ABC). The main possible issue I see is with the result setting APIs on the existing Future objects. There are also a lot of staticmethod declarations that look like they should probably be classmethod declarations. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

I'm pretty worried about this proposal. It seems to encourage callback-style programming, whereas I would like to get away from it (that's the main thrust of asyncio/PEP 3156). The proposal reminds me of Twisted Deferred, which is too complicated for my taste. Note that asyncio has a composition operation of its own, gather(), but encourages using standard try/except/else/finally syntax for error handling, rather than adding Twisted-style errbacks. Nevertheless it is possible that I am misunderstanding the proposal (e.g. the distinction between Future and Promise wasn't clear from reading the draft PEP, nor how it would interact with asyncio). I recommend that the author find a way to improve the English grammar of the proposal, e.g. by working with a native speaker. On Sat, Dec 21, 2013 at 6:12 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
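The style described above, composing with gather() and handling failures with ordinary try/except, can be sketched as follows. This is an illustration rather than code from the thread: `fetch` and `main` are hypothetical names, and the sketch uses the later async/await syntax in place of the generator-based `yield from` coroutines discussed here.

```python
import asyncio

async def fetch(n):
    # Hypothetical stand-in for real asynchronous work.
    await asyncio.sleep(0.01)
    if n < 0:
        raise ValueError("negative input")
    return n * 10

async def main():
    # gather() composes several awaitables and keeps results in argument order.
    results = await asyncio.gather(fetch(1), fetch(2))
    # Errors surface through normal exception handling, not errbacks.
    try:
        await fetch(-1)
    except ValueError as exc:
        error = str(exc)
    else:
        error = None
    return results, error

print(asyncio.run(main()))
```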

Hello Guido, Nick,

Thanks a lot for your valuable feedback.

Indeed there is a lot of overlap between the *asyncio* and *concurrent.futures* packages, so it would be very interesting to hear your overall thoughts on the role and future of the *concurrent* package. Do you consider it rudimentary and completely replaceable by *asyncio.Future*?

I think the question is not so much about which implementation you prefer, but rather whether we want futures as a stand-alone package or not. Do you see all these implementations converging in future?

To me *Future* is a very simple and self-contained primitive, independent of thread pools, processes, IO, and networking. One thing the *concurrent.futures* package does a very good job at is defining an isolated namespace for futures, stressing this clear boundary (let’s disregard here the *ThreadPoolExecutor* and *ProcessPoolExecutor* classes, which for some reason ended up in it too).

So when I think of *scheduler* and *event loop* implementations, I see them as ones that build on top of the *Future* primitive, not providing it as part of their implementation. What I think is important is that having a unified *Future* class simplifies interoperability between different kinds of schedulers, not necessarily associated with *asyncio* event loops (process pools, for example).

*Futures* are definitely not the final solution for the concurrency problem but rather a well-established utility for representing async operations. It is the job of higher-layer systems to provide more convenient ways of dealing with asyncs (*yield from* coroutines, *async/await* rewrites etc.), so I would not say that futures encourage callback-style programming, it’s simply a lower-layer functionality.

On the contrary, monadic methods for futures composition (e.g. *map()*, *all()*, *first()* etc.) ensure that no errors would be lost in the process, so I think they would complement the *yield from* model quite nicely by hiding the complexity of state maintenance from the user and reducing the number of back-and-forth communications between the event loop and coroutines.

Besides *Futures*, reactive programming has more utilities to offer, such as *Observables* (representing asynchronous streams of values). It is also a very useful abstraction with a rich set of composition strategies (merging, concatenation, grouping), and may deserve its place in a separate package.

I hope to hear back from you to get a better picture of the overall design direction before jumping to implementation details.

Brief follow-up to your questions:

- The idea behind the Future/Promise separation is to draw a clean line between the client-facing and scheduler-side APIs respectively. Futures should not expose any completion methods, which clients should not call anyway.
- I completely agree with you that Twisted-style callbacks are evil and that it is better to have a single code path for getting a result or raising an exception.
- Sorry for the bad grammar in the proposal; it’s an early draft written at 3 AM, so I will definitely improve on it if we decide to move forward.

Thanks,
Sergii

On Sat, Dec 21, 2013 at 7:59 PM, Guido van Rossum <guido@python.org> wrote:
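To make the composition idea concrete, here is a minimal sketch of a map-style operation built only on the existing concurrent.futures.Future API. The helper name `map_future` is hypothetical and is not the draft PEP's actual interface; it only shows how a derived future can carry either the transformed result or the error, so that a failure is not silently dropped.

```python
from concurrent.futures import Future

def map_future(source, fn):
    """Return a new Future that will hold fn(source.result()), without blocking."""
    target = Future()

    def _on_done(done):
        try:
            # done.result() re-raises any failure of the source future.
            target.set_result(fn(done.result()))
        except Exception as exc:
            # Failures of the source or of fn propagate to the derived future.
            target.set_exception(exc)

    source.add_done_callback(_on_done)
    return target

source = Future()
doubled = map_future(source, lambda x: x * 2)
source.set_result(21)      # the producer side completes the source
print(doubled.result())    # 42
```

Whether this counts as callback-style programming is exactly the point debated in the replies: the callback is hidden inside the helper, but the caller still composes functions rather than writing sequential code.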

On Sat, Dec 21, 2013 at 2:48 PM, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
Hello Guido, Nick,
Thanks a lot for your valuable feedback.
You're welcome.
They don't really compare. concurrent.futures is about *threads*. asyncio.Future is about *avoiding* threads in favor of more lightweight "tasks" and "coroutines", which in turn are built on top of lower-level callbacks. (While I want to get away from callbacks as a programming paradigm, asyncio uses them at the lower levels both because they are a logical low-level building block and for interoperability with other frameworks like Tornado and Twisted.)
I see them as not even competing. They use different paradigms and apply to different use cases.
To me Future is a very simple and self-contained primitive, independent of thread pools, processes, IO, and networking.
(Agreed on the I/O and networking part only.)
Actually the executor API is an important and integral part of that package, and threads underlie everything you can do with its Futures.
The interoperability is completely missing. I can't tell if you've used asyncio at all, but the important operation of *waiting* for a result is fundamentally different there than in concurrent.futures. In the latter, you just write "x = f.result()" and your thread blocks until the result is available. In asyncio, you have to write "x = yield from f", which suspends the current coroutine and lets other tasks run in the same thread. ("yield from" in this case is how Python spells the operation that C# calls "await").
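The difference in waiting can be illustrated with a short sketch. It is an illustration under assumptions, not code from the thread: the function names are hypothetical, and it uses the later `await` keyword where the generator-based coroutines of this discussion used `yield from`.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def blocking_consumer():
    # concurrent.futures: result() blocks the calling thread until done.
    with ThreadPoolExecutor(max_workers=1) as pool:
        f = pool.submit(pow, 2, 10)
        return f.result()

async def coroutine_consumer():
    # asyncio: awaiting suspends only this coroutine; the event loop
    # keeps running other tasks in the same thread meanwhile.
    loop = asyncio.get_running_loop()
    f = loop.create_future()
    loop.call_later(0.01, f.set_result, 1024)
    return await f

print(blocking_consumer())
print(asyncio.run(coroutine_consumer()))
```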
But unless you are proposing some kind of radical change to add compile-time type checking/inference to Python, the rewrite option is unavailable in Python.
so I would not say that futures encourage callback-style programming,
The concurrent.futures.Future class does not. But unless I misread your proposal, your extensions do.
it’s simply a lower-layer functionality. On the contrary, monadic
(Say that word one more time and everyone tunes out. :-)
I'm not sure I follow. Again, I'm not sure if you've actually written any code using asyncio. TBH I've written a fair number of example programs for asyncio and I've very rarely felt the need for these composition functions. The main composition primitive I tend to use is "yield from".
Besides Futures, reactive programming
What *is* reactive programming? If you're talking about http://en.wikipedia.org/wiki/Reactive_programming, I'm not sure that it maps well to Python.
It all sounds very abstract and academic. :-)
Given that Future and Promise are often used synonymously, using them to make this distinction sounds confusing. I agree that Futures have two different APIs, one for the consumer and another for the producer. But I'm not sure that it's necessary to separate them more strictly -- convention seems good enough to me here. (It's the same with many communication primitives, like queues and even threads.) (The one thing that trips people up frequently is that, while set_result() and set_exception() are producer APIs, cancel() is a consumer API, and the cancellation signal travels from the consumer to the producer.)
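The split between producer and consumer calls on a single concurrent.futures.Future can be sketched as below. The `produce` helper is a hypothetical name; the methods it calls are the existing Future API, with set_running_or_notify_cancel() being the call executors use to learn that the consumer cancelled first.

```python
import threading
from concurrent.futures import Future

def produce(future, value):
    # Producer API: claim the future, then complete it.
    if not future.set_running_or_notify_cancel():
        return  # the consumer cancelled before any work began
    future.set_result(value)

# Consumer waits for a result that a producer thread supplies.
accepted = Future()
threading.Thread(target=produce, args=(accepted, "done")).start()
print(accepted.result(timeout=1))

# Consumer API: cancel() sends the signal toward the producer.
rejected = Future()
print(rejected.cancel())
produce(rejected, "ignored")
print(rejected.cancelled())
```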
No problem!
Thanks, Sergii
-- --Guido van Rossum (python.org/~guido)

On Sat, Dec 21, 2013 at 7:26 PM, Guido van Rossum <guido@python.org> wrote:
concurrent.futures.ThreadPoolExecutor is about threads; the Future class itself is broader. When I integrated Futures into Tornado I used concurrent.futures.Future directly (when available). asyncio.Future is just an optimized version of c.f.Future (the optimization comes from assuming single-threaded usage). There should at least be a common ABC between them.
The way I see it, the fundamental operation on Futures is add_done_callback. We then have various higher-level operations that let us get away from using callbacks directly. One of these happens to be a method on Future: the blocking mode of Future.result(). Another is implemented in asyncio and Tornado, in the ability to "yield" a Future. The blocking mode of result() is just a convenience; if asyncio-style futures had existed first then we could instead have a function like "x = concurrent.futures.wait_for(f)". In fact, you could write this wait_for function today in a way that works for both concurrent and asyncio futures. This is already interoperable: Tornado's Resolver interface ( https://github.com/facebook/tornado/blob/master/tornado/netutil.py#L184) returns a Future, which may be generated by a ThreadPoolExecutor or an asynchronous wrapper around pycares or twisted. In the other direction I've worked on hybrid apps that have one Tornado thread alongside a bunch of Django threads; in these apps it would work to have a Django thread block on f.result() for a Future returned by Tornado's AsyncHTTPClient. -Ben
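A sketch of the wait_for idea, built only on add_done_callback, might look like this. It is an assumption-laden illustration: no such function exists in concurrent.futures, and it is demonstrated here only with a thread-side future. An asyncio future would additionally need its event loop running in another thread, since its callbacks are scheduled on the loop.

```python
import threading
from concurrent.futures import Future

def wait_for(future, timeout=None):
    """Block the calling thread until the future completes (hypothetical helper)."""
    done = threading.Event()
    # The callback fires when the future finishes, or at once if it already has.
    future.add_done_callback(lambda _f: done.set())
    if not done.wait(timeout):
        raise TimeoutError("future did not complete in time")
    # The future is finished here, so result() returns or raises immediately.
    return future.result()

f = Future()
threading.Timer(0.05, f.set_result, args=("resolved",)).start()
print(wait_for(f, timeout=1))
```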

There's still the issue that in the threading version, you wait for a Future by blocking the current thread, while in the asyncio version, you must use "yield from" to block. For interoperability you would have to refrain from *any* blocking operations (including "yield from") so you would only be able to use callbacks. But whether you had to write "x = f.result()" or "x = concurrent.futures.wait_for(f)", either way you'd implicitly be blocking the current thread. Yes, a clever scheduler could run other callbacks while blocking, but that's not a complete solution, because another callback might do a similar blocking operation, and whatever that waited for could hold up the earlier blocking operation, causing an ever-deeper recursion of event loop invocations that might never complete. (I've had to debug this in production code.) To cut through that you'd have to have some kind of stack-swapping coroutine implementation like gevent, or a syntactic preprocessor that inserts yield or yield-from operations (I've heard from people who do this), or you'd need a clairvoyant scheduler that would know which callbacks won't block. I like the C# solution, but it depends on static typing so a compiler can know when to emit the coroutine interactions. That wouldn't work in Python, unless you made the compiler recognize the wait_for() operation by name, which feels unsavory (although we do it for super() :-). I guess for extreme interop, callbacks that never block are your only option anyway, but I'd be sad if we had to recommend this as the preferred paradigm, or claim that it is all you need. --Guido (if I don't respond to this thread for the next two weeks, it's because I'm on vacation :-) On Sat, Dec 21, 2013 at 5:37 PM, Ben Darnell <ben@bendarnell.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sun, Dec 22, 2013 at 2:45 PM, Guido van Rossum <guido@python.org> wrote:
To cut through that you'd have to have some kind of stack-swapping coroutine implementation like gevent...
Forgive the stupid question, but how is stack-swapping during blocking calls materially different from threads? ChrisA

On Sat, Dec 21, 2013 at 7:53 PM, Chris Angelico <rosuav@gmail.com> wrote:
It's also known as "green threads". The gevent folks and the Stackless folks (and a few others) do this and claim it is vastly superior to OS threads. I believe the main difference is that an OS thread takes up a relatively large amount of resources in the kernel as well as in user space (for the stack) while a green thread takes up a comparatively much smaller amount of space, all in user space -- with the result that you can have many more green threads than you could have OS threads, and switching between them will be much faster. The price you pay is that the kernel doesn't know what you're doing and you have to intercept all system-call-level I/O to make it non-blocking -- if you accidentally make a blocking syscall, no other green thread will run. This may sound like a pure implementation-level distinction, but implementation is what makes things practical (otherwise we'd all be using Turing machines or lambda calculus :-). -- --Guido van Rossum (python.org/~guido)

On Sun, Dec 22, 2013 at 3:01 PM, Guido van Rossum <guido@python.org> wrote:
Gotcha. I grew up on OS/2 where the threading was lean and mean, so I just used it. I'd be mildly curious to know how different implementations of threads compare, and how many of them actually warrant a "lighter-weight thread" feature like this, but for something that aims to be cross-platform, I can see the value in doing it. ChrisA

Even OS/2 can't do thousands of threads, so if you want to write a server with one thread per client (or two) you'd still need green threads. Meanwhile, Windows, Linux, and OS X all have pretty fast thread startup and decent schedulers, but the need for a static-sized stack still means you can't do thousands even on today's computers--especially in 32 bit land (which is depressingly still common on Windows). Sent from a random iPhone On Dec 21, 2013, at 20:25, Chris Angelico <rosuav@gmail.com> wrote:

On Sat, Dec 21, 2013 at 10:45 PM, Guido van Rossum <guido@python.org> wrote:
Threaded *consumers* of Futures wait for them by blocking, while asynchronous consumers wait for them by yielding. It doesn't matter whether the *producer* of the Future is threaded or asynchronous (except that if you know you won't be using threads you can use a faster thread-unsafe Future implementation). -Ben
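The point that the consumer, not the producer, decides how to wait can be sketched with one thread-pool producer and two kinds of consumer. The function names are hypothetical; asyncio.wrap_future() is the existing bridge that lets a coroutine wait on a concurrent.futures.Future, and `await` stands in for the `yield from` spelling used in this thread.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    # Hypothetical work item executed on a pool thread.
    return n * n

def threaded_consumer(pool):
    # A threaded consumer waits by blocking its own thread.
    return pool.submit(slow_square, 7).result()

async def async_consumer(pool):
    # An asynchronous consumer waits by yielding to the event loop,
    # even though the producer is the same thread pool.
    thread_future = pool.submit(slow_square, 7)
    return await asyncio.wrap_future(thread_future)

with ThreadPoolExecutor(max_workers=2) as pool:
    print(threaded_consumer(pool))
    print(asyncio.run(async_consumer(pool)))
```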

Thanks everyone for your feedback. Taking all your suggestions into account, I have revised my proposal <https://rawgithub.com/mikhtonyuk/rxpython/asyncio/pep-0000.html>.

In short, it now:

- defines separate Future classes for the cooperative and multithreaded cases in the concurrent.futures package
- has the multithreaded implementation add thread-safety on top of the basic implementation, so in the cooperative concurrency case there is absolutely no overhead
- makes the cooperative future’s interface identical to asyncio.Future
- has asyncio.Future inherit from concurrent.futures.cooperative.Future, adding only methods specific to `yield from`
- adds common composition methods for futures (intended to replace and enhance asyncio.wait/gather and concurrent.futures.wait)

There’s still some work to be done for backward compatibility of concurrent.futures.Future, but the implementation is almost ready <https://github.com/mikhtonyuk/rxpython/tree/asyncio>. I would really appreciate it if you could take a look.

Thanks,
Sergii

On Mon, Dec 23, 2013 at 12:42 AM, Guido van Rossum <guido@python.org> wrote:

Hi Sergii,

I'm trying to give some constructive criticism here, please bear with me.

The biggest issue perhaps seems to me that the unification between concurrent and threaded Futures still feels uncomfortable to me. A symptom is the completely different semantics of result() -- when the result isn't ready yet, this either raises an exception or blocks the current thread, and that makes reasoning about what will happen difficult.

A lesser issue is naming -- I read some earlier example code you posted, and I couldn't understand it, because the names for the new operations you added are pretty arbitrary. Especially grating is your reusing some well-known names of built-in Python functions for different purposes, the worst offender being map(), but all() isn't so great either.

In general your FutureBaseExt class (also an awkward name IMO) introduces a bunch of new functions with a wide variety of functionality that seems to have little logic to it. Why this set of functions and not another? A separate question is why the distinction between FutureBase and FutureBaseExt.

It seems you copied some phrases from the asyncio docs or PEP 3156 -- e.g. add_done_callback() references call_soon(); this seems incorrect for threaded Futures.

The definition of an Executor seems incomplete (SynchronousExecutor is referenced but not defined), and very vague -- I don't believe that making it just a callable suffices for the functionality. There is also a mention of global configuration of a default executor by assigning to config.Default.CALLBACK_EXECUTOR, which seems a bad idea -- I'm sure a lot of code will in practice depend on the choice of executor.

Another issue: why the try_* methods?

Finally, I'm not sure I am convinced by your motivation section. Or, at least, I'd like you to address how your proposal addresses each of the bullets in your motivation, with some examples.

(I may have more, but at the current rate it would take me a day per paragraph, so I'll get to more later.)

--Guido

On Tue, Dec 24, 2013 at 8:56 AM, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
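The two result() behaviours mentioned above can be observed with the existing standard-library classes. This sketch assumes that the proposal's cooperative and multithreaded futures behave like asyncio.Future and concurrent.futures.Future respectively, which the revised draft states as its intent; the function names are hypothetical.

```python
import asyncio
from concurrent.futures import Future as ThreadFuture
from concurrent.futures import TimeoutError as FutureTimeout

def threaded_result_when_pending():
    f = ThreadFuture()
    try:
        # Blocks the calling thread; the timeout only bounds the wait.
        f.result(timeout=0.01)
    except FutureTimeout:
        return "blocked, then timed out"

async def asyncio_result_when_pending():
    f = asyncio.get_running_loop().create_future()
    try:
        # Never blocks: raises at once because no result is set yet.
        f.result()
    except asyncio.InvalidStateError:
        return "raised InvalidStateError"

print(threaded_result_when_pending())
print(asyncio.run(asyncio_result_when_pending()))
```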

A few things I'd like to see. Your Future uses a condition variable. That could be a big hit for single threaded uses, and since one of your goals is to make asyncio use the same futures as threaded executors, that might not be acceptable. How hard would it be to allow passing a class (or other factory callable) in place of the default condition when constructing a Promise, or just a "threaded=True" flag. (I realize this is more complicated than it sounds--an event loop can always push callbacks onto a thread pool, or just use a thread pool to implement something that's hard to do in a single thread, like DNS lookup--so you probably also need a way to upgrade a single threaded promise to a thread safe one.) There's a lot of rationale about why separate futures and promises are important, but no rationale for the specific design. If you explained where you deviated from twisted Deferred, JS Promises/A, etc., and why, it would be a lot easier to evaluate your design. It's not documented how chaining multiple callbacks in sequence works. Does the second callback get the return value of the first one, or the original one? Can you return a new future (possibly with callbacks already attached, possibly already completed)? Sent from a random iPhone On Dec 21, 2013, at 5:14, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:

Hi! On Sat, Dec 21, 2013 at 03:14:05PM +0200, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
Please have a look at this PEP draft<https://rawgithub.com/mikhtonyuk/rxpython/master/pep-0000.html>,
It would be nice to have a text version of the PEP posted to the mailing list.
and reference implementation <https://github.com/mikhtonyuk/rxpython> (as separate library).
May I advice to add it to the References section of the PEP? Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On 21 December 2013 23:14, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
This looks like a really interesting idea, and well worth pursuing as a PEP for 3.5 (the failure to make the Future/Promise split was actually noted as a flaw in the concurrent.futures design at the time, but wasn't seen as serious enough to prevent inclusion of the module. However, this draft PEP does a decent job of showing the kinds of operations that the current combined class design can make difficult)
Yes, I think better compatibility is needed. One possible option would be to take the path of defining a "concurrent.futures.abc" module that provides this higher level API, and largely leave the existing concrete classes alone (aside from adjusting them to fit the ABC). The main possible issue I see is with the result setting APIs on the existing Future objects. There are also a lot of staticmethod declarations that look like they should probably be classmethod declarations. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

I'm pretty worried about this proposal. It seems to encourage callback-style programming, whereas I would like to get away from it (that's the main thrust of asyncio/PEP 3156). The proposal reminds me of Twisted Deferred, which is too complicated for my taste. Note that asyncio has a composition operation of its own, gather(), but encourages using standard try/except/else/finally syntax for error handling, rather than adding Twisted-style errbacks. Nevertheless it is possible that I am misunderstanding the proposal (e.g. the distinction between Future and Promise wasn't clear from reading the draft PEP, nor how it would interact with asyncio). I recommend that the author work find a way to improve the English grammar of the proposal, e.g. by working with a native speaker. On Sat, Dec 21, 2013 at 6:12 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

Hello Guido, Nick, Thanks a lot for your valuable feedback. Indeed there is a lot of overlap between *asyncio* and *concurrent.futures*packages, so it would be very interesting to hear your overall thoughts on role/future of *concurrent* package. Do you consider it rudimentary and replaceable by *asyncio.Future* completely? I think the question is even not much about which is your preferred implementation, but rather do we want having futures as stand-alone package or not. Do you see all these implementations converging in future? To me *Future* is a very simple and self-contained primitive, independent of thread pools, processes, IO, and networking. One thing *concurrent.futures* package does a very good job at is defining an isolated namespace for futures, stressing out this clear boundary (let’s disregard here *ThreadPoolExecutor* and *ProcessPoolExecutor* classes which for some reason ended up in it too). So when I think of *schedulers* and *event loops* implementations I see them as ones that build on top of the *Future* primitive, not providing it as part of their implementation. What I think is important is that having unified *Future* class simplifies interoperability between different kinds of schedulers, not necessarily associated with *asyncio* event loops (process pools for example). *Futures* are definitely not the final solution for concurrency problem but rather a well-established utility for representing async operations. It is a job of higher layer systems to provide more convenient ways for dealing with asyncs (*yield from* coroutines, *async/await* rewrites etc.), so I would not say that futures encourage callback-style programming, it’s simply a lower-layer functionality. On the contrary, monadic methods for futures composition (e.g. *map()*, *all()*, *first()* etc.) 
ensure that no errors would be lost in the process, so I think they would complement* yield from* model quite nicely by hiding complexity of state maintenance from user and reducing the number of back-and-forth communications between event loop and coroutines. Besides *Futures*, reactive programming has more utilities to offer, such as *Observables* (representing asynchronous streams of values). It is also a very useful abstraction with a rich set of composition strategies (merging, concatenation, grouping), and may deserve its place in separate package. Hope to hear back from you to get better picture on overall design direction here before jumping to implementation details. Brief follow-up to your questions: - Idea behind the Future/Promise separation is to draw a clean line between client-facing and scheduler-side APIs respectively. Futures should not expose any completion methods, which clients should not call anyway. - Completely agree with you that Twisted-style callbacks are evil and it is better to have single code path for getting result or raising exception - Sorry for bad grammar in the proposal, it’s an early draft written in 3 AM, so I will definitely improve on it if we decide to move forward. Thanks, Sergii On Sat, Dec 21, 2013 at 7:59 PM, Guido van Rossum <guido@python.org> wrote:

On Sat, Dec 21, 2013 at 2:48 PM, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
Hello Guido, Nick,
Thanks a lot for your valuable feedback.
You're welcome.
They don't really compare. concurrent.futures is about *threads*. asyncio.Future is about *avoiding* threads in favor of more lightweight "tasks" and "coroutines", which in turn are built on top of lower-level callbacks. (While I want to get away from callbacks as a programming paradigm, asyncio uses them at the lower levels both because they are a logical low-level building block and for interoperability with other frameworks like Tornado and Twisted.)
I see them as not even competing. They use different paradigms and apply to different use cases.
To me Future is a very simple and self-contained primitive, independent of thread pools, processes, IO, and networking.
(Agreed on the I/O and networking part only.)
Actually the executor API is an important and integral part of that package, and threads underlie everything you can do with its Futures.
The interoperability is completely missing. I can't tell if you've used asyncio at all, but the important operation of *waiting* for a result is fundamentally different there than in concurrent.futures. In the latter, you just write "x = f.result()" and your thread blocks until the result is available. In asyncio, you have to write "x = yield from f.result()" which is a coroutine block that lets other tasks run in the same thread. ("yield from" in this case is how Python spells the operation that C# calls "await").
But unless you are proposing some kind of radical change to add compile-time type checking/inference to Python, the rewrite option is unavailable in Python.
so I would not say that futures encourage callback-style programming,
The concurrent.futures.Future class does not. But unless I misread your proposal, your extensions do.
it’s simply a lower-layer functionality. On the contrary, monadic
(Say that word one more time and everyone tunes out. :-)
I'm not sure I follow. Again, I'm not sure if you've actually written any code using asyncio. TBH I've written a fair number of example programs for asyncio and I've very rarely felt the need for these composition functions. The main composition primitive I tend to use is "yield from".
Besides Futures, reactive programming
What *is* reactive programming? If you're talking about http://en.wikipedia.org/wiki/Reactive_programming, I'm not sure that it maps well to Python.
It all sounds very abstract and academic. :-)
Given that Future and Promise are often used synonymously, using them to make this distinction sounds confusing. I agree that Futures have two different APIs, one for the consumer and another for the producer. But I'm not sure that it's necessary to separate them more strictly -- convention seems good enough to me here. (It's the same with many communication primitives, like queues and even threads.) (The one thing that trips people up frequently is that, while set_result() and set_exception() are producer APIs, cancel() is a consumer API, and the cancellation signal travels from the consumer to the producer.)
No problem!
Thanks, Sergii
-- --Guido van Rossum (python.org/~guido)

On Sat, Dec 21, 2013 at 7:26 PM, Guido van Rossum <guido@python.org> wrote:
concurrent.futures.ThreadPoolExecutor is about threads; the Future class itself is broader. When I integrated Futures into Tornado I used concurrent.futures.Future directly (when available). asyncio.Future is just an optimized version of c.f.Future (the optimization comes from assuming single-threaded usage). There should at least be a common ABC between them.
The way I see it, the fundamental operation on Futures is add_done_callback. We then have various higher-level operations that let us get away from using callbacks directly. One of these happens to be a method on Future: the blocking mode of Future.result(). Another is implemented in asyncio and Tornado, in the ability to "yield" a Future. The blocking mode of result() is just a convenience; if asyncio-style futures had existed first then we could instead have a function like "x = concurrent.futures.wait_for(f)". In fact, you could write this wait_for function today in a way that works for both concurrent and asyncio futures. This is already interoperable: Tornado's Resolver interface ( https://github.com/facebook/tornado/blob/master/tornado/netutil.py#L184) returns a Future, which may be generated by a ThreadPoolExecutor or an asynchronous wrapper around pycares or twisted. In the other direction I've worked on hybrid apps that have one Tornado thread alongside a bunch of Django threads; in these apps it would work to have a Django thread block on f.result() for a Future returned by Tornado's AsyncHTTPClient. -Ben

There's still the issue that in the threading version, you wait for a Future by blocking the current thread, while in the asyncio version, you must use "yield from" to block. For interoperability you would have to refrain from *any* blocking operations (including "yield from") so you would only be able to use callbacks. But whether you had to write "x = f.result()" or "x = concurrent.futures.wait_for(f)", either way you'd implicitly be blocking the current thread. Yes, a clever scheduler could run other callbacks while blocking, but that's not a complete solution, because another callback might do a similar blocking operation, and whatever that waited for could hold up the earlier blocking operation, causing an ever-deeper recursion and of event loop invocations that might never complete. (I've had to debug this in production code.) To cut through that you'd have to have some kind of stack-swapping coroutine implementation like gevent, or a syntactic preprocessor that inserts yield or yield-from operations (I've heard from people who do this), or you'd need a clairvoyant scheduler that would know which callbacks won't block. I like the C# solution, but it depends on static typing so a compiler can know when to emit the coroutine interactions. That wouldn't work in Python, unless you made the compiler recognizing the wait_for() operation by name, which feels unsavory (although we do it for super() :-). I guess for extreme interop, callbacks that never block is your only option anyway, but I'd be sad if we had to to recommend this as the preferred paradigm, or claim that it is all you need. --Guido (if I don't respond to this thread for the next two weeks, it's because I'm on vacation :-) On Sat, Dec 21, 2013 at 5:37 PM, Ben Darnell <ben@bendarnell.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sun, Dec 22, 2013 at 2:45 PM, Guido van Rossum <guido@python.org> wrote:
To cut through that you'd have to have some kind of stack-swapping coroutine implementation like gevent...
Forgive the stupid question, but how is stack-swapping during blocking calls materially different from threads? ChrisA

On Sat, Dec 21, 2013 at 7:53 PM, Chris Angelico <rosuav@gmail.com> wrote:
It's also known as "green threads". The gevent folks and the Stackless folks (and a few others) do this and claim it is vastly superior to OS threads. I believe the main difference is that an OS thread takes up a relatively large amount of resources in the kernel as well as in user space (for the stack) while a green thread takes up a comparatively much smaller amount of space, all in user space -- with the result that you can have many more green threads than you could have OS threads, and switching between them will be much faster. The price you pay is that the kernel doesn't know what you're doing and you have to intercept all system-call-level I/O to make it non-blocking -- if you accidentally make a blocking syscall, no other green thread will run. This may sound like a pure implementation-level distinction, but implementation is what makes things practical (otherwise we'd all be using Turing machines or lambda calculus :-). -- --Guido van Rossum (python.org/~guido)
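As a rough illustration of user-space switching, here is a toy round-robin scheduler over generators. It is not how gevent or Stackless work (they swap real stacks and intercept I/O); the names worker and run_round_robin are made up for this sketch. It only shows that the "threads" are ordinary objects switched by user code, and that a task which never reaches a switch point (for example because it makes a blocking syscall) would stall every other task.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # explicit switch point; nothing else runs until we get here


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # resume the task until its next switch point
        except StopIteration:
            continue  # task finished; do not reschedule it
        ready.append(task)


def demo():
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 2, log)])
    return log
```

The log returned by demo() interleaves the two workers, one step at a time.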

On Sun, Dec 22, 2013 at 3:01 PM, Guido van Rossum <guido@python.org> wrote:
Gotcha. I grew up on OS/2 where the threading was lean and mean, so I just used it. I'd be mildly curious to know how different implementations of threads compare, and how many of them actually warrant a "lighter-weight thread" feature like this, but for something that aims to be cross-platform, I can see the value in doing it. ChrisA

Even OS/2 can't do thousands of threads, so if you want to write a server with one thread per client (or two) you'd still need green threads. Meanwhile, Windows, Linux, and OS X all have pretty fast thread startup and decent schedulers, but the need for a static-sized stack still means you can't do thousands even on today's computers--especially in 32 bit land (which is depressingly still common on Windows). Sent from a random iPhone On Dec 21, 2013, at 20:25, Chris Angelico <rosuav@gmail.com> wrote:

On Sat, Dec 21, 2013 at 10:45 PM, Guido van Rossum <guido@python.org> wrote:
Threaded *consumers* of Futures wait for them by blocking, while asynchronous consumers wait for them by yielding. It doesn't matter whether the *producer* of the Future is threaded or asynchronous (except that if you know you won't be using threads you can use a faster thread-unsafe Future implementation). -Ben
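A short sketch of that distinction, using one threaded producer and two kinds of consumer. It assumes a Python version with async/await and asyncio.run (at the time of this thread the asynchronous consumer would have used "yield from" instead); the function names are invented for the example. asyncio.wrap_future is the standard-library bridge from a concurrent.futures.Future to an awaitable.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def produce():
    """Work done by the threaded producer."""
    return 21 * 2


def threaded_consumer(executor):
    future = executor.submit(produce)
    # A threaded consumer waits by blocking its own thread.
    return future.result()


async def async_consumer(executor):
    future = executor.submit(produce)
    # An asynchronous consumer waits by yielding to the event loop.
    return await asyncio.wrap_future(future)


def demo():
    with ThreadPoolExecutor(max_workers=1) as executor:
        blocking_value = threaded_consumer(executor)
        yielded_value = asyncio.run(async_consumer(executor))
    return blocking_value, yielded_value
```

Both consumers see the same result from the same kind of producer; only the way they wait differs.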

Thanks everyone for your feedback. Taking all your suggestions into account, I have revised my proposal <https://rawgithub.com/mikhtonyuk/rxpython/asyncio/pep-0000.html>. In short, it now:
- defines separate Future classes for the cooperative and multithreaded cases in the concurrent.futures package
- has the multithreaded implementation add thread-safety on top of the basic implementation, so in the cooperative concurrency case there is absolutely no overhead
- makes the cooperative future’s interface identical to asyncio.Future
- has asyncio.Future inherit from concurrent.futures.cooperative.Future, adding only methods specific to `yield from`
- adds common composition methods for futures (intended to replace and enhance asyncio.wait/gather and concurrent.futures.wait)
There’s still some work to be done for backward compatibility of concurrent.futures.Future, but the implementation is almost ready <https://github.com/mikhtonyuk/rxpython/tree/asyncio>. I would really appreciate it if you could take a look. Thanks, Sergii On Mon, Dec 23, 2013 at 12:42 AM, Guido van Rossum <guido@python.org> wrote:
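To make the layering concrete, here is a minimal sketch of a lock-free base future with a thread-safe subclass on top. This is not the rxpython implementation; the class names CooperativeFuture and ThreadSafeFuture and all method bodies are assumptions for illustration. Note that result() behaves differently in the two classes: the base class raises when the value is not ready, while the subclass blocks.

```python
import threading


class CooperativeFuture:
    """Hypothetical lock-free future for single-threaded (cooperative) use."""

    def __init__(self):
        self._done = False
        self._result = None
        self._callbacks = []

    def done(self):
        return self._done

    def add_done_callback(self, fn):
        if self._done:
            fn(self)
        else:
            self._callbacks.append(fn)

    def set_result(self, value):
        if self._done:
            raise RuntimeError("result already set")
        self._result, self._done = value, True
        callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn(self)

    def result(self):
        # Never blocks: there is no other thread that could complete us.
        if not self._done:
            raise RuntimeError("result is not ready")
        return self._result


class ThreadSafeFuture(CooperativeFuture):
    """Hypothetical subclass that adds locking and a blocking result()."""

    def __init__(self):
        super().__init__()
        self._condition = threading.Condition()  # wraps an RLock by default

    def add_done_callback(self, fn):
        with self._condition:
            super().add_done_callback(fn)

    def set_result(self, value):
        # Simplification: callbacks run while the lock is held.
        with self._condition:
            super().set_result(value)
            self._condition.notify_all()

    def result(self, timeout=None):
        with self._condition:
            if not self._condition.wait_for(self.done, timeout):
                raise TimeoutError("result is not ready")
            return super().result()
```

In this arrangement the cooperative class pays no synchronization cost, and only code that opts into ThreadSafeFuture touches the condition variable.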

Hi Sergii, I'm trying to give some constructive criticism here, please bear with me. Perhaps the biggest issue is that the unification between concurrent and threaded Futures still feels uncomfortable to me. A symptom is the completely different semantics of result() -- when the result isn't ready yet, this either raises an exception or blocks the current thread, and that makes reasoning about what will happen difficult. A lesser issue is naming -- I read some earlier example code you posted, and I couldn't understand it, because the names for the new operations you added are pretty arbitrary. Especially grating is your reuse of some well-known names of built-in Python functions for different purposes, the worst offender being map(), but all() isn't so great either. In general your FutureBaseExt class (also an awkward name IMO) introduces a bunch of new functions with a wide variety of functionality that seems to have little logic to it. Why this set of functions and not another? A separate question is why there is a distinction between FutureBase and FutureBaseExt. It seems you copied some phrases from the asyncio docs or PEP 3156 -- e.g. add_done_callback() references call_soon(); this seems incorrect for threaded Futures. The definition of an Executor seems incomplete (SynchronousExecutor is referenced but not defined), and very vague -- I don't believe that making it just a callable suffices for the functionality. There is also a mention of global configuration of a default executor by assigning to config.Default.CALLBACK_EXECUTOR, which seems a bad idea -- I'm sure a lot of code will in practice depend on the choice of executor. Another issue: why the try_* methods? Finally, I'm not sure I am convinced by your motivation section. Or, at least, I'd like you to show how your proposal addresses each of the bullets in your motivation, with some examples. (I may have more, but at the current rate it would take me a day per paragraph, so I'll get to more later.)
--Guido On Tue, Dec 24, 2013 at 8:56 AM, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

A few things I'd like to see. Your Future uses a condition variable. That could be a big hit for single threaded uses, and since one of your goals is to make asyncio use the same futures as threaded executors, that might not be acceptable. How hard would it be to allow passing a class (or other factory callable) in place of the default condition when constructing a Promise, or just a "threaded=True" flag? (I realize this is more complicated than it sounds--an event loop can always push callbacks onto a thread pool, or just use a thread pool to implement something that's hard to do in a single thread, like DNS lookup--so you probably also need a way to upgrade a single threaded promise to a thread safe one.) There's a lot of rationale about why separate futures and promises are important, but no rationale for the specific design. If you explained where you deviated from Twisted's Deferred, JS Promises/A, etc., and why, it would be a lot easier to evaluate your design. It's not documented how chaining multiple callbacks in sequence works. Does the second callback get the return value of the first one, or the original one? Can you return a new future (possibly with callbacks already attached, possibly already completed)? Sent from a random iPhone On Dec 21, 2013, at 5:14, Sergii Mikhtoniuk <mikhtonyuk@gmail.com> wrote:
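For comparison, here is how the chaining question plays out with the existing concurrent.futures.Future, plus one possible chaining helper. With the standard library, every done-callback receives the original future and its return value is ignored. The helper then() below is hypothetical (it is not part of the standard library and is not taken from the draft PEP); it sketches the alternative where each step receives the previous step's return value.

```python
from concurrent.futures import Future


def then(future, fn):
    """Hypothetical helper: return a new Future holding fn(result of `future`)."""
    chained = Future()

    def _propagate(done):
        try:
            chained.set_result(fn(done.result()))
        except Exception as exc:
            # Failures in the source future or in fn flow down the chain.
            chained.set_exception(exc)

    future.add_done_callback(_propagate)
    return chained


def demo():
    source = Future()
    seen = []
    # Standard behaviour: both callbacks get the same original future,
    # and whatever the first callback returns is discarded.
    source.add_done_callback(lambda f: seen.append(f.result()) or "ignored")
    source.add_done_callback(lambda f: seen.append(f.result()))
    # Chained behaviour: the second step sees the first step's return value.
    chained = then(then(source, lambda x: x * 2), lambda x: x + 1)
    source.set_result(10)
    return seen, chained.result()
```

Whether a step may itself return a future that gets flattened into the chain is a separate design choice that this sketch does not address.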
participants (8)
- Andrew Barnert
- Antoine Pitrou
- Ben Darnell
- Chris Angelico
- Guido van Rossum
- Nick Coghlan
- Oleg Broytman
- Sergii Mikhtoniuk