[Python-ideas] PEP draft - Composable futures for reactive programming

Sergii Mikhtoniuk mikhtonyuk at gmail.com
Tue Dec 24 17:56:23 CET 2013


Thanks everyone for your feedback.

Taking all your suggestions into account, I have revised my proposal:
<https://rawgithub.com/mikhtonyuk/rxpython/asyncio/pep-0000.html>

In short, the revised proposal:
- defines separate Future classes for the cooperative and multithreaded
cases in the concurrent.futures package
- has the multithreaded implementation add thread safety on top of the basic
implementation, so the cooperative case pays no thread-safety overhead
- keeps the cooperative future’s interface identical to asyncio.Future
- has asyncio.Future inherit from concurrent.futures.cooperative.Future,
adding only the methods specific to `yield from`
- adds common composition methods for futures (intended to replace and
enhance asyncio.wait/gather and concurrent.futures.wait)
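
A minimal sketch of the split summarized above, not the code from the
linked repository. The class name ThreadSafeFuture, the map() signature,
and all internals are assumptions made for illustration: the base class
takes no locks, and the subclass adds a condition variable plus a blocking
result().

```python
import threading


class Future:
    """Cooperative future: single-threaded, so no locks are taken."""

    def __init__(self):
        self._done = False
        self._value = None
        self._exception = None
        self._callbacks = []

    def done(self):
        return self._done

    def result(self):
        if not self._done:
            raise RuntimeError("result is not ready")
        if self._exception is not None:
            raise self._exception
        return self._value

    def add_done_callback(self, fn):
        if self._done:
            fn(self)
        else:
            self._callbacks.append(fn)

    def set_result(self, value):
        self._complete(value, None)

    def set_exception(self, exc):
        self._complete(None, exc)

    def _complete(self, value, exc):
        if self._done:
            raise RuntimeError("future already completed")
        self._value, self._exception, self._done = value, exc, True
        callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn(self)

    def map(self, fn):
        """Return a new future holding fn(result); errors propagate."""
        out = type(self)()

        def on_done(src):
            try:
                value = fn(src.result())
            except Exception as exc:
                out.set_exception(exc)
            else:
                out.set_result(value)

        self.add_done_callback(on_done)
        return out


class ThreadSafeFuture(Future):
    """Hypothetical multithreaded variant: same interface plus locking."""

    def __init__(self):
        super().__init__()
        self._cond = threading.Condition()

    def add_done_callback(self, fn):
        with self._cond:
            if not self._done:
                self._callbacks.append(fn)
                return
        fn(self)  # already completed: run the callback outside the lock

    def _complete(self, value, exc):
        with self._cond:
            if self._done:
                raise RuntimeError("future already completed")
            self._value, self._exception, self._done = value, exc, True
            callbacks, self._callbacks = self._callbacks, []
            self._cond.notify_all()
        for fn in callbacks:
            fn(self)

    def result(self, timeout=None):
        # Block the calling thread until completion or timeout.
        with self._cond:
            if not self._cond.wait_for(lambda: self._done, timeout):
                raise TimeoutError("future did not complete in time")
        return super().result()
```

Because map() only relies on add_done_callback(), the same composition code
serves both variants; only the subclass pays for synchronization.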

There’s still some work to be done on backward compatibility of
concurrent.futures.Future, but the implementation is almost ready:
<https://github.com/mikhtonyuk/rxpython/tree/asyncio>

I would really appreciate it if you could take a look.


Thanks,
Sergii



On Mon, Dec 23, 2013 at 12:42 AM, Guido van Rossum <guido at python.org> wrote:

> Aha. That is clever. I will have to look into the details more, but the
> idea is promising. Sorry I didn't see this before.
> On Dec 22, 2013 12:30 PM, "Ben Darnell" <ben at bendarnell.com> wrote:
>
>> On Sat, Dec 21, 2013 at 10:45 PM, Guido van Rossum <guido at python.org> wrote:
>>
>>> There's still the issue that in the threading version, you wait for a
>>> Future by blocking the current thread, while in the asyncio version,
>>> you must use "yield from" to block. For interoperability you would
>>> have to refrain from *any* blocking operations (including "yield
>>> from") so you would only be able to use callbacks. But whether you had
>>> to write "x = f.result()" or "x = concurrent.futures.wait_for(f)",
>>> either way you'd implicitly be blocking the current thread.
>>>
>>
>> Threaded *consumers* of Futures wait for them by blocking, while
>> asynchronous consumers wait for them by yielding.  It doesn't matter
>> whether the *producer* of the Future is threaded or asynchronous (except
>> that if you know you won't be using threads you can use a faster
>> thread-unsafe Future implementation).
>>
>> -Ben
>>
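
Ben's producer/consumer point above can be sketched with today's standard
library. This is only an illustration: slow_square, async_consumer, and demo
are made-up names, and `await` is the current spelling of the `yield from`
discussed in this thread. One Future produced by a thread pool is consumed
first by blocking and then from a coroutine through asyncio.wrap_future().

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    # Stand-in for work done by a threaded producer.
    return x * x


async def async_consumer(fut):
    # wrap_future() adapts a concurrent.futures.Future so that a
    # coroutine can wait for it without blocking the event loop thread.
    return await asyncio.wrap_future(fut)


def demo():
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Threaded consumer: block the current thread on result().
        blocking_value = pool.submit(slow_square, 6).result()
        # Asynchronous consumer: same kind of future, awaited instead.
        fut = pool.submit(slow_square, 7)
        awaited_value = asyncio.run(async_consumer(fut))
    return blocking_value, awaited_value
```

The producer code is identical in both cases; only the consumer decides
whether to block or to yield.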
>>
>>>
>>> Yes, a clever scheduler could run other callbacks while blocking, but
>>> that's not a complete solution, because another callback might do a
>>> similar blocking operation, and whatever that waited for could hold up
>>> the earlier blocking operation, causing an ever-deeper recursion
>>> of event loop invocations that might never complete. (I've had to
>>> debug this in production code.) To cut through that you'd have to have
>>> some kind of stack-swapping coroutine implementation like gevent, or a
>>> syntactic preprocessor that inserts yield or yield-from operations
>>> (I've heard from people who do this), or you'd need a clairvoyant
>>> scheduler that would know which callbacks won't block.
>>>
>>> I like the C# solution, but it depends on static typing so a compiler
>>> can know when to emit the coroutine interactions. That wouldn't work
>>> in Python, unless you made the compiler recognize the wait_for()
>>> operation by name, which feels unsavory (although we do it for super()
>>> :-).
>>>
>>> I guess for extreme interop, callbacks that never block is your only
>>> option anyway, but I'd be sad if we had to recommend this as the
>>> preferred paradigm, or claim that it is all you need.
>>>
>>> --Guido (if I don't respond to this thread for the next two weeks,
>>> it's because I'm on vacation :-)
>>>
>>>
>>> On Sat, Dec 21, 2013 at 5:37 PM, Ben Darnell <ben at bendarnell.com> wrote:
>>> > On Sat, Dec 21, 2013 at 7:26 PM, Guido van Rossum <guido at python.org> wrote:
>>> >>
>>> >> On Sat, Dec 21, 2013 at 2:48 PM, Sergii Mikhtoniuk <mikhtonyuk at gmail.com> wrote:
>>> >> > Indeed there is a lot of overlap between the asyncio and
>>> >> > concurrent.futures packages, so it would be very interesting to hear
>>> >> > your overall thoughts on the role and future of the concurrent
>>> >> > package. Do you consider it rudimentary and completely replaceable
>>> >> > by asyncio.Future?
>>> >>
>>> >> They don't really compare. concurrent.futures is about *threads*.
>>> >> asyncio.Future is about *avoiding* threads in favor of more
>>> >> lightweight "tasks" and "coroutines", which in turn are built on top
>>> >> of lower-level callbacks.
>>> >
>>> >
>>> > concurrent.futures.ThreadPoolExecutor is about threads; the Future class
>>> > itself is broader.  When I integrated Futures into Tornado I used
>>> > concurrent.futures.Future directly (when available).  asyncio.Future is
>>> > just an optimized version of c.f.Future (the optimization comes from
>>> > assuming single-threaded usage).  There should at least be a common ABC
>>> > between them.
>>> >
>>> >>
>>> >>
>>> >> (While I want to get away from callbacks as a programming paradigm,
>>> >> asyncio uses them at the lower levels both because they are a logical
>>> >> low-level building block and for interoperability with other
>>> >> frameworks like Tornado and Twisted.)
>>> >>
>>> >> > I think the question is not so much which implementation you
>>> >> > prefer, but rather whether we want futures as a stand-alone
>>> >> > package or not. Do you see all these implementations converging in
>>> >> > the future?
>>> >>
>>> >> I see them as not even competing. They use different paradigms and
>>> >> apply to different use cases.
>>> >>
>>> >> > To me Future is a very simple and self-contained primitive,
>>> >> > independent of thread pools, processes, IO, and networking.
>>> >>
>>> >> (Agreed on the I/O and networking part only.)
>>> >>
>>> >> > One thing the concurrent.futures package does a very good job at is
>>> >> > defining an isolated namespace for futures, stressing this clear
>>> >> > boundary (let’s disregard here the ThreadPoolExecutor and
>>> >> > ProcessPoolExecutor classes, which for some reason ended up in it
>>> >> > too).
>>> >>
>>> >> Actually the executor API is an important and integral part of that
>>> >> package, and threads underlie everything you can do with its Futures.
>>> >>
>>> >> > So when I think of scheduler and event loop implementations, I see
>>> >> > them as building on top of the Future primitive, not providing it as
>>> >> > part of their implementation. What I think is important is that
>>> >> > having a unified Future class simplifies interoperability between
>>> >> > different kinds of schedulers, not necessarily associated with
>>> >> > asyncio event loops (process pools for example).
>>> >>
>>> >> The interoperability is completely missing. I can't tell if you've
>>> >> used asyncio at all, but the important operation of *waiting* for a
>>> >> result is fundamentally different there than in concurrent.futures. In
>>> >> the latter, you just write "x = f.result()" and your thread blocks
>>> >> until the result is available. In asyncio, you have to write "x =
>>> >> yield from f" which is a coroutine block that lets other
>>> >> tasks run in the same thread. ("yield from" in this case is how Python
>>> >> spells the operation that C# calls "await").
>>> >
>>> >
>>> > The way I see it, the fundamental operation on Futures is
>>> > add_done_callback.  We then have various higher-level operations that
>>> > let us get away from using callbacks directly.  One of these happens to
>>> > be a method on Future: the blocking mode of Future.result().  Another is
>>> > implemented in asyncio and Tornado, in the ability to "yield" a Future.
>>> > The blocking mode of result() is just a convenience; if asyncio-style
>>> > futures had existed first then we could instead have a function like
>>> > "x = concurrent.futures.wait_for(f)".  In fact, you could write this
>>> > wait_for function today in a way that works for both concurrent and
>>> > asyncio futures.
>>> >
>>> > This is already interoperable:  Tornado's Resolver interface
>>> > (https://github.com/facebook/tornado/blob/master/tornado/netutil.py#L184)
>>> > returns a Future, which may be generated by a ThreadPoolExecutor or an
>>> > asynchronous wrapper around pycares or twisted.  In the other direction
>>> > I've worked on hybrid apps that have one Tornado thread alongside a
>>> > bunch of Django threads; in these apps it would work to have a Django
>>> > thread block on f.result() for a Future returned by Tornado's
>>> > AsyncHTTPClient.
>>> >
>>> > -Ben
>>> >
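
The wait_for() function Ben describes above does not exist in
concurrent.futures; the following is a hypothetical sketch of how it could
be built on add_done_callback() alone. It is exercised here only with
concurrent.futures.Future. An asyncio future is not thread-safe, so for one
of those the callback would have to be registered on the event loop thread
and the wait would have to happen in a different thread, which this sketch
does not handle.

```python
import threading
from concurrent.futures import Future


def wait_for(fut, timeout=None):
    """Block the calling thread until fut completes, then return its result.

    Hypothetical helper: uses only add_done_callback() and result().
    """
    done = threading.Event()
    # The callback runs immediately if fut is already complete.
    fut.add_done_callback(lambda _f: done.set())
    if not done.wait(timeout):
        raise TimeoutError("future did not complete in time")
    return fut.result()


# Usage: a timer thread acts as the producer.
f = Future()
threading.Timer(0.01, f.set_result, args=(42,)).start()
value = wait_for(f, timeout=5)
```

The blocking behaviour lives entirely in the helper, which is the sense in
which the blocking mode of result() is a convenience and not a requirement.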
>>> >>
>>> >> > Futures are definitely not the final solution to the concurrency
>>> >> > problem, but rather a well-established utility for representing
>>> >> > async operations. It is the job of higher-layer systems to provide
>>> >> > more convenient ways of dealing with asyncs (yield from coroutines,
>>> >> > async/await rewrites etc.),
>>> >>
>>> >> But unless you are proposing some kind of radical change to add
>>> >> compile-time type checking/inference to Python, the rewrite option is
>>> >> unavailable in Python.
>>> >>
>>> >> > so I would not
>>> >> > say that futures encourage callback-style programming,
>>> >>
>>> >> The concurrent.futures.Future class does not. But unless I misread
>>> >> your proposal, your extensions do.
>>> >>
>>> >> > it’s simply a
>>> >> > lower-layer functionality. On the contrary, monadic
>>> >>
>>> >> (Say that word one more time and everyone tunes out. :-)
>>> >>
>>> >> > methods for composing futures
>>> >> > (e.g. map(), all(), first() etc.) ensure that no errors would be
>>> >> > lost in the process, so I think they would complement the yield from
>>> >> > model quite nicely by hiding the complexity of state maintenance
>>> >> > from the user and reducing the number of back-and-forth
>>> >> > communications between the event loop and coroutines.
>>> >>
>>> >> I'm not sure I follow. Again, I'm not sure if you've actually written
>>> >> any code using asyncio.
>>> >>
>>> >> TBH I've written a fair number of example programs for asyncio and
>>> >> I've very rarely felt the need for these composition functions. The
>>> >> main composition primitive I tend to use is "yield from".
>>> >>
>>> >> > Besides Futures, reactive programming
>>> >>
>>> >> What *is* reactive programming? If you're talking about
>>> >> http://en.wikipedia.org/wiki/Reactive_programming,
>>> >> I'm not sure that it maps well to Python.
>>> >>
>>> >> > has more utilities to offer, such as
>>> >> > Observables (representing asynchronous streams of values). It is
>>> >> > also a very useful abstraction with a rich set of composition
>>> >> > strategies (merging, concatenation, grouping), and may deserve its
>>> >> > place in a separate package.
>>> >>
>>> >> It all sounds very abstract and academic. :-)
>>> >>
>>> >> > Hope to hear back from you to get a better picture of the overall
>>> >> > design direction here before jumping into implementation details.
>>> >> >
>>> >> > Brief follow-up to your questions:
>>> >> >  - The idea behind the Future/Promise separation is to draw a clean
>>> >> > line between client-facing and scheduler-side APIs respectively.
>>> >> > Futures should not expose any completion methods, which clients
>>> >> > should not call anyway.
>>> >>
>>> >> Given that Future and Promise are often used synonymously, using them
>>> >> to make this distinction sounds confusing. I agree that Futures have
>>> >> two different APIs, one for the consumer and another for the producer.
>>> >> But I'm not sure that it's necessary to separate them more strictly --
>>> >> convention seems good enough to me here. (It's the same with many
>>> >> communication primitives, like queues and even threads.)
>>> >>
>>> >> (The one thing that trips people up frequently is that, while
>>> >> set_result() and set_exception() are producer APIs, cancel() is a
>>> >> consumer API, and the cancellation signal travels from the consumer to
>>> >> the producer.)
>>> >>
>>> >> >  - Completely agree with you that Twisted-style callbacks are evil
>>> >> > and it is better to have a single code path for getting the result
>>> >> > or raising an exception
>>> >> >  - Sorry for the bad grammar in the proposal; it’s an early draft
>>> >> > written at 3 AM, so I will definitely improve on it if we decide to
>>> >> > move forward.
>>> >>
>>> >> No problem!
>>> >>
>>> >> > Thanks,
>>> >> > Sergii
>>> >>
>>> >> --
>>> >> --Guido van Rossum (python.org/~guido)
>>> >> _______________________________________________
>>> >> Python-ideas mailing list
>>> >> Python-ideas at python.org
>>> >> https://mail.python.org/mailman/listinfo/python-ideas
>>> >> Code of Conduct: http://python.org/psf/codeofconduct/
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>>
>>
>>