[Python-ideas] Auto-wrapping coroutines into Tasks

Fri May 4 21:16:03 EDT 2018

On Fri, May 4, 2018 at 2:58 PM, Guido van Rossum <guido at python.org> wrote:
> First, "start executing immediately" is an overstatement, right? They won't
> run until the caller executes a (possibly unrelated) `await`.

Well, traditional Future-returning functions often do execute some
logic immediately, but right, what I meant was something like "starts
executing without further intervention". I'm sure you know what I
mean, but here's a concrete example to make sure it's clear to
everyone else. Say we write this code:

  async_log("it happened!")

If async_log is a traditional Future-returning function, then this
line is sufficient to cause the message to be logged (eventually). If
async_log is an async coroutine-returning function, then it's a no-op
(except for generating a "coroutine was never awaited" warning). With
this proposal, it would always work.
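
To make the difference concrete, here is a minimal sketch in current asyncio. The names `_write`, `future_style_log`, and `coroutine_style_log` are hypothetical stand-ins for the two flavors of async_log, not real APIs:

```python
import asyncio

logged = []

async def _write(msg):
    # Hypothetical stand-in for the real logging I/O.
    await asyncio.sleep(0)
    logged.append(msg)

def future_style_log(msg):
    # Traditional style: schedule the work right away and return a Future (a Task).
    return asyncio.ensure_future(_write(msg))

async def coroutine_style_log(msg):
    # async def style: calling this only creates a coroutine object.
    await _write(msg)

async def main():
    future_style_log("future-style")               # bare call: work is scheduled
    coro = coroutine_style_log("coroutine-style")  # bare call: nothing runs
    await asyncio.sleep(0.01)                      # let the event loop run pending tasks
    coro.close()  # closed explicitly only to silence the "never awaited" warning
    return list(logged)

result = asyncio.run(main())
print(result)
```

Only the Future-style message should end up in `logged`; the coroutine-style call never runs its body.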

> And I'm still
> unclear why anyone would care, *except* in the case where they've somehow
> learned by observation that "real" coroutines don't start immediately and
> build a dependency on this in their code. (The happy eyeballs use case that
> was brought up here earlier today seems like it would be better off not
> depending on this either way, and it wouldn't be hard to do this either.)
>
> Second, when adding callbacks (if you *have to* -- if you're not a framework
> author you're likely doing something wrong if you find yourself adding
> callbacks), the right thing to do is obviously to *always* call
> ensure_future() first.

Async/await often lets you avoid working with the Future API directly,
but Futures are still a major part of asyncio's public API, and so are
synchronous-flavored systems like protocols/transports, where you
can't use 'await'. I've been told that we need to keep that in mind
when thinking about asyncio extensions ;-). And if the right thing to
do is to *always* call a function, that's a good argument that the
library should call it for you, right? :-)

In practice I think cases like my 'async_log' example are the main
place where people are likely to run into this – there are a lot of
functions out there where a bare call works to run something in the
background, and a lot where it doesn't. (In particular, all existing
Tornado and Twisted APIs are Future-returning, not async.)

> Third, hooks like this feel like a great way to create an even bigger mess
> -- it implicitly teaches users that all coroutines are Futures, which will
> then cause disappointments when they find themselves in an environment where
> the hook is not enabled.

Switching between async libraries is always going to be pretty
messy. So I guess the only case people are likely to actually
encounter an unexpected hook configuration is in the period before
they enter asyncio (or whatever library they're using). Like, if
you've learned that async functions always return Futures, you might
expect this to work:

  fut = some_async_fun()
  # Error, 'fut' is actually a coroutine b/c the hook isn't set up yet
  fut.add_done_callback(...)
  asyncio.run(fut)

That's a bit of a wart. But this is something that basically never
worked and can't work, and very few people are likely to run into, so
while it's sad that it's a wart I don't think it's an argument against
fixing the other 99% of cases? (And of course this doesn't arise for
libraries like Trio, where you just never call async functions outside
of async context.)
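
For reference, here is roughly how that failure looks today, next to the version that already works: wrapping the coroutine once the loop is running. This is only a sketch; `some_async_fun` is a hypothetical placeholder and `try_callback_before_loop` is an illustrative helper:

```python
import asyncio

async def some_async_fun():
    # Hypothetical async function used only for illustration.
    return 42

def try_callback_before_loop():
    coro = some_async_fun()  # a coroutine object, not a Future
    try:
        coro.add_done_callback(print)  # coroutine objects have no such method
    except AttributeError as exc:
        return type(exc).__name__
    finally:
        coro.close()  # avoid a "never awaited" warning
    return None

async def main():
    seen = []
    # Inside the running loop, ensure_future() wraps the coroutine in a Task.
    fut = asyncio.ensure_future(some_async_fun())
    fut.add_done_callback(lambda f: seen.append(f.result()))
    await fut
    await asyncio.sleep(0)  # give any still-pending callbacks a chance to run
    return seen

error_name = try_callback_before_loop()
seen = asyncio.run(main())
print(error_name, seen)
```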

> Perhaps we should go the other way and wrap most ways of creating Futures in
> coroutines? (Though there would have to be a way for ensure_future() to
> *unwrap* it instead of wrapping it in a second Future.)

So there's a few reasons I didn't suggest going this direction:

- Just in practical terms, I don't know how we could make this change.
There's one place that all coroutines are created, so we at least have
the technical ability to change their behavior all at once. OTOH
Future-returning functions are just regular functions that happen to
return a Future, so we'd have to go fix them one at a time, right?

- For regular asyncio users, the Future API is pretty much a superset
of the coroutine API. (The only thing you can do with a coroutine is
await it or call ensure_future, and Futures allow both of those.) That
means that turning coroutines into Futures is mostly backwards
compatible, but turning Futures into coroutines isn't.

- Similarly, having coroutine-returning functions start running
without further intervention is *mostly* backwards compatible, because
it's very unusual to intentionally create a coroutine object and then
never actually run it (via await or ensure_future or whatever). But I
suspect it is fairly common to call Future-returning functions and
never await them, like in the async_log example above. This is why we
have the weird "category 3" in the first place: people would like to
refactor Future-returning APIs to take advantage of async/await, but
right now that's a compatibility-breaking change.

- Exposing raw coroutine objects to users has led to various gross-ish
hacks, like the hoops that asyncio debug mode has to jump through to
try to give better warnings about missing 'await'. Eliminating raw
coroutine objects from public APIs would remove the need for these
hacks. Making coroutine objects more prominent would have the opposite
effect :-).

- And also of course it wouldn't have the benefits for Trio (better
error messages for forgetting an 'await', ability to transition a
function between sync and async with a deprecation period), for
whatever that's worth.
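
As a rough per-function approximation of the behavior discussed above (not the proposed hook itself, which would apply to every coroutine at the point of creation), a decorator can wrap each call in a Task. `auto_task` is a hypothetical helper, not part of asyncio, and it assumes it is only ever called while an event loop is running:

```python
import asyncio
import functools

def auto_task(async_fn):
    # Hypothetical helper, not part of asyncio: every call returns a Task.
    # Assumption: callers are already inside a running event loop.
    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        return asyncio.ensure_future(async_fn(*args, **kwargs))
    return wrapper

events = []

@auto_task
async def async_log(msg):
    await asyncio.sleep(0)
    events.append(msg)

async def main():
    async_log("fire and forget")   # bare call still runs, as with a Future-returning API
    await async_log("awaited")     # awaiting the returned Task works too
    fut = async_log("with callback")
    fut.add_done_callback(lambda f: events.append("callback ran"))
    await fut
    await asyncio.sleep(0)         # give any still-pending callbacks a chance to run
    return list(events)

events_seen = asyncio.run(main())
print(events_seen)
```

With a wrapper like this, existing callers that use a bare call, 'await', or add_done_callback() all keep working after a function is rewritten with async/await, which is the compatibility property the third bullet is about. It does nothing for calls made before the loop starts, so the wart mentioned earlier still applies.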

-n

-- 
Nathaniel J. Smith -- https://vorpus.org