[Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio

Nathaniel Smith njs at pobox.com
Tue Mar 26 23:33:39 EDT 2019


On Mon, Mar 25, 2019 at 4:37 PM Guido van Rossum <guido at python.org> wrote:
>
> I also hope Nathaniel has something to say -- I wonder if trio supports nested event loops?

Trio does have a similar check to prevent starting a new Trio loop
inside a running Trio loop, and there's currently no way to disable
it: https://github.com/python-trio/trio/blob/444234392c064c0ec5e66b986a693e2e9f76bc58/trio/_core/_run.py#L1398-L1402

Like the comment says, I could imagine changing this if there's a good reason.
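
Concretely, something like this trips that check and raises
RuntimeError rather than quietly starting a second loop (a rough
sketch; it just assumes Trio is installed):

import trio

async def main():
    try:
        # Attempt to start a second Trio loop inside the running one.
        trio.run(trio.sleep, 0)
    except RuntimeError as exc:
        print("nested run() refused:", exc)

trio.run(main)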

On Tue, Mar 26, 2019 at 11:56 AM Yury Selivanov <yselivanov at gmail.com> wrote:
> I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set.  Which will inevitably lead us to a situation where it's impossible to use asyncio without the flag.  Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.

It is weird that if you have a synchronous public interface, then it
acts differently depending on whether you happened to implement that
interface using the socket module directly vs using asyncio.

If you want to "hide" that your synchronous API uses asyncio
internally, then you can actually do that now using
public/quasi-public APIs:

import asyncio

def asyncio_run_encapsulated(*args, **kwargs):
    # _get_running_loop() returns None instead of raising when no loop
    # is running, so this also works from plain synchronous code.
    old_loop = asyncio._get_running_loop()
    try:
        # Hide the outer loop so asyncio.run() doesn't complain about
        # being called from inside a running event loop.
        asyncio._set_running_loop(None)
        return asyncio.run(*args, **kwargs)
    finally:
        asyncio._set_running_loop(old_loop)

def my_sync_api(...):
    return asyncio_run_encapsulated(my_async_api(...))
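
For example (my_async_api here is just a made-up coroutine for
illustration), the wrapper lets the sync API be called even from code
that's already running under asyncio:

import asyncio

async def my_async_api():
    # Stand-in for the library's real async implementation.
    await asyncio.sleep(0)
    return 42

async def caller():
    # We're inside a running loop, but the inner asyncio.run() succeeds
    # because the outer loop has been temporarily hidden.
    print(asyncio_run_encapsulated(my_async_api()))  # -> 42

asyncio.run(caller())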

But this is also a bit weird, because the check is useful. It's weird
that a blocking socket-module-based implementation and a blocking
asyncio-based implementation act differently, but arguably the way to
make them consistent is to fix the socket module so that it does give
an error if you try to issue blocking calls from inside asyncio,
rather than remove the error from asyncio. In fact newcomers often
make mistakes like using time.sleep or requests from inside async
code, and a common question is how to catch this in real code bases.

I wonder if we should have an interpreter-managed thread-local flag
"we're in async mode", and make blocking operations in the stdlib
check it. E.g. as a straw man, sys.set_allow_blocking(True/False),
sys.get_allow_blocking(), sys.check_allow_blocking() -> raises an
exception if sys.get_allow_blocking() is False, and then add calls to
sys.check_allow_blocking() in time.sleep, socket operations with
blocking mode enabled, etc. (And encourage third-party libraries that
do their own blocking I/O without going through the stdlib to add
similar calls.) Async I/O libraries (asyncio/trio/twisted/...) would
set the flag appropriately; and if someone like IPython *really wants*
to perform blocking operations inside async context, they can fiddle
with the flag themselves.
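
None of those sys functions exist today; the semantics I have in mind
would be roughly this pure-Python sketch (the real thing would live in
the interpreter, with the checks built into the blocking primitives
themselves):

import threading
import time as _time

_state = threading.local()

def set_allow_blocking(allowed):
    _state.allow_blocking = allowed

def get_allow_blocking():
    # Blocking is allowed by default, i.e. outside any async context.
    return getattr(_state, "allow_blocking", True)

def check_allow_blocking():
    if not get_allow_blocking():
        raise RuntimeError("blocking operation inside an async context")

def sleep(seconds):
    # What time.sleep (and blocking socket methods, etc.) would do first.
    check_allow_blocking()
    _time.sleep(seconds)

An async library would call set_allow_blocking(False) before running
user tasks and restore it afterwards; and since the flag is per-thread,
worker threads (e.g. run_in_executor) would start out with blocking
allowed.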

> I myself am -1 on making 'run_until_complete()' reentrant.  The separation of async/await code and blocking code is painful enough to some people, introducing another "hybrid" mode will ultimately do more damage than good.  E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.

Yeah, pumping the I/O loop from inside a task that's running on the
I/O loop is just a mess. It breaks the async/await readability
guarantees, it risks stack overflow, and by the time this stuff bites
you, you're going to have to backtrack a lonnng way to get to something
sensible. Trio definitely does not support this, and I will fight to
keep it that way :-).

Most traditional GUI I/O loops *do* allow this, and in the traditional
Twisted approach of trying to support all the I/O loop APIs on top of
each other, this can be a problem – if you want an adapter to run Qt
or Gtk apps on top of your favorite asyncio loop implementation, then
your loop implementation needs to support reentrancy. But I guess so
far people are OK with doing things the other way (implementing the
asyncio APIs on top of the standard GUI event loops). In Trio I have a
Cunning Scheme to avoid doing either approach, but we'll see how that
goes...

> In case of Jupyter I don't think it's a good idea for them to advertise nest_asyncio.  IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.
>
> The linked Jupyter issue (https://github.com/jupyter/notebook/issues/3397) is a good example: someone tries to call "asyncio.get_event_loop().run_until_complete(foo())" and the call fails.  Instead of recommending to use "nest_asyncio", Jupyter REPL could simply catch the error and suggest the user to await "foo()".  We can make that slightly easier by changing the exception type from RuntimeError to NestedAsyncioLoopError.  In other words, in the Jupyters case, I think it's a UI/UX problem, not an asyncio problem.

I think this might be too simplistic... Jupyter/IPython are in a
tricky place, where some users reasonably want to treat them like a
regular REPL, so calling 'asyncio.run(...)' should be supported (and
not supporting it would be a backcompat break). But, other users want
first-class async/await support integrated into some persistent event
loop. (And as Glyph points out, not supporting this is *also*
potentially a backcompat break, though probably a much less disruptive
one.)

To me the key observation is that in Jupyter/IPython, they want their
async/await support to work with multiple async library backends.
Therefore, they're not going to get away with letting people just
assume that there's some ambient Tornado-ish loop running -- they
*need* some way to hide that away as an implementation detail, and an
interface for users to state which async loop they want to use. Given
that, IMO it makes most sense for them to provide a sync context by
default, by whatever mechanism makes sense -- for a Jupyter
kernel, maybe this is a dedicated thread for running user code,
whatever. And then for Glyph and everyone who wants to access ambient
async functionality from inside the REPL, that's something you opt-in
to by running in a special mode, or writing %asyncio at the top of
your notebook.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

