[Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio
Yury Selivanov
yselivanov at gmail.com
Wed Mar 27 14:22:16 EDT 2019
> On Mar 27, 2019, at 1:11 AM, Glyph <glyph at twistedmatrix.com> wrote:
>
>
>
>> On Mar 26, 2019, at 11:56 AM, Yury Selivanov <yselivanov at gmail.com> wrote:
>>
>>
>>> On Mar 25, 2019, at 8:01 PM, Guido van Rossum <guido at python.org> wrote:
>>>
>>> Given PBP, I wonder if we should just relent and have a configurable flag (off by default) to allow nested loop invocations (both the same loop and a different loop).
>>>
>>
>>
>> I think that if we implement this feature behind a flag then some libraries will start requiring that flag to be set. Which will inevitably lead us to a situation where it's impossible to use asyncio without the flag. Therefore I suppose we should either just implement this behaviour by default or defer this to 3.9 or later.
>
> How do you feel about my proposal of making the "flag" be simply an argument to run_until_complete? If what you really want to do is start a task or await a future, you should get notified that you're reentrantly blocking; but if you're sure, just pass the arg and be on your way.
I'm not sure how making it an argument solves anything; please help me understand. I see two scenarios here:
1. Migrating a big codebase to asyncio. This is something that Ben mentioned, and I also happen to know that a couple of big companies have been struggling with it. In this case a reentrant event loop can ease the migration pain. But if it's enabled via an argument to run_until_complete/run_forever, you will have to blindly enable it across your entire source tree in order to take advantage of it. The argument then becomes, effectively, a global flag, and it discourages you from actually refactoring and fixing your code properly.
2. The Jupyter case. Their users would still copy/paste asyncio.run()/loop.run_until_complete() snippets and get an error; they would still have to add "reentrant=True" by hand (see the sketch below). From a usability standpoint, this new argument would not be a significant improvement as I see it.
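To make the Jupyter scenario concrete, here is a minimal sketch of the kind of snippet users paste into a notebook cell today and the error they hit, because the kernel itself already runs an asyncio loop. The "reentrant=True" argument at the end is purely hypothetical -- it's the proposal under discussion, not an existing parameter:

    import asyncio

    async def foo():
        await asyncio.sleep(1)
        return 42

    loop = asyncio.get_event_loop()
    loop.run_until_complete(foo())
    # RuntimeError: This event loop is already running

    # Under the proposal, the user would still have to edit the snippet:
    # loop.run_until_complete(foo(), reentrant=True)   # hypothetical argument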
>
> If it's a "flag" like an env var or some kind of global switch, then I totally agree with you.
>
>> I myself am -1 on making 'run_until_complete()' reentrant. The separation of async/await code and blocking code is painful enough to some people, introducing another "hybrid" mode will ultimately do more damage than good. E.g. it's hard to reason about this even for me: I simply don't know if I can make uvloop (or asyncio) fully reentrant.
>
> If uvloop has problems with global state that prevent reentrancy, fine - for the use-cases where you're doing this, you already kind of implicitly don't care about performance; someone can instantiate their own, safe loop. (If you can't do this with asyncio though I kinda wonder what's going on.)
Both asyncio & uvloop have global state: child process watchers, global system hooks, and signal handlers.
Re child process watchers: asyncio manages them itself by monitoring SIGCHLD etc., while uvloop offloads the problem to libuv entirely. Both still have weird bugs where multiple event loops in the same process lose track of their subprocesses.
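For illustration, a minimal Unix-only sketch using asyncio.get_child_watcher(): the watcher is a single per-process object that gets attached to one loop at a time, which is exactly the kind of global state that makes multiple loops fragile:

    import asyncio

    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()

    watcher = asyncio.get_child_watcher()   # one watcher for the whole process
    watcher.attach_loop(loop_a)             # SIGCHLD reaping is now tied to loop_a

    # If loop_b also spawns subprocesses, their exit statuses go through the
    # same global watcher, which is how loops end up losing track of children.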
Re signal handlers: they are set globally for both asyncio & uvloop, and running a nested loop with code that doesn't expect to be run that way can create hard-to-debug problems. We probably need a new signals API in asyncio (I quite like the signals API in Trio).
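A minimal Unix-only sketch of why this is global state: loop.add_signal_handler() ultimately installs a process-wide C-level handler and wakeup fd, so a second (nested) loop that registers its own handler effectively takes the signal away from the first one:

    import asyncio
    import signal

    def make_handler(name):
        def handler():
            print(f"SIGINT handled by the {name} loop")
        return handler

    outer = asyncio.new_event_loop()
    outer.add_signal_handler(signal.SIGINT, make_handler("outer"))

    inner = asyncio.new_event_loop()
    inner.add_signal_handler(signal.SIGINT, make_handler("inner"))
    # From here on, Ctrl-C is delivered to the inner loop only, even if the
    # application thinks the outer loop is the one in charge.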
Re global system hooks: the hooks that intercept async generator creation/GC are global. Event loops do save/restore them in their various run() methods, so this shouldn't be a problem, but it's still a piece of global state to be aware of.
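Concretely, this is the sys.set_asyncgen_hooks() state; a minimal sketch of the save/restore dance a loop's run methods perform (my_firstiter/my_finalizer are hypothetical placeholder callbacks):

    import sys

    def my_firstiter(agen):   # hypothetical: called when an async generator is first iterated
        pass

    def my_finalizer(agen):   # hypothetical: called when an async generator is finalized
        pass

    old_hooks = sys.get_asyncgen_hooks()           # save whatever the outer loop installed
    sys.set_asyncgen_hooks(firstiter=my_firstiter,
                           finalizer=my_finalizer)
    try:
        pass                                       # ... run the (nested) loop here ...
    finally:
        sys.set_asyncgen_hooks(*old_hooks)         # restore the outer loop's hooks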
I'm also not entirely sure that it's safe to mix uvloop with vanilla asyncio in the same process.
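Part of the reason is that the event loop policy itself is process-global; a minimal sketch (assuming uvloop is installed) of the two kinds of loops ending up in one process without sharing any of the state above in a coordinated way:

    import asyncio
    import uvloop

    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())  # process-global switch
    uv_loop = asyncio.new_event_loop()                       # a uvloop.Loop

    vanilla_loop = asyncio.SelectorEventLoop()               # bypasses the policy
    # Both loops now coexist, but signal handlers, child watchers and asyncgen
    # hooks are not coordinated between them.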
>
>> In case of Jupyter I don't think it's a good idea for them to advertise nest_asyncio. IMHO the right approach would be to encourage library developers to expose async/await APIs and teach Jupyter users to "await" on async code directly.
>
> ✨💖✨
>
>> The linked Jupyter issue (https://github.com/jupyter/notebook/issues/3397) is a good example: someone tries to call "asyncio.get_event_loop().run_until_complete(foo())" and the call fails. Instead of recommending to use "nest_asyncio", Jupyter REPL could simply catch the error and suggest the user to await "foo()". We can make that slightly easier by changing the exception type from RuntimeError to NestedAsyncioLoopError. In other words, in the Jupyters case, I think it's a UI/UX problem, not an asyncio problem.
>
> So, you may not be able to `await` right now, today, from a cell, given that that needs some additional support. But you can create_task just fine, right? Making await-with-no-indentation work seamlessly would be beautiful but I don't think we need to wait for that modification to get made in order to enjoy the benefits of proper asynchrony.
I was under the impression that Jupyter already allows top-level "await" expressions. If that's true, then both "await foo()" and "asyncio.create_task(foo())" should work.
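For example, assuming the frontend supports top-level await (recent IPython/ipykernel do), either of these should work inside a notebook cell without touching the loop directly:

    import asyncio

    async def foo():
        await asyncio.sleep(0.1)
        return "done"

    result = await foo()                 # top-level await in the cell
    task = asyncio.create_task(foo())    # or schedule it on the running kernel loop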
Yury