[Async-sig] "Coroutines" sometimes run without being scheduled on an event loop

Chris Jerdonek chris.jerdonek at gmail.com
Thu May 3 16:56:12 EDT 2018


At this point it would probably be hard for people to find all the
places where they might be relying on this behavior (if anywhere), but
isn't this a basic documented property of coroutines?

From the introduction section on coroutines [1]:

> Calling a coroutine does not start its code running – the coroutine object returned by the call doesn’t do anything until you schedule its execution. There are two basic ways to start it running: call await coroutine or yield from coroutine from another coroutine (assuming the other coroutine is already running!), or schedule its execution using the ensure_future() function or the AbstractEventLoop.create_task() method.

> Coroutines (and tasks) can only run when the event loop is running.

[1]: https://docs.python.org/3/library/asyncio-task.html#coroutines
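For what it's worth, the difference can be reproduced without sockets or strace. This is only a minimal sketch, assuming a recent Python 3; the names lazy_coroutine and blocking_call are made up for illustration and do not come from asyncio. It contrasts a real "async def" coroutine object, which does nothing until the loop runs it, with loop.run_in_executor(), which hands the function to a worker thread as soon as it is called:

```python
import asyncio
import threading

started = threading.Event()   # set by the worker thread once it has run
log = []                      # records what has executed, in order


async def lazy_coroutine():
    # Hypothetical coroutine function: the body runs only when scheduled.
    log.append("coroutine body ran")
    return "coroutine result"


def blocking_call():
    # Hypothetical plain function handed to the default executor.
    log.append("executor function ran")
    started.set()
    return "executor result"


loop = asyncio.new_event_loop()
try:
    # Creating the coroutine object does not execute any of its body.
    coro = lazy_coroutine()
    log_after_coroutine_call = list(log)

    # run_in_executor() submits the function to a thread pool right away,
    # even though the event loop is not running yet.
    future = loop.run_in_executor(None, blocking_call)
    started.wait(timeout=5)
    log_before_loop_ran = list(log)

    # Only now does the loop run: the coroutine body executes here, and
    # the executor future's result is delivered to the loop.
    coroutine_result = loop.run_until_complete(coro)
    executor_result = loop.run_until_complete(future)
finally:
    loop.close()
```

Here the executor function has already run before the loop is ever started, while the coroutine body runs only inside run_until_complete(). That matches the strace output quoted below.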

--Chris

On Thu, May 3, 2018 at 1:24 PM, Guido van Rossum <gvanrossum at gmail.com> wrote:
> Depending on the coroutine *not* running sounds like asking for trouble.
>
> On Thu, May 3, 2018, 09:38 Andrew Svetlov <andrew.svetlov at gmail.com> wrote:
>>
>> What real problem do you want to solve?
>> Correct code should always use `await loop.sock_connect(sock, addr)`; in
>> this case the behavior difference never hurts you.
>>
>> On Thu, May 3, 2018 at 7:04 PM twisteroid ambassador
>> <twisteroid.ambassador at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> tl;dr: coroutine functions and regular functions returning Futures
>>> behave differently: the latter may start running immediately without
>>> being scheduled on a loop, or even with no loop running. This might be
>>> bad since the two are sometimes advertised to be interchangeable.
>>>
>>>
>>> I find that sometimes I want to construct a coroutine object, store it
>>> for some time, and run it later. Most times it works like one would
>>> expect: I call a coroutine function which gives me a coroutine object,
>>> I hold on to the coroutine object, I later await it or use
>>> loop.create_task(), asyncio.gather(), etc. on it, and only then it
>>> starts to run.
>>>
>>> However, I have found some cases where the "coroutine" starts running
>>> immediately. The first example is loop.run_in_executor(). I guess this
>>> is somewhat unsurprising since the passed function doesn't actually run
>>> in the event loop. Demonstrated below with strace and the interactive
>>> console:
>>>
>>> $ strace -e connect -f python3
>>> Python 3.6.5 (default, Apr  4 2018, 15:01:18)
>>> [GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> import asyncio
>>> >>> import socket
>>> >>> s = socket.socket()
>>> >>> loop = asyncio.get_event_loop()
>>> >>> coro = loop.sock_connect(s, ('127.0.0.1', 80))
>>> >>> loop.run_until_complete(asyncio.sleep(1))
>>> >>> task = loop.create_task(coro)
>>> >>> loop.run_until_complete(asyncio.sleep(1))
>>> connect(3, {sa_family=AF_INET, sin_port=htons(80),
>>> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
>>> refused)
>>> >>> s.close()
>>> >>> s = socket.socket()
>>> >>> coro2 = loop.run_in_executor(None, s.connect, ('127.0.0.1', 80))
>>> strace: Process 13739 attached
>>> >>> [pid 13739] connect(3, {sa_family=AF_INET, sin_port=htons(80),
>>> >>> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
>>>
>>> >>> coro2
>>> <Future pending cb=[_chain_future.<locals>._call_check_cancel() at
>>> /usr/lib64/python3.6/asyncio/futures.py:403]>
>>> >>> loop.run_until_complete(asyncio.sleep(1))
>>> >>> coro2
>>> <Future finished exception=ConnectionRefusedError(111, 'Connection
>>> refused')>
>>> >>>
>>>
>>> Note that with loop.sock_connect(), the connect syscall is only run
>>> after loop.create_task() is called on the coroutine AND the loop is
>>> running. On the other hand, as soon as loop.run_in_executor() is
>>> called on socket.connect, the connect syscall gets called, without the
>>> event loop running at all.
>>>
>>> Another such case is with Python 3.4.2, where even loop.sock_connect()
>>> will run immediately:
>>>
>>> $ strace -e connect -f python3
>>> Python 3.4.2 (default, Oct  8 2014, 10:45:20)
>>> [GCC 4.9.1] on linux
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> import socket
>>> >>> import asyncio
>>> >>> loop = asyncio.get_event_loop()
>>> >>> s = socket.socket()
>>> >>> c = loop.sock_connect(s, ('127.0.0.1', 82))
>>> connect(7, {sa_family=AF_INET, sin_port=htons(82),
>>> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
>>> refused)
>>> >>> c
>>> <Future finished exception=ConnectionRefusedError(111, 'Connection
>>> refused')>
>>> >>>
>>>
>>> In both these cases, the misbehaving "coroutines" aren't actually
>>> defined as coroutine functions, but regular functions returning a
>>> Future, which is probably why they don't act like coroutines. However,
>>> coroutine functions and regular functions returning Futures are often
>>> used interchangeably: Python docs Section 18.5.3.1 even says:
>>>
>>> > Note: In this documentation, some methods are documented as coroutines,
>>> > even if they are plain Python functions returning a Future. This is
>>> > intentional to have a freedom of tweaking the implementation of these
>>> > functions in the future.
>>>
>>> In particular, both run_in_executor() and sock_connect() are
>>> documented as coroutines.
>>>
>>> If an asyncio API may change from a function returning Future to a
>>> coroutine function and vice versa at any time, then one cannot rely on
>>> the behavior of creating the "coroutine object" not running the
>>> coroutine immediately. This seems like an important gotcha waiting to
>>> bite someone.
>>>
>>> Back to the scenario in the beginning. If I want to write a function
>>> that takes coroutine objects and schedules them to run later, and some
>>> coroutine objects turn out to be misbehaving like above, then they
>>> will run too early. To avoid this, I could either 1. pass the
>>> coroutine functions and their arguments separately "callback style",
>>> 2. use functools.partial or lambdas, or 3. always pass in real
>>> coroutine objects returned from coroutine functions defined with
>>> "async def". Does this sound right?
>>>
>>> Thanks,
>>>
>>> twistero
>>> _______________________________________________
>>> Async-sig mailing list
>>> Async-sig at python.org
>>> https://mail.python.org/mailman/listinfo/async-sig
>>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>>
>> --
>> Thanks,
>> Andrew Svetlov
>

