"Coroutines" sometimes run without being scheduled on an event loop
Hi,

tl;dr: coroutine functions and regular functions returning Futures behave differently: the latter may start running immediately without being scheduled on a loop, or even with no loop running. This might be bad, since the two are sometimes advertised as interchangeable.

I find that sometimes I want to construct a coroutine object, store it for some time, and run it later. Most of the time it works as one would expect: I call a coroutine function, which gives me a coroutine object; I hold on to the coroutine object; I later await it or use loop.create_task(), asyncio.gather(), etc. on it, and only then does it start to run.

However, I have found some cases where the "coroutine" starts running immediately. The first example is loop.run_in_executor(). I guess this is somewhat unsurprising, since the passed function doesn't actually run in the event loop. Demonstrated below with strace and the interactive console:

$ strace -e connect -f python3
Python 3.6.5 (default, Apr 4 2018, 15:01:18)
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import asyncio
>>> import socket
>>> s = socket.socket()
>>> loop = asyncio.get_event_loop()
>>> coro = loop.sock_connect(s, ('127.0.0.1', 80))
>>> loop.run_until_complete(asyncio.sleep(1))
>>> task = loop.create_task(coro)
>>> loop.run_until_complete(asyncio.sleep(1))
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
>>> s.close()
>>> s = socket.socket()
>>> coro2 = loop.run_in_executor(None, s.connect, ('127.0.0.1', 80))
strace: Process 13739 attached
[pid 13739] connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
>>> coro2
<Future pending cb=[_chain_future.<locals>._call_check_cancel() at /usr/lib64/python3.6/asyncio/futures.py:403]>
>>> loop.run_until_complete(asyncio.sleep(1))
>>> coro2
<Future finished exception=ConnectionRefusedError(111, 'Connection refused')>
Note that with loop.sock_connect(), the connect syscall is only made after loop.create_task() is called on the coroutine AND the loop is running. On the other hand, as soon as loop.run_in_executor() is called on socket.connect, the connect syscall is made, without the event loop running at all.

Another such case is with Python 3.4.2, where even loop.sock_connect() will run immediately:

$ strace -e connect -f python3
Python 3.4.2 (default, Oct 8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> import asyncio
>>> loop = asyncio.get_event_loop()
>>> s = socket.socket()
>>> c = loop.sock_connect(s, ('127.0.0.1', 82))
connect(7, {sa_family=AF_INET, sin_port=htons(82), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
>>> c
<Future finished exception=ConnectionRefusedError(111, 'Connection refused')>
In both these cases, the misbehaving "coroutines" aren't actually defined as coroutine functions, but as regular functions returning a Future, which is probably why they don't act like coroutines. However, coroutine functions and regular functions returning Futures are often used interchangeably; Python docs Section 18.5.3.1 even says:
Note: In this documentation, some methods are documented as coroutines, even if they are plain Python functions returning a Future. This is intentional to have a freedom of tweaking the implementation of these functions in the future.
In particular, both run_in_executor() and sock_connect() are documented as coroutines.

If an asyncio API may change from a function returning a Future to a coroutine function, and vice versa, at any time, then one cannot rely on creating the "coroutine object" not running the coroutine immediately. This seems like an important gotcha waiting to bite someone.

Back to the scenario at the beginning. If I want to write a function that takes coroutine objects and schedules them to run later, and some of the coroutine objects turn out to misbehave like the above, then they will run too early. To avoid this, I could either 1. pass the coroutine functions and their arguments separately, "callback style", 2. use functools.partial or lambdas, or 3. always pass in real coroutine objects returned from coroutine functions defined with "async def". Does this sound right?

Thanks,
twistero
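(Editorial sketch of options 2 and 3 from the list above, assuming Python 3.5+; the helper name delayed_connect and the sample address are made up for illustration, and nothing runs until the deferred object is called or awaited.)

import asyncio
import functools
import socket

loop = asyncio.get_event_loop()
sock = socket.socket()
sock.setblocking(False)  # loop.sock_connect() expects a non-blocking socket

# Option 2: hide the call behind functools.partial; the partial itself does
# nothing until it is called (and the result awaited or scheduled).
deferred = functools.partial(loop.sock_connect, sock, ('127.0.0.1', 80))

# Option 3: wrap the call in a real coroutine function; calling
# delayed_connect() only creates a coroutine object and connects nothing.
async def delayed_connect(sock, addr):
    await loop.sock_connect(sock, addr)

coro = delayed_connect(sock, ('127.0.0.1', 80))

# Either form can be handed around and only scheduled later, e.g.:
#   loop.run_until_complete(deferred())
#   loop.run_until_complete(coro)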
What real problem do you want to solve? Correct code should always use `await loop.sock_connect(sock, addr)`; in this case the behavior difference never hurts you.

On Thu, May 3, 2018 at 7:04 PM twisteroid ambassador <twisteroid.ambassador@gmail.com> wrote:
-- Thanks, Andrew Svetlov
Depending on the coroutine *not* running sounds like asking for trouble.

On Thu, May 3, 2018, 09:38 Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
It would probably be hard for people to find at this point all the places they might be relying on this behavior (if anywhere), but isn't this a basic documented property of coroutines?
From the introduction section on coroutines [1]:
Calling a coroutine does not start its code running – the coroutine object returned by the call doesn’t do anything until you schedule its execution. There are two basic ways to start it running: call await coroutine or yield from coroutine from another coroutine (assuming the other coroutine is already running!), or schedule its execution using the ensure_future() function or the AbstractEventLoop.create_task() method.
Coroutines (and tasks) can only run when the event loop is running.
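(As an editorial aside, a minimal illustration of that documented property; the function name is made up.)

import asyncio

async def hello():
    print("running")      # not printed when hello() is merely called
    return 42

coro = hello()            # just creates a coroutine object; nothing runs yet
print(type(coro))         # <class 'coroutine'>

loop = asyncio.get_event_loop()
print(loop.run_until_complete(coro))   # prints "running", then 42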
[1]: https://docs.python.org/3/library/asyncio-task.html#coroutines

--Chris

On Thu, May 3, 2018 at 1:24 PM, Guido van Rossum <gvanrossum@gmail.com> wrote:
Depending on the coroutine *not* running sounds like asking for trouble.
I doubt we should specify such things very explicitly. Call it an "implementation detail" :)

FYI, in Python 3.7 all `sock_*()` methods are native coroutines now. `run_in_executor()` is a regular function that returns a future object. I don't remember whether it is the only exception or whether asyncio has other functions with such a return type.
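(Editorial sketch of one way to get deferred, coroutine-like behavior out of run_in_executor() by wrapping it in an async def; deferred_in_executor is a made-up helper, not an asyncio API.)

import asyncio

async def deferred_in_executor(func, *args):
    """Run func(*args) in the default executor, but only once awaited."""
    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(None, func, *args)

# Calling this only creates a coroutine object; nothing is submitted to the
# executor until the coroutine is awaited or wrapped in a task, e.g.:
#   coro = deferred_in_executor(sock.connect, ('127.0.0.1', 80))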
-- Thanks, Andrew Svetlov
Then, perhaps it's safest to treat <coroutines intended to be run later> the same way as <functions intended to be run later>, i.e. callbacks, and pass them around the old way, using partials, lambdas, separate arguments for the coroutine function and its arguments, etc. Aww, suddenly coroutines don't feel as sexy as before. (j/k)

On Fri, May 4, 2018 at 4:24 AM, Guido van Rossum <gvanrossum@gmail.com> wrote:
The real problem I'm playing with is implementing "happy eyeballs", where I may have several sockets attempting to connect simultaneously, and the first one to connect successfully gets used. I had the idea of preparing all of the loop.sock_connect() coroutine objects in advance and scheduling them one by one on the loop, but wanted to make doubly sure that the sockets won't start connecting before the coroutines are scheduled. I wanted to write something like this:

successful_socket = await staggered_start([loop.sock_connect(socket.socket(), addr) for addr in addresses])

where async def staggered_start(coros) is some kind of reusable scheduling logic. As it turns out, I can't actually depend on loop.sock_connect() doing the Right Thing (TM) if I want to support Python 3.4.
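(Editorial sketch: one way to keep that shape without anything starting early is to pass zero-argument callables, e.g. functools.partial objects, instead of coroutine objects. The staggered_start name and the 0.3-second default delay are made up, not asyncio APIs, and this particular variant starts the next attempt early whenever any attempt in flight fails.)

import asyncio

async def staggered_start(factories, delay=0.3):
    """Start each awaitable-returning callable `delay` seconds apart and
    return the result of the first one that finishes without raising."""
    pending = set()
    try:
        for factory in factories:
            pending.add(asyncio.ensure_future(factory()))
            # Wait up to `delay`; returns earlier if any attempt finishes.
            done, pending = await asyncio.wait(
                pending, timeout=delay, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
        # All attempts have been started; wait for the stragglers.
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
        raise ConnectionError('all connection attempts failed')
    finally:
        for task in pending:
            task.cancel()

# Hypothetical usage with partials instead of coroutine objects:
#   factories = [functools.partial(connect_and_return_socket, addr)
#                for addr in addresses]
#   successful_socket = await staggered_start(factories)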
My 2c: don't use py3.4; in fact don't use 3.5 either :) If you decide to support older Python versions, it's only fair that a separate implementation may be needed.

Re: the overall problem, why not try the following: wrap your individual tasks in an async def, where each one staggers, connects, resolves, and handles cancellation (if it didn't win the race). IMO that's easier to reason about, debug, and it works around your problem ;)

On Fri, 4 May 2018 at 9:34 AM, twisteroid ambassador <twisteroid.ambassador@gmail.com> wrote:
On Thu, May 3, 2018 at 8:52 PM, Dima Tisnek <dimaqq@gmail.com> wrote:
My 2c: don't use py3.4; in fact don't use 3.5 either :) If you decide to support older Python versions, it's only fair that separate implementation may be needed.
I'd agree - focus Python 3.6+
Re: overall problem, why not try the following: wrap your individual tasks in async def, where each staggers, connects and resolves and handles cancellation (if it didn't win the race). IMO that's easier to reason about, debug and works around your problem ;)
On Fri, 4 May 2018 at 9:34 AM, twisteroid ambassador < twisteroid.ambassador@gmail.com> wrote:
The real problem I'm playing with is implementing "happy eyeballs", where I may have several sockets attempting to connect simultaneously, and the first one to successfully connect gets used. I had the idea of
Simpler is better ... this isn't an asyncio example, but maybe the readability (ymmv? For me - very clearly readable) is worth a ponder: https://github.com/dabeaz/curio/blob/master/README.rst#a-complex-example
On Fri, May 4, 2018 at 12:11 PM, Yarko Tymciurak <yarkot1@gmail.com> wrote:
On Thu, May 3, 2018 at 8:52 PM, Dima Tisnek <dimaqq@gmail.com> wrote:
My 2c: don't use py3.4; in fact don't use 3.5 either :) If you decide to support older Python versions, it's only fair that separate implementation may be needed.
I'd agree - focus Python 3.6+
Oh, I'm not going to support py3.4, if only for the sweet async def and await syntax ;-) I dug it out for demonstration purposes, as an example that asyncio APIs do change in ways that matter for the problem discussed in the OP.
Re: overall problem, why not try the following: wrap your individual tasks in async def, where each staggers, connects and resolves and handles cancellation (if it didn't win the race). IMO that's easier to reason about, debug and works around your problem ;)
On Fri, 4 May 2018 at 9:34 AM, twisteroid ambassador <twisteroid.ambassador@gmail.com> wrote:
The real problem I'm playing with is implementing "happy eyeballs", where I may have several sockets attempting to connect simultaneously, and the first one to successfully connect gets used. I had the idea of
Simpler is better ... this isn't an asyncio example, but maybe the readability (ymmv? For me - very clearly readable) is worth a ponder:
https://github.com/dabeaz/curio/blob/master/README.rst#a-complex-example
Thanks for mentioning that. In fact, what prompted all this is the recent article on trio, which mentioned happy eyeballs, which then reminded me that I have 2 separate implementations of staggered-start-return-first-successful-cancel-all-others logic in one of my projects, and they both look ugly as sin, so I should probably try to improve them. So now I have looked at trio's implementation ( https://github.com/python-trio/trio/pull/145/files ), curio's (above), and a bug report for Twisted ( https://twistedmatrix.com/trac/ticket/9345 ). One thing that struck me is that these implementations all have subtly different behavior. They all start the next connection when the previous one doesn't complete (either succeed or fail) within `delay`, but:

- trio starts the next connection early if the immediately preceding one fails;
- curio starts the next connection early if any of the connections still in flight fails;
- twisted does not start the next connection early at all.

(One of my implementations does the same thing as curio; the other starts early if there are no longer any connections in flight, i.e. all previous connections have failed.)
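(Editorial sketch, for illustration only: the difference between those variants mostly comes down to what the starter loop waits on between attempts. Below is a hedged sketch of the curio-style "start the next attempt early if any attempt in flight fails" flavour, using a shared asyncio.Event; all names are made up.)

import asyncio

async def attempt(connect, failed_event):
    """One connection attempt; signal failed_event if it fails."""
    try:
        return await connect()
    except OSError:
        failed_event.set()
        raise

async def delay_before_next(failed_event, delay):
    """Wait `delay` seconds, or less if some in-flight attempt has failed."""
    try:
        await asyncio.wait_for(failed_event.wait(), delay)
        failed_event.clear()   # a failure happened: start the next attempt now
    except asyncio.TimeoutError:
        pass                   # full delay elapsed without any failure

# trio-style behavior would use one Event per attempt (only the immediately
# preceding attempt can cut the delay short); twisted-style behavior would
# simply use asyncio.sleep(delay) here.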
participants (6)

- Andrew Svetlov
- Chris Jerdonek
- Dima Tisnek
- Guido van Rossum
- twisteroid ambassador
- Yarko Tymciurak