Re: [Python-ideas] async/await in Python
1. Overall I like the proposal very much. However, I have one semantic remark. You propose `async for` as a syntax for asynchronous iterators:

    async for row in Cursor():
        print(row)

Wouldn't it be more semantically correct to use `await for` instead of `async for`?

    await for row in Cursor():
        print(row)

For me the word 'await' indicates that I am awaiting some value being returned. For example, with a simple `await` expression I am awaiting data being fetched from the db:

    data = await db.fetch('SELECT ...')

When I use an asynchronous iterator I am awaiting a value being returned as well. For example, I am awaiting (in each iteration) a row from a cursor. Therefore, it seems natural to me to use the word 'await' instead of 'async'. Furthermore, the syntax 'await for row in cursor' resembles natural English.

On the other hand, when I use a context manager, I am not awaiting any value, so the syntax `async with` seems proper in that case:

    async with session.transaction():
        ...
        await session.update(data)

Dart, for example, goes that way. It uses an `await` expression for awaiting a single Future and an `await for` statement for asynchronous iterators:

    await for (variable declaration in expression) {
      // Executes each time the stream emits a value.
    }

2. I would like to go a little beyond this proposal and think about composition of async coroutines (aka waiting for multiple coroutines). For example, C# has the helper functions WhenAll and WhenAny for that:

    await Task.WhenAll(tasks_list);
    await Task.WhenAny(tasks_list);

In the asyncio module there is a function asyncio.wait() which can be used to achieve a similar result:

    asyncio.wait(fs, timeout=None, return_when=ALL_COMPLETED)
    asyncio.wait(fs, timeout=None, return_when=FIRST_COMPLETED)

However, after the introduction of `await` its name becomes problematic. First, it resembles `await` too much and can cause confusion. Second, its usage would result in an awkward 'await wait':

    done, pending = await asyncio.wait(coroutines_list)
    results = []
    for task in done:
        results.append(task.result())

Another problem with asyncio.wait() is that it returns Tasks, not their results directly, so the user has to unpack them. There is a function asyncio.gather(*coros_or_futures) which returns the results list directly; however, it can only be used for the ALL_COMPLETED case. There is also a function asyncio.wait_for() which (unlike asyncio.wait()) unpacks the result, but can only be used for one coroutine (so what is the difference from an `await` expression?). Finally, there is asyncio.as_completed(), which returns an iterator over coroutine results as they complete (but I don't know how exactly this iterator relates to the async iterators proposed here).

I can imagine a set of three functions being exposed to the user to control waiting for multiple coroutines:

    asynctools.as_done()   # returns an asynchronous iterator over the results of coroutines as they complete
    asynctools.all_done()  # returns a future aggregating results from the given coroutine objects; when awaited, it returns a list of results (like asyncio.gather())
    asynctools.any_done()  # returns a future which, when awaited, returns the result of the first completed coroutine

Example:

    from asynctools import as_done, all_done, any_done

    corobj0 = async_sql_query("SELECT...")
    corobj1 = async_memcached_get("someid")
    corobj2 = async_http_get("http://python.org")

    # ------------------------------------------------

    # Iterate over results as coroutines complete
    # using async iterator

    await for result in as_done([corobj0, corobj1, corobj2]):
        print(result)

    # ------------------------------------------------

    # Await for results of all coroutines
    # using async iterator

    results = []
    await for result in as_done([corobj0, corobj1, corobj2]):
        results.append(result)

    # or using shorthand coroutine all_done()

    results = await all_done([corobj0, corobj1, corobj2])

    # ------------------------------------------------

    # Await for a result of first completed coroutine
    # using async iterator

    await for result in as_done([corobj0, corobj1, corobj2]):
        first_result = result
        break

    # or using shorthand coroutine any_done()

    first_result = await any_done([corobj0, corobj1, corobj2])

I deliberately placed these functions in a new asynctools module, not in the asyncio module. I find the asyncio module too complicated to expose to an ordinary user. There are four very similar concepts used in it: Coroutine (function), Coroutine (object), Future and Task. In addition, many functions accept both coroutines and Futures in the same argument, and Task is a subclass of Future, which makes people very confused. It is difficult to grasp what the differences between them are and how they relate to each other. For comparison, in JavaScript there are only two concepts: async functions and Promises.

(Furthermore, after this PEP is accepted there will be a fifth concept: old-style coroutines. And there are also concurrent.futures.Futures...)

Personally, I think that the asyncio module should be refactored and broken into two separate modules, named for example:

- asyncloop   # containing low-level loop-related things, mostly not intended to be used by the average user (apart from get_event_loop() and run_until_xxx())
- asynctools  # containing high-level helper functions, like those described before

Since with this PEP async/await will become a first-class member of the Python environment, all the remaining high-level functions should, in my opinion, be moved from asyncio to the appropriate modules, like socket or subprocess. These are the places where users will be looking for them. For example:

    socket.socket.recv()
    socket.socket.recv_async()
    socket.socket.sendall()
    socket.socket.sendall_async()
    socket.getaddrinfo()
    socket.getaddrinfo_async()

Finally, concurrent.futures should either be renamed to avoid the usage of the word 'future', or be made compatible with async/await.

I know that I went far beyond the scope of this PEP, but I think that these are issues which will pop up sooner or later after its acceptance.

Finally, a reminder of my proposal from the beginning of this email: to use `await for` instead of `async for` for asynchronous iterators. What's your opinion about that?

Piotr
Hi Piotr,

Thank you very much for your detailed feedback. Answers below:

On 2015-04-18 11:19 PM, Piotr Jurkiewicz wrote:
> 1. Overall I like the proposal very much. However, I have one semantic remark. You propose `async for` as a syntax for asynchronous iterators:
>
>     async for row in Cursor():
>         print(row)
>
> Wouldn't it be more semantically correct to use `await for` instead of `async for`?
>
>     await for row in Cursor():
>         print(row)
I like that the current proposal is simple. You use the 'await' keyword only to call a coroutine, and you use 'async' only as a modifier, i.e. 'async def' defines a coroutine, 'async for' is an asynchronous iteration block, etc. The less confusion we have, the better.
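To make the division of roles concrete, here is a minimal sketch, assuming Python 3.7+ (for `asyncio.run()`). The `Cursor` and `Transaction` classes are hypothetical stand-ins written for this illustration, not part of the PEP or of any library. 'async' appears only as a modifier on `def`, `with` and `for`, while 'await' appears only where a coroutine is actually awaited.

```python
import asyncio


class Cursor:
    """Hypothetical asynchronous iterator that yields three rows."""

    def __init__(self):
        self._rows = iter(["row0", "row1", "row2"])

    def __aiter__(self):
        return self

    async def __anext__(self):
        await asyncio.sleep(0)  # stand-in for a database round trip
        try:
            return next(self._rows)
        except StopIteration:
            raise StopAsyncIteration


class Transaction:
    """Hypothetical asynchronous context manager that logs enter/exit."""

    def __init__(self, log):
        self._log = log

    async def __aenter__(self):
        self._log.append("begin")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._log.append("commit" if exc_type is None else "rollback")
        return False  # do not swallow exceptions


async def main():                      # 'async' modifies 'def'
    log = []
    async with Transaction(log):       # 'async' modifies 'with'
        async for row in Cursor():     # 'async' modifies 'for'
            log.append(row)
    await asyncio.sleep(0)             # 'await' is used only on an awaitable
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```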
> For me the word 'await' indicates that I am awaiting some value being returned. For example, with a simple `await` expression I am awaiting data being fetched from the db:
>
>     data = await db.fetch('SELECT ...')
>
> When I use an asynchronous iterator I am awaiting a value being returned as well. For example, I am awaiting (in each iteration) a row from a cursor. Therefore, it seems natural to me to use the word 'await' instead of 'async'. Furthermore, the syntax 'await for row in cursor' resembles natural English.
To me it reads differently. A 'for' statement has no value in Python; it's a block of code. Hence we use 'async' to mark it as an asynchronous block of code.
> On the other hand, when I use a context manager, I am not awaiting any value, so the syntax `async with` seems proper in that case:
>
>     async with session.transaction():
>         ...
>         await session.update(data)
>
> Dart, for example, goes that way. It uses an `await` expression for awaiting a single Future and an `await for` statement for asynchronous iterators:
>
>     await for (variable declaration in expression) {
>       // Executes each time the stream emits a value.
>     }
>
> 2. I would like to go a little beyond this proposal and think about composition of async coroutines (aka waiting for multiple coroutines). For example, C# has the helper functions WhenAll and WhenAny for that:
>
>     await Task.WhenAll(tasks_list);
>     await Task.WhenAny(tasks_list);
>
> In the asyncio module there is a function asyncio.wait() which can be used to achieve a similar result:
>
>     asyncio.wait(fs, timeout=None, return_when=ALL_COMPLETED)
>     asyncio.wait(fs, timeout=None, return_when=FIRST_COMPLETED)
>
> However, after the introduction of `await` its name becomes problematic. First, it resembles `await` too much and can cause confusion. Second, its usage would result in an awkward 'await wait':
>
>     done, pending = await asyncio.wait(coroutines_list)
>     results = []
>     for task in done:
>         results.append(task.result())
>
> Another problem with asyncio.wait() is that it returns Tasks, not their results directly, so the user has to unpack them. There is a function asyncio.gather(*coros_or_futures) which returns the results list directly; however, it can only be used for the ALL_COMPLETED case. There is also a function asyncio.wait_for() which (unlike asyncio.wait()) unpacks the result, but can only be used for one coroutine (so what is the difference from an `await` expression?). Finally, there is asyncio.as_completed(), which returns an iterator over coroutine results as they complete (but I don't know how exactly this iterator relates to the async iterators proposed here).
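For readers comparing the four helpers named above, here is a minimal side-by-side sketch, assuming Python 3.7+ and the `async`/`await` syntax as eventually proposed. The `fetch()` coroutine is a hypothetical stand-in for real I/O. It also shows what `asyncio.wait_for()` adds over a bare `await`: a timeout.

```python
import asyncio


async def fetch(name, delay):
    """Hypothetical coroutine standing in for a database or HTTP call."""
    await asyncio.sleep(delay)
    return name


async def main():
    # asyncio.wait() returns two sets of tasks; results must be unpacked.
    tasks = [asyncio.ensure_future(fetch(n, d))
             for n, d in [("a", 0.02), ("b", 0.01)]]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.ALL_COMPLETED)
    waited = sorted(task.result() for task in done)

    # asyncio.gather() returns the results directly, in argument order.
    gathered = await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

    # asyncio.wait_for() awaits a single awaitable but enforces a timeout.
    try:
        await asyncio.wait_for(fetch("slow", 1.0), timeout=0.01)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True

    # asyncio.as_completed() yields awaitables in completion order.
    completed = []
    for next_done in asyncio.as_completed([fetch("a", 0.05), fetch("b", 0.01)]):
        completed.append(await next_done)

    return waited, gathered, timed_out, completed


if __name__ == "__main__":
    print(asyncio.run(main()))
```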
> I can imagine a set of three functions being exposed to the user to control waiting for multiple coroutines:
>
>     asynctools.as_done()   # returns an asynchronous iterator over the results of coroutines as they complete
>     asynctools.all_done()  # returns a future aggregating results from the given coroutine objects; when awaited, it returns a list of results (like asyncio.gather())
>     asynctools.any_done()  # returns a future which, when awaited, returns the result of the first completed coroutine
>
> Example:
>
>     from asynctools import as_done, all_done, any_done
>
>     corobj0 = async_sql_query("SELECT...")
>     corobj1 = async_memcached_get("someid")
>     corobj2 = async_http_get("http://python.org")
>
>     # ------------------------------------------------
>
>     # Iterate over results as coroutines complete
>     # using async iterator
>
>     await for result in as_done([corobj0, corobj1, corobj2]):
>         print(result)
>
>     # ------------------------------------------------
>
>     # Await for results of all coroutines
>     # using async iterator
>
>     results = []
>     await for result in as_done([corobj0, corobj1, corobj2]):
>         results.append(result)
>
>     # or using shorthand coroutine all_done()
>
>     results = await all_done([corobj0, corobj1, corobj2])
>
>     # ------------------------------------------------
>
>     # Await for a result of first completed coroutine
>     # using async iterator
>
>     await for result in as_done([corobj0, corobj1, corobj2]):
>         first_result = result
>         break
>
>     # or using shorthand coroutine any_done()
>
>     first_result = await any_done([corobj0, corobj1, corobj2])
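As a rough illustration of how such helpers could be layered on the existing asyncio functions, here is a sketch assuming Python 3.7+ (asynchronous generators and `asyncio.run()`) and the `async for` spelling rather than `await for`. The names `as_done`, `all_done` and `any_done` come from the message above; these bodies are only one possible implementation and no `asynctools` module is assumed to exist. Unlike the description above, the last two are plain coroutine functions rather than functions returning futures.

```python
import asyncio


async def as_done(coros):
    """Asynchronous generator yielding results in completion order."""
    for next_done in asyncio.as_completed(list(coros)):
        yield await next_done


async def all_done(coros):
    """Return a list of all results, in argument order."""
    return await asyncio.gather(*coros)


async def any_done(coros):
    """Return one result from the first batch to complete; cancel the rest."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks unwind before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return done.pop().result()


async def job(name, delay):
    """Hypothetical coroutine used only for the demonstration."""
    await asyncio.sleep(delay)
    return name


async def main():
    ordered = [r async for r in as_done([job("slow", 0.05), job("fast", 0.01)])]
    everything = await all_done([job("slow", 0.02), job("fast", 0.01)])
    first = await any_done([job("slow", 0.5), job("fast", 0.01)])
    return ordered, everything, first


if __name__ == "__main__":
    print(asyncio.run(main()))
```

One design choice worth noting: this `any_done()` cancels the coroutines that lose the race, which the message above does not specify; a variant could leave them running instead.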
> I deliberately placed these functions in a new asynctools module, not in the asyncio module. I find the asyncio module too complicated to expose to an ordinary user. There are four very similar concepts used in it: Coroutine (function), Coroutine (object), Future and Task. In addition, many functions accept both coroutines and Futures in the same argument, and Task is a subclass of Future, which makes people very confused. It is difficult to grasp what the differences between them are and how they relate to each other. For comparison, in JavaScript there are only two concepts: async functions and Promises.
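The four concepts can be told apart at runtime. The following is a minimal sketch, assuming Python 3.7+ and the default asyncio event loop; `fetch()` is a hypothetical coroutine function used only for illustration.

```python
import asyncio
import inspect


async def fetch():
    """Coroutine function: calling it creates a coroutine object."""
    return 42


async def main():
    coro = fetch()  # coroutine object; the body has not started running yet
    kinds = {
        "function": inspect.iscoroutinefunction(fetch),
        "object": inspect.iscoroutine(coro),
    }

    # Task: wraps a coroutine object and schedules it on the event loop.
    task = asyncio.ensure_future(coro)
    kinds["task_is_future"] = isinstance(task, asyncio.Future)

    # Future: a bare placeholder for a result, with no coroutine behind it.
    future = asyncio.get_running_loop().create_future()
    future.set_result("done")
    kinds["future_is_task"] = isinstance(future, asyncio.Task)

    return kinds, await task, await future


if __name__ == "__main__":
    print(asyncio.run(main()))
```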
> (Furthermore, after this PEP is accepted there will be a fifth concept: old-style coroutines. And there are also concurrent.futures.Futures...)
>
> Personally, I think that the asyncio module should be refactored and broken into two separate modules, named for example:
>
> - asyncloop   # containing low-level loop-related things, mostly not intended to be used by the average user (apart from get_event_loop() and run_until_xxx())
> - asynctools  # containing high-level helper functions, like those described before
>
> Since with this PEP async/await will become a first-class member of the Python environment, all the remaining high-level functions should, in my opinion, be moved from asyncio to the appropriate modules, like socket or subprocess. These are the places where users will be looking for them. For example:
>
>     socket.socket.recv()
>     socket.socket.recv_async()
>     socket.socket.sendall()
>     socket.socket.sendall_async()
>     socket.getaddrinfo()
>     socket.getaddrinfo_async()
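For context, the `*_async` names above are a proposal only. In asyncio the corresponding operations are methods on the event loop, such as `loop.sock_recv()`, `loop.sock_sendall()` and `loop.getaddrinfo()`. Here is a minimal sketch of that existing style, assuming Python 3.7+ and a platform where `socket.socketpair()` is available; `echo_once()` is a hypothetical helper written for this illustration.

```python
import asyncio
import socket


async def echo_once(payload):
    """Send payload across a local socket pair using the loop's socket helpers."""
    loop = asyncio.get_running_loop()
    left, right = socket.socketpair()
    # The loop's sock_* methods expect non-blocking sockets.
    left.setblocking(False)
    right.setblocking(False)
    try:
        await loop.sock_sendall(left, payload)
        # For a small payload a single recv is enough in practice.
        return await loop.sock_recv(right, len(payload))
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(asyncio.run(echo_once(b"ping")))
```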
> Finally, concurrent.futures should either be renamed to avoid the usage of the word 'future', or be made compatible with async/await.
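On the compatibility point, asyncio already offers two bridges between the two kinds of future: `loop.run_in_executor()` and `asyncio.wrap_future()`. A minimal sketch, assuming Python 3.7+; `blocking_square()` is a hypothetical blocking function used only for the demonstration.

```python
import asyncio
import concurrent.futures


def blocking_square(x):
    """Ordinary blocking function, run in a worker thread."""
    return x * x


async def main():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        # run_in_executor() submits the call and returns an awaitable future.
        via_loop = await loop.run_in_executor(pool, blocking_square, 3)

        # wrap_future() adapts an existing concurrent.futures.Future.
        cf_future = pool.submit(blocking_square, 4)
        via_wrap = await asyncio.wrap_future(cf_future)
    return via_loop, via_wrap


if __name__ == "__main__":
    print(asyncio.run(main()))
```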
> I know that I went far beyond the scope of this PEP, but I think that these are issues which will pop up sooner or later after its acceptance.
You're exactly right: your ideas are very nice and sound, but they are outside the scope of PEP 492. One step at a time. First, we do the syntax changes and integrate only the most important functions and builtins. Later, we get feedback and integrate other important features into the standard library (for instance, I liked your idea about the asyncloop module; I think it should be prototyped and put on PyPI if the PEP is accepted).

Thanks a lot!

Yury
participants (2)
- Piotr Jurkiewicz
- Yury Selivanov