PEP: asynchronous generators
Hi,

I have been working on a PEP to add asynchronous generators to Python. The PEP is now ready for a review. It would be great to hear some initial feedback from async-sig, before I post it to python-ideas.

I have a complete and working reference implementation of everything that the PEP proposes here: https://github.com/1st1/cpython/tree/async_gen

PEP: XXX
Title: Asynchronous Generators
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury@magic.io>
Discussions-To: <python-dev@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jul-2016
Python-Version: 3.6
Post-History:

Abstract
========

PEP 492 introduced support for native coroutines and ``async``/``await`` syntax to Python 3.5. It is proposed here to extend Python's asynchronous capabilities by adding support for *asynchronous generators*.

Rationale and Goals
===================

Regular generators (introduced in PEP 255) enabled an elegant way of writing complex *data producers* and have them behave like an iterator.

However, currently there is no equivalent concept for the *asynchronous iteration protocol* (``async for``). This makes writing asynchronous data producers unnecessarily complex, as one must define a class that implements ``__aiter__`` to be able to use it in an ``async for`` statement.

Essentially, the goals and rationale for PEP 255, applied to the asynchronous execution case, hold true for this proposal as well.

Performance is an additional point for this proposal: in our testing of the reference implementation, asynchronous generators are *2x* faster than an equivalent implemented as an asynchronous iterator.

As an illustration of the code quality improvement, consider the following class that prints numbers with a given delay once iterated::

    class ticker:
        """Print numbers from 0 to `to` every `delay` seconds."""

        def __init__(self, delay, to):
            self.delay = delay
            self.i = 0
            self.to = to

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= self.to:
                raise StopAsyncIteration
            self.i += 1
            if i:
                await asyncio.sleep(self.delay)
            return i

The same can be implemented as a much simpler asynchronous generator::

    async def ticker(delay, to):
        """Print numbers from 0 to `to` every `delay` seconds."""
        i = 0
        while i < to:
            yield i
            i += 1
            await asyncio.sleep(delay)

Specification
=============

This proposal introduces the concept of *asynchronous generators* to Python.

This specification presumes knowledge of the implementation of generators and coroutines in Python (PEP 342, PEP 380 and PEP 492).

Asynchronous Generators
-----------------------

A Python *generator* is any function containing one or more ``yield`` expressions::

    def func():            # a function
        return

    def genfunc():         # a generator function
        yield

We propose to use the same approach to define *asynchronous generators*::

    async def coro():      # a coroutine function
        await smth()

    async def asyncgen():  # an asynchronous generator function
        await smth()
        yield val

The result of calling an *asynchronous generator function* is an *asynchronous generator object*, which implements the asynchronous iteration protocol defined in PEP 492.

It is a ``SyntaxError`` to have a non-empty ``return`` statement in an asynchronous generator.

Support for Asynchronous Iteration Protocol
-------------------------------------------

The protocol requires two special methods to be implemented:

1. An ``__aiter__`` method returning an *asynchronous iterator*.

2. An ``__anext__`` method returning an *awaitable* object, which uses ``StopIteration`` exception to "yield" values, and ``StopAsyncIteration`` exception to signal the end of the iteration.

Asynchronous generators define both of these methods::

    async def genfunc():
        yield 1
        yield 2

    gen = genfunc()

    assert gen.__aiter__() is gen

    assert await gen.__anext__() == 1
    assert await gen.__anext__() == 2

    with assertRaises(StopAsyncIteration):
        await gen.__anext__()

Finalization
------------

PEP 492 requires an event loop or a scheduler to run coroutines. Because asynchronous generators are meant to be used from coroutines, they also require an event loop to run and finalize them.

Asynchronous generators can have ``try..finally`` blocks, as well as ``async with``. It is important to provide a guarantee that, even when partially iterated and then garbage collected, generators can be safely finalized. For example::

    async def square_series(con, to):
        async with con.transaction():
            cursor = con.cursor(
                'SELECT generate_series(0, $1) AS i', to)
            async for row in cursor:
                yield row['i'] ** 2

    async for i in square_series(con, 100):
        if i == 100:
            break

The above code defines an asynchronous generator that uses ``async with`` to iterate over a database cursor in a transaction. The generator is then iterated over with ``async for``, which interrupts the iteration at some point.

The ``square_series()`` generator will then be garbage collected, and without a mechanism to asynchronously close the generator, the Python interpreter would not be able to do anything.

To solve this problem we propose to do the following:

1. Implement an ``aclose`` method on asynchronous generators returning a special *awaitable*. When awaited, it throws a ``GeneratorExit`` into the suspended generator and iterates over it until either a ``GeneratorExit`` or a ``StopAsyncIteration`` occurs.

   This is very similar to what the ``close()`` method does to regular Python generators, except that an event loop is required to execute ``aclose()``.

2. Raise a ``RuntimeError`` when an asynchronous generator executes a ``yield`` expression in its ``finally`` block (using ``await`` is fine, though)::

       async def gen():
           try:
               yield
           finally:
               yield                   # Cannot use 'yield'
               await asyncio.sleep(1)  # Can use 'await'

3. Add two new methods to the ``sys`` module: ``set_asyncgen_finalizer`` and ``get_asyncgen_finalizer``.

   The idea behind ``sys.set_asyncgen_finalizer`` is to allow event loops to handle generator finalization, so that the end user does not need to care about the finalization problem, and it just works.

   When an asynchronous generator is iterated for the first time, it stores a reference to the current finalizer. If there is none, a ``RuntimeError`` is raised. This provides a strong guarantee that every asynchronous generator object will always have a finalizer installed by the correct event loop.

   When an asynchronous generator is about to be garbage collected, it calls its cached finalizer. The assumption is that the finalizer will schedule an ``aclose()`` call with the loop that was active when the iteration started.

   For instance, here is how asyncio can be modified to allow safe finalization of asynchronous generators::

       # asyncio/base_events.py

       class BaseEventLoop:

           def run_forever(self):
               ...
               old_finalizer = sys.get_asyncgen_finalizer()
               sys.set_asyncgen_finalizer(self._finalize_asyncgen)
               try:
                   ...
               finally:
                   sys.set_asyncgen_finalizer(old_finalizer)
                   ...

           def _finalize_asyncgen(self, gen):
               self.create_task(gen.aclose())

``sys.set_asyncgen_finalizer`` is thread-specific, so several event loops running in parallel threads can use it safely.

Asynchronous Generator Object
-----------------------------

The object is modeled after the standard Python generator object. Essentially, the behaviour of asynchronous generators is designed to replicate the behaviour of synchronous generators, with the only difference being that the API is asynchronous.

The following methods and properties are defined:

1. ``agen.__aiter__()``: Returns ``agen``.

2. ``agen.__anext__()``: Returns an *awaitable* that performs one asynchronous generator iteration when awaited.

3. ``agen.anext(val)``: Returns an *awaitable* that pushes the ``val`` object into the ``agen`` generator. When ``agen`` has not yet been iterated, ``val`` must be ``None``. Example::

       async def gen():
           await asyncio.sleep(0.1)
           v = yield 42
           print(v)
           await asyncio.sleep(0.1)

       g = gen()

       await g.send(None)      # Will return 42
       await g.send('hello')   # Will print 'hello' and
                               # raise StopAsyncIteration
                               # (after sleeping for 0.1 seconds)

4. ``agen.athrow(typ, [val, [tb]])``: Returns an *awaitable* that throws an exception into the ``agen`` generator. Example::

       async def gen():
           try:
               await asyncio.sleep(0.1)
               yield 'hello'
           except ZeroDivisionError:
               await asyncio.sleep(0.2)
               yield 'world'

       g = gen()
       v = await g.asend(None)
       print(v)                # Will print 'hello' after sleeping 0.1s

       v = await g.athrow(ZeroDivisionError)
       print(v)                # Will print 'world' after sleeping 0.2s

5. ``agen.aclose()``: Returns an *awaitable* that throws a ``GeneratorExit`` exception into the generator. The *awaitable* can either return a yielded value, if ``agen`` handled the exception, or ``agen`` will be closed and the exception will propagate back to the caller.

6. ``agen.__name__`` and ``agen.__qualname__``: readable and writable name and qualified name attributes.

7. ``agen.ag_await``: The object that ``agen`` is currently awaiting on, or ``None``.

8. ``agen.ag_frame``, ``agen.ag_running``, and ``agen.ag_code``: defined in the same way as similar attributes of standard generators.

New Standard Library Functions and Types
----------------------------------------

1. ``types.AsyncGeneratorType`` -- type of asynchronous generator object.

2. ``sys.set_asyncgen_finalizer()`` and ``sys.get_asyncgen_finalizer()`` methods to set up asynchronous generator finalizers in event loops.

3. ``inspect.isasyncgen()`` and ``inspect.isasyncgenfunction()`` introspection functions.

Backwards Compatibility
-----------------------

The proposal is fully backwards compatible.

In Python 3.5 it is a ``SyntaxError`` to define an ``async def`` function with a ``yield`` expression inside, therefore it's safe to introduce asynchronous generators in 3.6.

Performance
===========

Regular Generators
------------------

There is no performance degradation for regular generators. The following micro-benchmark runs at the same speed on CPython with and without asynchronous generators::

    def gen():
        i = 0
        while i < 100000000:
            yield i
            i += 1

    list(gen())

Improvements over asynchronous iterators
----------------------------------------

The following micro-benchmark shows that asynchronous generators are about **2x faster** than asynchronous iterators implemented in pure Python::

    async def agen():
        i = 0
        while i < N:
            yield i
            i += 1

    class AIter:
        def __init__(self):
            self.i = 0

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= N:
                raise StopAsyncIteration
            self.i += 1
            return i

Design Considerations
=====================

``aiter()`` and ``anext()`` builtins
------------------------------------

Originally PEP 492 defined ``__aiter__`` as a method that should return an *awaitable* object, resulting in an asynchronous iterator.

However, in CPython 3.5.2, ``__aiter__`` was redefined to return asynchronous iterators directly. To avoid breaking backwards compatibility, it was decided that Python 3.6 will support both ways: ``__aiter__`` can still return an *awaitable* with a ``DeprecationWarning`` being issued.

Because of this dual nature of ``__aiter__`` in Python 3.6, we cannot add a synchronous implementation of an ``aiter()`` built-in. Therefore, it is proposed to wait until Python 3.7.

Asynchronous list/dict/set comprehensions
-----------------------------------------

Syntax for asynchronous comprehensions is unrelated to the asynchronous generators machinery, and should be considered in a separate PEP.

Asynchronous ``yield from``
---------------------------

While it is theoretically possible to implement ``yield from`` support for asynchronous generators, it would require a serious redesign of the generator implementation.

``yield from`` is also less critical for asynchronous generators, since there is no need to provide a mechanism for implementing another coroutine protocol on top of coroutines. To compose asynchronous generators, a simple ``async for`` loop can be used::

    async def g1():
        yield 1
        yield 2

    async def g2():
        async for v in g1():
            yield v

Why the ``asend`` and ``athrow`` methods are necessary
------------------------------------------------------

They make it possible to implement concepts similar to ``contextlib.contextmanager`` using asynchronous generators. For instance, with the proposed design, it is possible to implement the following pattern::

    @async_context_manager
    async def ctx():
        await open()
        try:
            yield
        finally:
            await close()

    async with ctx():
        await ...

Another reason is that it is possible to push data and throw exceptions into asynchronous generators using the object returned from ``__anext__``, but it is hard to do that correctly. Adding explicit ``asend`` and ``athrow`` methods paves a safe way to accomplish that.

Example
=======

A working example with the current reference implementation (will print numbers from 0 to 9 with a one-second delay)::

    async def ticker(delay, to):
        i = 0
        while i < to:
            yield i
            i += 1
            await asyncio.sleep(delay)

    async def run():
        async for i in ticker(1, 10):
            print(i)

    import asyncio

    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(run())
    finally:
        loop.close()

Implementation
==============

The complete reference implementation is available at [1]_.

References
==========

.. [1] https://github.com/1st1/cpython/tree/async_gen

Copyright
=========

This document has been placed in the public domain.

Thank you!
Yury
On Fri, 29 Jul 2016 at 09:18 Yury Selivanov <yselivanov@gmail.com> wrote:
Hi,
I have been working on a PEP to add asynchronous generators to Python. The PEP is now ready for a review.
Woohoo!
It would be great to hear some initial feedback from async-sig, before I post it to python-ideas.
I have a complete and working reference implementation of everything that PEP proposes here:
https://github.com/1st1/cpython/tree/async_gen
PEP: XXX
Title: Asynchronous Generators
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury@magic.io>
Discussions-To: <python-dev@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jul-2016
Python-Version: 3.6
Post-History:
Abstract
========
PEP 492 introduced support for native coroutines and ``async``/``await`` syntax to Python 3.5. It is proposed here to extend Python's asynchronous capabilities by adding support for *asynchronous generators*.
Rationale and Goals
===================
Regular generators (introduced in PEP 255) enabled an elegant way of writing complex *data producers* and have them behave like an iterator.
However, currently there is no equivalent concept for the *asynchronous iteration protocol* (``async for``). This makes writing asynchronous data producers unnecessarily complex, as one must define a class that implements ``__aiter__`` to be able to use it in an ``async for`` statement.
Essentially, the goals and rationale for PEP 255, applied to the asynchronous execution case, hold true for this proposal as well.
Performance is an additional point for this proposal: in our testing of the reference implementation, asynchronous generators are *2x* faster than an equivalent implemented as an asynchronous iterator.
Another motivation is that types.coroutine becomes purely a backwards-compatibility/low-level thing that is no longer required for event loop frameworks to use on their generators which provide their async API. So now an async framework can be written entirely in terms of def and async def and not def and @types.coroutine.
As an illustration of the code quality improvement, consider the following class that prints numbers with a given delay once iterated::
    class ticker:
        """Print numbers from 0 to `to` every `delay` seconds."""

        def __init__(self, delay, to):
            self.delay = delay
            self.i = 0
            self.to = to

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= self.to:
                raise StopAsyncIteration
            self.i += 1
            if i:
                await asyncio.sleep(self.delay)
            return i
The same can be implemented as a much simpler asynchronous generator::
    async def ticker(delay, to):
        """Print numbers from 0 to `to` every `delay` seconds."""
        i = 0
        while i < to:
            yield i
            i += 1
            await asyncio.sleep(delay)
Specification
=============
This proposal introduces the concept of *asynchronous generators* to Python.
This specification presumes knowledge of the implementation of generators and coroutines in Python (PEP 342, PEP 380 and PEP 492).
Asynchronous Generators
-----------------------
A Python *generator* is any function containing one or more ``yield`` expressions::
    def func():            # a function
        return

    def genfunc():         # a generator function
        yield
We propose to use the same approach to define *asynchronous generators*::
    async def coro():      # a coroutine function
        await smth()

    async def asyncgen():  # an asynchronous generator function
        await smth()
        yield val
The result of calling an *asynchronous generator function* is an *asynchronous generator object*, which implements the asynchronous iteration protocol defined in PEP 492.
It is a ``SyntaxError`` to have a non-empty ``return`` statement in an asynchronous generator.
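For example, under this proposal the following does not compile (a brief illustration; the exact error message is an assumption, not part of the specification)::

    async def bad():
        yield 42
        return 42    # SyntaxError: non-empty 'return' inside an
                     # asynchronous generator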
Support for Asynchronous Iteration Protocol
-------------------------------------------
The protocol requires two special methods to be implemented:
1. An ``__aiter__`` method returning an *asynchronous iterator*.

2. An ``__anext__`` method returning an *awaitable* object, which uses ``StopIteration`` exception to "yield" values, and ``StopAsyncIteration`` exception to signal the end of the iteration.
I assume this is in place of __iter__() and __next__(), i.e. an async generator won't have both defined?
Asynchronous generators define both of these methods::
    async def genfunc():
        yield 1
        yield 2

    gen = genfunc()

    assert gen.__aiter__() is gen

    assert await gen.__anext__() == 1
    assert await gen.__anext__() == 2

    with assertRaises(StopAsyncIteration):
        await gen.__anext__()
Finalization
------------
PEP 492 requires an event loop or a scheduler to run coroutines. Because asynchronous generators are meant to be used from coroutines, they also require an event loop to run and finalize them.
Asynchronous generators can have ``try..finally`` blocks, as well as ``async with``. It is important to provide a guarantee that, even when partially iterated, and then garbage collected, generators can be safely finalized. For example::
    async def square_series(con, to):
        async with con.transaction():
            cursor = con.cursor(
                'SELECT generate_series(0, $1) AS i', to)
            async for row in cursor:
                yield row['i'] ** 2

    async for i in square_series(con, 100):
        if i == 100:
            break
The above code defines an asynchronous generator that uses ``async with`` to iterate over a database cursor in a transaction. The generator is then iterated over with ``async for``, which interrupts the iteration at some point.
The ``square_series()`` generator will then be garbage collected, and without a mechanism to asynchronously close the generator, the Python interpreter would not be able to do anything.
To solve this problem we propose to do the following:
1. Implement an ``aclose`` method on asynchronous generators returning a special *awaitable*. When awaited, it throws a ``GeneratorExit`` into the suspended generator and iterates over it until either a ``GeneratorExit`` or a ``StopAsyncIteration`` occurs.
This is very similar to what the ``close()`` method does to regular Python generators, except that an event loop is required to execute ``aclose()``.
I'm going to ask this now instead of later when there's more motivation behind this question: do we need to append "a" to every async method we have? If asynchronous generators won't have a normal close() then why can't it just be close(), especially if people are not going to be calling it directly and instead it will be event loops? I'm just leery of codifying this practice of prepending "a" to every async method or function and ending up in a future where I get tired of a specific letter of the alphabet.
2. Raise a ``RuntimeError`` when an asynchronous generator executes a ``yield`` expression in its ``finally`` block (using ``await`` is fine, though)::
       async def gen():
           try:
               yield
           finally:
               yield                   # Cannot use 'yield'
               await asyncio.sleep(1)  # Can use 'await'
3. Add two new methods to the ``sys`` module: ``set_asyncgen_finalizer`` and ``get_asyncgen_finalizer``.
The idea behind ``sys.set_asyncgen_finalizer`` is to allow event loops to handle generator finalization, so that the end user does not need to care about the finalization problem, and it just works.
When an asynchronous generator is iterated for the first time, it stores a reference to the current finalizer. If there is none, a ``RuntimeError`` is raised. This provides a strong guarantee that every asynchronous generator object will always have a finalizer installed by the correct event loop.
When an asynchronous generator is about to be garbage collected, it calls its cached finalizer. The assumption is that the finalizer will schedule an ``aclose()`` call with the loop that was active when the iteration started.
For instance, here is how asyncio can be modified to allow safe finalization of asynchronous generators::
    # asyncio/base_events.py

    class BaseEventLoop:

        def run_forever(self):
            ...
            old_finalizer = sys.get_asyncgen_finalizer()
            sys.set_asyncgen_finalizer(self._finalize_asyncgen)
            try:
                ...
            finally:
                sys.set_asyncgen_finalizer(old_finalizer)
                ...

        def _finalize_asyncgen(self, gen):
            self.create_task(gen.aclose())
``sys.set_asyncgen_finalizer`` is thread-specific, so several event loops running in parallel threads can use it safely.
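As a usage note, code that abandons an ``async for`` loop early can also finalize the generator explicitly instead of relying on the installed finalizer; a minimal sketch, reusing the ``ticker`` generator from the Rationale section::

    async def consume():
        gen = ticker(1, 100)
        async for i in gen:
            if i == 10:
                break
        # Run any try/finally or async-with cleanup inside the
        # generator now, instead of waiting for garbage collection.
        await gen.aclose()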
Asynchronous Generator Object
-----------------------------
The object is modeled after the standard Python generator object. Essentially, the behaviour of asynchronous generators is designed to replicate the behaviour of synchronous generators, with the only difference being that the API is asynchronous.
The following methods and properties are defined:
1. ``agen.__aiter__()``: Returns ``agen``.
2. ``agen.__anext__()``: Returns an *awaitable* that performs one asynchronous generator iteration when awaited.
3. ``agen.anext(val)``: Returns an *awaitable* that pushes the ``val`` object into the ``agen`` generator. When ``agen`` has not yet been iterated, ``val`` must be ``None``.
How is this different from an async send()? I see an asend() used in the example below but not anext().
Example::
    async def gen():
        await asyncio.sleep(0.1)
        v = yield 42
        print(v)
        await asyncio.sleep(0.1)

    g = gen()

    await g.send(None)      # Will return 42
    await g.send('hello')   # Will print 'hello' and
                            # raise StopAsyncIteration
                            # (after sleeping for 0.1 seconds)
4. ``agen.athrow(typ, [val, [tb]])``: Returns an *awaitable* that throws an exception into the ``agen`` generator.
Example::
    async def gen():
        try:
            await asyncio.sleep(0.1)
            yield 'hello'
        except ZeroDivisionError:
            await asyncio.sleep(0.2)
            yield 'world'

    g = gen()
    v = await g.asend(None)
    print(v)                # Will print 'hello' after sleeping 0.1s

    v = await g.athrow(ZeroDivisionError)
    print(v)                # Will print 'world' after sleeping 0.2s
5. ``agen.aclose()``: Returns an *awaitable* that throws a ``GeneratorExit`` exception into the generator. The *awaitable* can either return a yielded value, if ``agen`` handled the exception, or ``agen`` will be closed and the exception will propagate back to the caller.
6. ``agen.__name__`` and ``agen.__qualname__``: readable and writable name and qualified name attributes.
7. ``agen.ag_await``: The object that ``agen`` is currently awaiting on, or ``None``.
That's an interesting addition. I like it! There's no equivalent on normal generators, correct?
8. ``agen.ag_frame``, ``agen.ag_running``, and ``agen.ag_code``: defined in the same way as similar attributes of standard generators.
New Standard Library Functions and Types
----------------------------------------
1. ``types.AsyncGeneratorType`` -- type of asynchronous generator object.
2. ``sys.set_asyncgen_finalizer()`` and ``sys.get_asyncgen_finalizer()`` methods to set up asynchronous generator finalizers in event loops.
3. ``inspect.isasyncgen()`` and ``inspect.isasyncgenfunction()`` introspection functions.
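For illustration, here is a small sketch of how these names could be used together, assuming they land in 3.6 exactly as proposed::

    import inspect
    import types

    async def squares(n):
        for i in range(n):
            yield i ** 2

    assert inspect.isasyncgenfunction(squares)

    g = squares(3)
    assert inspect.isasyncgen(g)
    assert isinstance(g, types.AsyncGeneratorType)
    assert not g.ag_running    # not started yet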
Backwards Compatibility
-----------------------
The proposal is fully backwards compatible.
In Python 3.5 it is a ``SyntaxError`` to define an ``async def`` function with a ``yield`` expression inside, therefore it's safe to introduce asynchronous generators in 3.6.
Performance
===========

Regular Generators
------------------
There is no performance degradation for regular generators. The following micro-benchmark runs at the same speed on CPython with and without asynchronous generators::
    def gen():
        i = 0
        while i < 100000000:
            yield i
            i += 1

    list(gen())
Improvements over asynchronous iterators
----------------------------------------
The following micro-benchmark shows that asynchronous generators are about **2x faster** than asynchronous iterators implemented in pure Python::
    async def agen():
        i = 0
        while i < N:
            yield i
            i += 1

    class AIter:
        def __init__(self):
            self.i = 0

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= N:
                raise StopAsyncIteration
            self.i += 1
            return i
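The benchmark code above is a fragment; a hypothetical driver for it (the value of ``N``, the use of asyncio, and the timing approach are illustrative assumptions rather than the PEP's actual benchmark setup) might look like::

    import asyncio
    import time

    N = 10 ** 7

    async def consume(ait):
        # Exhaust either kind of async iterable.
        async for _ in ait:
            pass

    loop = asyncio.get_event_loop()
    for name, factory in (('async generator', agen), ('async iterator', AIter)):
        started = time.monotonic()
        loop.run_until_complete(consume(factory()))
        print(name, time.monotonic() - started)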
Design Considerations
=====================

``aiter()`` and ``anext()`` builtins
------------------------------------
Originally PEP 492 defined ``__aiter__`` as a method that should return an *awaitable* object, resulting in an asynchronous iterator.
However, in CPython 3.5.2, ``__aiter__`` was redefined to return asynchronous iterators directly. To avoid breaking backwards compatibility, it was decided that Python 3.6 will support both ways: ``__aiter__`` can still return an *awaitable* with a ``DeprecationWarning`` being issued.
Because of this dual nature of ``__aiter__`` in Python 3.6, we cannot add a synchronous implementation of an ``aiter()`` built-in. Therefore, it is proposed to wait until Python 3.7.
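In the meantime, pure-Python stand-ins are straightforward for code that targets the 3.5.2 behaviour; a rough sketch (these names are illustrative only and are not being proposed here)::

    def aiter(obj):
        # Assumes __aiter__ returns an asynchronous iterator directly
        # (the CPython 3.5.2 behaviour described above).
        return obj.__aiter__()

    def anext(ait):
        # Returns the awaitable produced by __anext__;
        # the caller awaits it.
        return ait.__anext__()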
Asynchronous list/dict/set comprehensions
-----------------------------------------
Syntax for asynchronous comprehensions is unrelated to the asynchronous generators machinery, and should be considered in a separate PEP.
Asynchronous ``yield from``
---------------------------
While it is theoretically possible to implement ``yield from`` support for asynchronous generators, it would require a serious redesign of the generator implementation.
``yield from`` is also less critical for asynchronous generators, since there is no need to provide a mechanism for implementing another coroutine protocol on top of coroutines. To compose asynchronous generators, a simple ``async for`` loop can be used::
    async def g1():
        yield 1
        yield 2

    async def g2():
        async for v in g1():
            yield v
Why the ``asend`` and ``athrow`` methods are necessary
------------------------------------------------------
They make it possible to implement concepts similar to ``contextlib.contextmanager`` using asynchronous generators. For instance, with the proposed design, it is possible to implement the following pattern::
    @async_context_manager
    async def ctx():
        await open()
        try:
            yield
        finally:
            await close()

    async with ctx():
        await ...
Another reason is that it is possible to push data and throw exceptions into asynchronous generators using the object returned from ``__anext__``, but it is hard to do that correctly. Adding explicit ``asend`` and ``athrow`` methods paves a safe way to accomplish that.
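To make the motivation concrete, here is a rough sketch of how such an ``async_context_manager`` decorator could be built on top of the proposed ``asend()``/``athrow()`` API (a simplified analogue of ``contextlib.contextmanager``; this helper is not itself part of the proposal)::

    import functools

    class _AsyncGenContextManager:

        def __init__(self, agen):
            self._agen = agen

        async def __aenter__(self):
            # Run the generator up to its first 'yield'.
            return await self._agen.asend(None)

        async def __aexit__(self, typ, val, tb):
            if typ is None:
                try:
                    await self._agen.asend(None)
                except StopAsyncIteration:
                    return False
                raise RuntimeError("async generator didn't stop")
            try:
                # Deliver the exception at the 'yield' point.
                await self._agen.athrow(typ, val, tb)
            except StopAsyncIteration:
                return True       # generator handled the exception
            except BaseException as exc:
                if exc is val:
                    return False  # same exception propagated; don't suppress
                raise
            raise RuntimeError("async generator didn't stop after athrow()")

    def async_context_manager(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return _AsyncGenContextManager(func(*args, **kwargs))
        return wrapper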
Example
=======
A working example with the current reference implementation (will print numbers from 0 to 9 with a one-second delay)::
    async def ticker(delay, to):
        i = 0
        while i < to:
            yield i
            i += 1
            await asyncio.sleep(delay)

    async def run():
        async for i in ticker(1, 10):
            print(i)

    import asyncio

    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(run())
    finally:
        loop.close()
Implementation
==============
The complete reference implementation is available at [1]_.
References
==========
.. [1] https://github.com/1st1/cpython/tree/async_gen
Copyright
=========
This document has been placed in the public domain.
Thank you! Yury
Thanks a lot for the feedback, Brett! Comments inlined below:
On Jul 29, 2016, at 1:25 PM, Brett Cannon <brett@python.org> wrote:
[..]
Performance is an additional point for this proposal: in our testing of the reference implementation, asynchronous generators are *2x* faster than an equivalent implemented as an asynchronous iterator.
Another motivation is that types.coroutine becomes purely a backwards-compatibility/low-level thing that is no longer required for event loop frameworks to use on their generators which provide their async API. So now an async framework can be written entirely in terms of def and async def and not def and @types.coroutine.
A slight misunderstanding here: @types.coroutine turns a normal old-style generator into an awaitable object. That isn’t directly related to asynchronous iteration. Frameworks like curio will continue using @types.coroutine to implement future-like objects in event loops. [..]
Support for Asynchronous Iteration Protocol
-------------------------------------------
The protocol requires two special methods to be implemented:
1. An ``__aiter__`` method returning an *asynchronous iterator*.

2. An ``__anext__`` method returning an *awaitable* object, which uses ``StopIteration`` exception to "yield" values, and ``StopAsyncIteration`` exception to signal the end of the iteration.
I assume this is in place of __iter__() and __next__(), i.e. an async generator won't have both defined?
Correct, async generators don’t have __iter__ and __next__. They can only be iterated with an `async for`. [..]
To solve this problem we propose to do the following:
1. Implement an ``aclose`` method on asynchronous generators returning a special *awaitable*. When awaited, it throws a ``GeneratorExit`` into the suspended generator and iterates over it until either a ``GeneratorExit`` or a ``StopAsyncIteration`` occurs.
This is very similar to what the ``close()`` method does to regular Python generators, except that an event loop is required to execute ``aclose()``.
I'm going to ask this now instead of later when there's more motivation behind this question: do we need to append "a" to every async method we have? If asynchronous generators won't have a normal close() then why can't it just be close(), especially if people are not going to be calling it directly and instead it will be event loops? I'm just leery of codifying this practice of prepending "a" to every async method or function and ending up in a future where I get tired of a specific letter of the alphabet.
I decided to use the prefix because we already use it in magic method names: __anext__ and __aiter__. I think it also makes it easier to understand the API of async generators (and understand how it’s different from sync generators API). And while it’s entirely possible to drop the ‘a’ for async generator API, it’s not so simple for other cases. Later, for Python 3.7, we might consider adding ‘aiter()’ and ‘anext()’ builtins, for which we’d have to use ‘a’ or ‘async’ prefix (we can’t reuse 'iter()' and 'next()’ for async generators). [..]
3. ``agen.anext(val)``: Returns an *awaitable* that pushes the ``val`` object into the ``agen`` generator. When ``agen`` has not yet been iterated, ``val`` must be ``None``.
How is this different from an async send()? I see an asend() used in the example below but not anext().
Good catch! This is a typo; it should read ``agen.asend(val)``. Similarly to sync generators, where a ‘__next__()’ call is equivalent to ‘.send(None)’, the ‘__anext__()’ awaitable is equivalent to ‘.asend(None)’.
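In code, the stated equivalence looks like this (a hypothetical sketch)::

    async def agen():
        yield 1
        yield 2

    async def demo():
        g = agen()
        first = await g.asend(None)    # equivalent to: await g.__anext__()
        second = await g.__anext__()   # equivalent to: await g.asend(None)

[..]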
7. ``agen.ag_await``: The object that ``agen`` is currently awaiting on, or ``None``.
That's an interesting addition. I like it! There's no equivalent on normal generators, correct?
We actually added that in 3.5 (last minute!). For sync generators the field is called ‘.gi_yieldfrom’, for coroutines it’s ‘.cr_await’, and for proposed async generators it will be ‘.ag_await’. Thanks! Yury
On Fri, 29 Jul 2016 at 10:46 Yury Selivanov <yselivanov@gmail.com> wrote:
Thanks a lot for the feedback, Brett! Comments inlined below:
On Jul 29, 2016, at 1:25 PM, Brett Cannon <brett@python.org> wrote:
[..]
Performance is an additional point for this proposal: in our testing of the reference implementation, asynchronous generators are *2x* faster than an equivalent implemented as an asynchronous iterator.
Another motivation is that types.coroutine becomes purely a
backwards-compatibility/low-level thing that is no longer required for event loop frameworks to use on their generators which provide their async API. So now an async framework can be written entirely in terms of def and async def and not def and @types.coroutine.
A slight misunderstanding here: @types.coroutine turns a normal old-style generator into an awaitable object. That isn’t directly related to asynchronous iteration. Frameworks like curio will continue using @types.coroutine to implement future-like objects in event loops.
Ah, OK. So that would be yet another PEP to make that kind of change (and another keyword).
[..]
Support for Asynchronous Iteration Protocol
-------------------------------------------
The protocol requires two special methods to be implemented:
1. An ``__aiter__`` method returning an *asynchronous iterator*.

2. An ``__anext__`` method returning an *awaitable* object, which uses ``StopIteration`` exception to "yield" values, and ``StopAsyncIteration`` exception to signal the end of the iteration.
I assume this is in place of __iter__() and __next__(), i.e. an async
generator won't have both defined?
Correct, async generators don’t have __iter__ and __next__. They can only be iterated with an `async for`.
[..]
To solve this problem we propose to do the following:
1. Implement an ``aclose`` method on asynchronous generators returning a special *awaitable*. When awaited, it throws a ``GeneratorExit`` into the suspended generator and iterates over it until either a ``GeneratorExit`` or a ``StopAsyncIteration`` occurs.
This is very similar to what the ``close()`` method does to regular Python generators, except that an event loop is required to execute ``aclose()``.
I'm going to ask this now instead of later when there's more motivation
behind this question: do we need to append "a" to every async method we have? If asynchronous generators won't have a normal close() then why can't it just be close(), especially if people are not going to be calling it directly and instead it will be event loops? I'm just leery of codifying this practice of prepending "a" to every async method or function and ending up in a future where I get tired of a specific letter of the alphabet.
I decided to use the prefix because we already use it in magic method names: __anext__ and __aiter__. I think it also makes it easier to understand the API of async generators (and understand how it’s different from sync generators API).
And while it’s entirely possible to drop the ‘a’ for async generator API, it’s not so simple for other cases. Later, for Python 3.7, we might consider adding ‘aiter()’ and ‘anext()’ builtins, for which we’d have to use ‘a’ or ‘async’ prefix (we can’t reuse 'iter()' and 'next()’ for async generators).
I guess we just need to decide as a group that an 'a' prefix is what we want to signify something is asynchronous vs some other prefix like 'a_' or 'async', or 'async_' as people will follow this style choice in their own code going forward.
[..]
3. ``agen.anext(val)``: Returns an *awaitable* that pushes the ``val`` object into the ``agen`` generator. When ``agen`` has not yet been iterated, ``val`` must be ``None``.
How is this different from an async send()? I see an asend() used in the
example below but not anext().
Good catch! This is a typo; it should read ``agen.asend(val)``.

Similarly to sync generators, where a ‘__next__()’ call is equivalent to ‘.send(None)’, the ‘__anext__()’ awaitable is equivalent to ‘.asend(None)’.
[..]
7. ``agen.ag_await``: The object that ``agen`` is currently awaiting on, or ``None``.
That's an interesting addition. I like it! There's no equivalent on
normal generators, correct?
We actually added that in 3.5 (last minute!).
For sync generators the field is called ‘.gi_yieldfrom’, for coroutines it’s ‘.cr_await’, and for proposed async generators it will be ‘.ag_await’.
I would also clarify that "waiting on" means "what `await` has been called on (if anything, as an `await` call might not have been used)" and not what the last yielded object happened to be (which is what my brain initially thought it was, simply because the async generator is paused on the event loop returning based on what was yielded).
Thanks! Yury
Few comments below:
On Jul 29, 2016, at 1:57 PM, Brett Cannon <brett@python.org> wrote:
On Fri, 29 Jul 2016 at 10:46 Yury Selivanov <yselivanov@gmail.com> wrote: Thanks a lot for the feedback, Brett! Comments inlined below:
On Jul 29, 2016, at 1:25 PM, Brett Cannon <brett@python.org> wrote:
[..]
Performance is an additional point for this proposal: in our testing of the reference implementation, asynchronous generators are *2x* faster than an equivalent implemented as an asynchronous iterator.
Another motivation is that types.coroutine becomes purely a backwards-compatibility/low-level thing that is no longer required for event loop frameworks to use on their generators which provide their async API. So now an async framework can be written entirely in terms of def and async def and not def and @types.coroutine.
A slight misunderstanding here: @types.coroutine turns a normal old-style generator into an awaitable object. That isn’t directly related to asynchronous iteration. Frameworks like curio will continue using @types.coroutine to implement future-like objects in event loops.
Ah, OK. So that would be yet another PEP to make that kind of change (and another keyword).
TBH I’m not really sure we need that. To separate coroutines from generators completely, you don’t just need another keyword — you basically need to have a separate parallel implementation of the whole iteration protocol. I think @types.coroutine is a nice glue between two worlds, allowing us to reuse the code efficiently. Again, just my 2 cents.
I'm going to ask this now instead of later when there's more motivation behind this question: do we need to append "a" to every async method we have? If asynchronous generators won't have a normal close() then why can't it just be close(), especially if people are not going to be calling it directly and instead it will be event loops? I'm just leery of codifying this practice of prepending "a" to every async method or function and ending up in a future where I get tired of a specific letter of the alphabet.
I decided to use the prefix because we already use it in magic method names: __anext__ and __aiter__. I think it also makes it easier to understand the API of async generators (and understand how it’s different from sync generators API).
And while it’s entirely possible to drop the ‘a’ for async generator API, it’s not so simple for other cases. Later, for Python 3.7, we might consider adding ‘aiter()’ and ‘anext()’ builtins, for which we’d have to use ‘a’ or ‘async’ prefix (we can’t reuse 'iter()' and 'next()’ for async generators).
I guess we just need to decide as a group that an 'a' prefix is what we want to signify something is asynchronous vs some other prefix like 'a_' or 'async', or 'async_' as people will follow this style choice in their own code going forward.
I’m open to having this discussion. I don’t have a strong preference here; I, personally, like the ‘a’ prefix slightly better, because it’s consistent with __a*__ methods and easy to type. [..]
We actually added that in 3.5 (last minute!).
For sync generators the field is called ‘.gi_yieldfrom’, for coroutines it’s ‘.cr_await’, and for proposed async generators it will be ‘.ag_await’.
I would also clarify that "waiting on" means "what `await` has been called on (if anything, as an `await` call might not have been used)" and not what the last yielded object happened to be (which is what my brain initially thought it was, simply because the async generator is paused on the event loop returning based on what was yielded).
OK, I’ll try to clarify this! Yury
On Friday, July 29, 2016, Brett Cannon <brett@python.org> wrote:
On Fri, 29 Jul 2016 at 10:46 Yury Selivanov <yselivanov@gmail.com> wrote:
Thanks a lot for the feedback, Brett! Comments inlined below:
On Jul 29, 2016, at 1:25 PM, Brett Cannon <brett@python.org> wrote:
[..]
Performance is an additional point for this proposal: in our testing of the reference implementation, asynchronous generators are *2x* faster than an equivalent implemented as an asynchronous iterator.
Another motivation is that types.coroutine becomes purely a
backwards-compatibility/low-level thing that is no longer required for event loop frameworks to use on their generators which provide their async API. So now an async framework can be written entirely in terms of def and async def and not def and @types.coroutine.
A slight misunderstanding here: @types.coroutine turns a normal old-style generator into an awaitable object. That isn’t directly related to asynchronous iteration. Frameworks like curio will continue using @types.coroutine to implement future-like objects in event loops.
Ah, OK. So that would be yet another PEP to make that kind of change (and another keyword).
[..]
Support for Asynchronous Iteration Protocol
-------------------------------------------
The protocol requires two special methods to be implemented:
1. An ``__aiter__`` method returning an *asynchronous iterator*.

2. An ``__anext__`` method returning an *awaitable* object, which uses ``StopIteration`` exception to "yield" values, and ``StopAsyncIteration`` exception to signal the end of the iteration.
I assume this is in place of __iter__() and __next__(), i.e. an async
generator won't have both defined?
Correct, async generators don’t have __iter__ and __next__. They can only be iterated with an `async for`.
[..]
To solve this problem we propose to do the following:
1. Implement an ``aclose`` method on asynchronous generators returning a special *awaitable*. When awaited, it throws a ``GeneratorExit`` into the suspended generator and iterates over it until either a ``GeneratorExit`` or a ``StopAsyncIteration`` occurs.
This is very similar to what the ``close()`` method does to regular Python generators, except that an event loop is required to execute ``aclose()``.
I'm going to ask this now instead of later when there's more motivation
behind this question: do we need to append "a" to every async method we have? If asynchronous generators won't have a normal close() then why can't it just be close(), especially if people are not going to be calling it directly and instead it will be event loops? I'm just leery of codifying this practice of prepending "a" to every async method or function and ending up in a future where I get tired of a specific letter of the alphabet.
I decided to use the prefix because we already use it in magic method names: __anext__ and __aiter__. I think it also makes it easier to understand the API of async generators (and understand how it’s different from sync generators API).
And while it’s entirely possible to drop the ‘a’ for async generator API, it’s not so simple for other cases. Later, for Python 3.7, we might consider adding ‘aiter()’ and ‘anext()’ builtins, for which we’d have to use ‘a’ or ‘async’ prefix (we can’t reuse 'iter()' and 'next()’ for async generators).
I guess we just need to decide as a group that an 'a' prefix is what we want to signify something is asynchronous vs some other prefix like 'a_' or 'async', or 'async_' as people will follow this style choice in their own code going forward.
Hmm... I think we need to think about a future where, programmatically, there's little to no distinction between async and synchronous functions. Pushing this down deeper in the system is the way to go. For one, it will serve simple multi-core use once gilectomy is completed (it, or something effectively equivalent, will complete). For another, this is the path to reducing the functionally "useless" rewrite efforts of libraries (e.g. github.com/aio-libs), which somehow resemble all the efforts of migrating libraries from 2 to 3 (loosely). The resistance and unexpected time that the 2-to-3 migration experienced won't readily be mimicked in async tasks - too much effort to get compute- and I/O-bound benefits? Maintaining two versions of needed libraries, or jumping languages, is what will increasingly happen in the distributed (and more so IoT) world.

Time to think about paving the way to a world where async is a first-class citizen.

That's probably too much for this PEP, but the topic (a- prefixing) is a good canary for the bigger picture we need to start mulling over.

So in this context (and in general w/ async), asking the question "can we make it so it doesn't matter?" is a good one to always be asking - it will get us there.

- Yarko
[..]
3. ``agen.anext(val)``: Returns an *awaitable* that pushes the ``val`` object into the ``agen`` generator. When ``agen`` has not yet been iterated, ``val`` must be ``None``.
How is this different from an async send()? I see an asend() used in
the example below but not anext().
Good catch! This is a typo; it should read ``agen.asend(val)``.

Similarly to sync generators, where a ‘__next__()’ call is equivalent to ‘.send(None)’, the ‘__anext__()’ awaitable is equivalent to ‘.asend(None)’.
[..]
7. ``agen.ag_await``: The object that ``agen`` is currently awaiting on, or ``None``.
That's an interesting addition. I like it! There's no equivalent on
normal generators, correct?
We actually added that in 3.5 (last minute!).
For sync generators the field is called ‘.gi_yieldfrom’, for coroutines it’s ‘.cr_await’, and for proposed async generators it will be ‘.ag_await’.
I would also clarify that "waiting on" means "what `await` has been called on (if anything, as an `await` call might not have been used)" and not what the last yielded object happened to be (which is what my brain initially thought it was, simply because the async generator is paused on the event loop returning based on what was yielded).
Thanks! Yury
Comments inlined:
On Jul 29, 2016, at 2:20 PM, Yarko Tymciurak <yarkot1@gmail.com> wrote:
Hmm... I think we need to think about a future where, programmatically, there's little to no distinction between async and synchronous functions. Pushing this down deeper in the system is the way to go. For one, it will serve simple multi-core use once gilectomy is completed (it, or something effectively equivalent, will complete). For another, this is the path to reducing the functionally "useless" rewrite efforts of libraries (e.g. github.com/aio-libs), which somehow resemble all the efforts of migrating libraries from 2 to 3 (loosely). The resistance and unexpected time that the 2-to-3 migration experienced won't readily be mimicked in async tasks - too much effort to get compute- and I/O-bound benefits? Maintaining two versions of needed libraries, or jumping languages, is what will increasingly happen in the distributed (and more so IoT) world.
When and *if* gilectomy is completed (or another project to remove the GIL), we will be able to do this:

1) Run existing async/await applications as is, but instead of running a process per core, we will be able to run a single process with many threads. Likely one asyncio (or other) event loop per thread. This is very speculative, but possible in theory.

2) Run existing blocking IO applications in several threads in one process. This is something that only sounds like an easy thing to do; I suspect that a lot of code will break (or deadlock) when the GIL is removed. Even if everything works perfectly well, threads aren’t the answer to all problems — try to manage 1000s of them.

Long story short, even if we had no GIL at all, having async/await (and non-blocking IO) would make sense. And if you have async/await, with GIL or without, you will inevitably have different APIs and different low-level IO libs that drive them.

There are ways to lessen the pain. For instance, I like Cory’s approach with hyper - implement protocols separately from IO, so that it’s easy to port them to various sync and async frameworks.
Time to think about paving the way to a world where async is a first-class citizen.
That's probably too much for this PEP, but the topic (a- prefixing) is a good canary for the bigger picture we need to start mulling over.
So in this context (and in general w/ async), asking the question "can we make it so it doesn't matter?" is a good one to always be asking - it will get us there.
Unfortunately there is no way to use the same APIs for both the async/await and synchronous worlds. At least for CPython builtin types, they have to have different names.

I’m fine with discussing the ‘a’ prefix, but I’m a bit afraid that focusing on it too much will distract us from the PEP and the details of it that really matter.

Yury
On Friday, July 29, 2016, Yury Selivanov <yselivanov@gmail.com> wrote:
Comments inlined:
On Jul 29, 2016, at 2:20 PM, Yarko Tymciurak <yarkot1@gmail.com> wrote:
Hmm... I think we need to think about a future where, programmatically, there's little to no distinction between async and synchronous functions. Pushing this down deeper in the system is the way to go. For one, it will serve simple multi-core use once gilectomy is completed (it, or something effectively equivalent, will complete). For another, this is the path to reducing the functionally "useless" rewrite efforts of libraries (e.g. github.com/aio-libs), which somehow resemble all the efforts of migrating libraries from 2 to 3 (loosely). The resistance and unexpected time that the 2-to-3 migration experienced won't readily be mimicked in async tasks - too much effort to get compute- and I/O-bound benefits? Maintaining two versions of needed libraries, or jumping languages, is what will increasingly happen in the distributed (and more so IoT) world.
When and *if* gilectomy is completed (or another project to remove the GIL), we will be able to do this:
1) Run existing async/await applications as is, but instead of running a process per core, we will be able to run a single process with many threads. Likely one asyncio (or other) event loop per thread. This is very speculative, but possible in theory.
2) Run existing blocking IO applications in several threads in one process. This is something that only sounds like an easy thing to do; I suspect that a lot of code will break (or deadlock) when the GIL is removed. Even if everything works perfectly well, threads aren’t the answer to all problems — try to manage 1000s of them.
Long story short, even if we had no GIL at all, having async/await (and non-blocking IO) would make sense. And if you have async/await, with GIL or without, you will inevitably have different APIs and different IO low-level libs that drive them.
There are ways to lessen the pain. For instance, I like Cory’s approach with hyper - implement protocols separately from IO, so that it’s easy to port them to various sync and async frameworks.
Time to think about paving the way to a world where async is a first-class citizen.
That's probably too much for this PEP, but the topic (a- prefixing) is a
good canary for the bigger picture we need to start mulling over.
So in this context (and in general w/ async), asking the question "can we make it so it doesn't matter?" is a good one to always be asking - it will get us there.
Unfortunately there is no way to use the same APIs for both the async/await and synchronous worlds. At least for CPython builtin types, they have to have different names.

I’m fine with discussing the ‘a’ prefix, but I’m a bit afraid that focusing on it too much will distract us from the PEP and the details of it that really matter.
Yury
To keep it simple, try thinking like this (and yes, Yury, apologies - this is now a side discussion, and not about this PEP): everything in CPython is async, and if you don't want async, you don't need to know about it; you run a single async task and don't need to know more... Can we get there? That would be cool...

- Yarko
I personally think that the `a` prefix is pretty fine and understandable.

Do we really need to implement the full PEP-342 spec in an async way? What are the use cases? My first thought is that PEP-255-like simple async generators should be enough.

Full PEP-342-style generators are the glue for implementing things like twisted.inlineCallbacks and tornado.gen. Thanks to PEP-492 we don't need that anymore.

All my requirements are covered perfectly fine by the existing `__aiter__`/`__anext__`. What I really need is to be able to write just `yield val` from a coroutine; I never need `resp = yield req`.

On Fri, Jul 29, 2016 at 8:50 PM Yarko Tymciurak <yarkot1@gmail.com> wrote:
On Friday, July 29, 2016, Yury Selivanov <yselivanov@gmail.com> wrote:
Comments inlined:
On Jul 29, 2016, at 2:20 PM, Yarko Tymciurak <yarkot1@gmail.com> wrote:
Hmm... I think we need to think about a future where, programmatically, there's little to no distinction between async and synchronous functions. Pushing this down deeper in the system is the way to go. For one, it will serve simple multi-core use once gilectomy is completed (it, or something effectively equivalent, will complete). For another, this is the path to reducing the functionally "useless" rewrite efforts of libraries (e.g. github.com/aio-libs), which somehow resemble all the efforts of migrating libraries from 2 to 3 (loosely). The resistance and unexpected time that the 2-to-3 migration experienced won't readily be mimicked in async tasks - too much effort to get compute- and I/O-bound benefits? Maintaining two versions of needed libraries, or jumping languages, is what will increasingly happen in the distributed (and more so IoT) world.
When and *if* gilectomy is completed (or another project to remove the GIL), we will be able to do this:
1) Run existing async/await applications as is, but instead of running a process per core, we will be able to run a single process with many threads. Likely one asyncio (or other) event loop per thread. This is very speculative, but possible in theory.
2) Run existing blocking IO applications in several threads in one process. This is something that only sounds like an easy thing to do; I suspect that a lot of code will break (or deadlock) when the GIL is removed. Even if everything works perfectly well, threads aren’t the answer to all problems — try to manage 1000s of them.
Long story short, even if we had no GIL at all, having async/await (and non-blocking IO) would make sense. And if you have async/await, with GIL or without, you will inevitably have different APIs and different IO low-level libs that drive them.
There are ways to lessen the pain. For instance, I like Cory’s approach with hyper - implement protocols separately from IO, so that it’s easy to port them to various sync and async frameworks.
Time to think about paving the way to a world where async is a first-class citizen.
That's probably too much for this PEP, but the topic (a- prefixing) is
a good canary for the bigger picture we need to start mulling over.
So in this context (and in general w/ async), asking the question "can we make it so it doesn't matter?" is a good one to always be asking - it will get us there.
Unfortunately there is no way to use the same APIs for both the async/await and synchronous worlds. At least for CPython builtin types, they have to have different names.

I’m fine with discussing the ‘a’ prefix, but I’m a bit afraid that focusing on it too much will distract us from the PEP and the details of it that really matter.
Yury
To keep it simple, try thinking like this (and yes, Yury, apologies - this is now a side discussion, and not about this PEP): everything in CPython is async, and if you don't want async, you don't need to know about it - you run a single async task and don't need to know more...
Can we get there? That would be cool...
- Yarko
-- Thanks, Andrew Svetlov
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
On Jul 29, 2016, at 3:06 PM, Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
Do we really need to implement the full PEP-342 spec in an async way? What are the use cases? My first thought is that PEP-255-like simple async generators should be enough.
There are a few reasons for having asend() and athrow():

1) They will have to be implemented regardless; aclose() is a slightly limited version of athrow(), and __anext__() is almost the same thing as asend(). It’s 5-10 additional lines of code to have them.

2) __anext__() has to be a generator-like object, so that the YIELD_FROM opcode works correctly with it. That means it has to implement ‘send()’ and ‘throw()’ methods. Which, in turn, means that even if we don’t expose ‘agen.athrow()’, people would be able to do ‘agen.__anext__().throw()’. But that’s a very error-prone thing to do.

3) Having ‘anext()’ and ‘athrow()’ enables you to implement decorators like `@contextlib.contextmanager` but for ‘async with’. And I’m pretty sure that people will come up with even more creative things.

To conclude, having ‘asend()’ and ‘athrow()’ doesn’t complicate the implementation at all, but makes asynchronous generators consistent and more compatible with synchronous generators, which I don’t think is a bad thing.

Yury
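To make point 3 concrete, here is a rough sketch (not from the PEP or its reference implementation; the decorator name and the error handling are purely illustrative) of an ``async with`` analog of ``@contextlib.contextmanager``, built only on the proposed ``asend()``/``athrow()`` methods::

    import functools

    class _AsyncGenContextManager:
        def __init__(self, agen):
            self._agen = agen

        async def __aenter__(self):
            # Advance the generator to its first ``yield`` and
            # hand its value to the ``async with`` block.
            return await self._agen.asend(None)

        async def __aexit__(self, typ, val, tb):
            if typ is None:
                try:
                    await self._agen.asend(None)
                except StopAsyncIteration:
                    return False
                raise RuntimeError('async generator did not stop')
            # Throw the exception into the generator so that its
            # try/finally cleanup code runs at the ``yield`` point.
            try:
                await self._agen.athrow(typ, val, tb)
            except StopAsyncIteration:
                return True        # the generator swallowed the exception
            except BaseException as exc:
                if exc is val:
                    return False   # the generator let it propagate
                raise
            raise RuntimeError('async generator did not stop after athrow()')

    def async_context_manager(func):
        @functools.wraps(func)
        def helper(*args, **kwargs):
            return _AsyncGenContextManager(func(*args, **kwargs))
        return helper

With that, an asynchronous generator function that opens a resource, yields it once, and cleans up in a ``finally`` block can be used directly in an ``async with`` statement.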
![](https://secure.gravatar.com/avatar/7775d42d960a69e98fecf270bdeb6f57.jpg?s=120&d=mm&r=g)
Ok, if the implementation gets `asend()` etc. "for free", I support it.
-- Thanks, Andrew Svetlov
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
On Jul 29, 2016, at 2:50 PM, Yarko Tymciurak <yarkot1@gmail.com> wrote: To keep it simple, try thinking like this (and yes, Yury, apologies - this is now a side discussion, and not about this PEP): everything in CPython is async, and if you don't want async, you don't need to know about it - you run a single async task and don't need to know more...
Can we get there? That would be cool...
So something like what they have in Golang? I don’t know if that’s possible in CPython... Yury
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
So, essentially, something similar to Golang? I don’t know if that’s possible. It would require a complete CPython IO layer rewrite, integrating an event loop directly into the core, etc. The closest thing to that is gevent — no async/await and all IO is non-blocking, but it has its own warts. Yury
![](https://secure.gravatar.com/avatar/273890f5944345c84e255cdd6efcfb35.jpg?s=120&d=mm&r=g)
On Tuesday, August 2, 2016, Yury Selivanov <yselivanov@gmail.com> wrote:
Warts in "all-in on async", or in gevent (which isn't integrated)? Very interested to hear your thoughts - redirect the discussion to wherever the right channel is. - Yarko
![](https://secure.gravatar.com/avatar/3041a99ff2b84bc3dc10805020d35516.jpg?s=120&d=mm&r=g)
+1 for the PEP, nothing more to add from a technical point of view. An extra step in the right direction, at least to me. Thank you Yury for that :-)

About the side conversation on the sync/async split world: short of forcing coroutine-pattern usage like in Go, I don't see how we can become more implicit. Even if the Zen of Python recommends preferring an explicit approach, I see explicit/implicit more as a balance you must adjust between simplicity and flexibility than as a binary choice. To me, the success of Python as a language is also due to a good balance between these approaches, and the recent move from "yield from" to "await" illustrates that: hide the internal mechanisms of the implementation, but keep the explicit way to declare it.

Like Andrew Svetlov, I don't believe much in the implicit approach of gevent, because very quickly you need to understand the extra tools, like synchronization primitives. Whether or not you need to prefix a function call with "await" is the tree that hides the forest. With the async pattern, it's impossible to hide everything and have everything work automagically: you must understand a little bit of what's happening, or it will be very complicated to debug. To me, you can hide everything only if you are really sure it will work 100% of the time without human intervention, like with autonomous Google cars.

However, it might be interesting to have an async "linter" that lists all blocking I/O calls in async coroutines, to help newcomers find this type of bug. With the dynamic nature of Python, I don't know if it's realistic to try to implement that, but to me it would be a better answer than trying to remove all sync/async code differences.

Moreover, I see the need for async libs as an extra opportunity to challenge and simplify the Python toolbox. For now, with aiohttp, you have a unified API for HTTP in general, contrary to the sync world with requests and flask, for example. At least to me, a client and a server are only two sides of the same piece. Even more true with p2p protocols. As discussed several times, the next level might be more code reuse, as suggested by David Beazley with SansIO - split protocol and I/O handling: https://twitter.com/dabeaz/status/761599925444550656?lang=fr https://github.com/brettcannon/sans-io I don't know yet if the benefit of sharing more code between implementations will outweigh the potential increase in code complexity.

The only point I'm sure of for now: I'm preparing the popcorn to watch the next episodes; curious to see what ideas/implementations will emerge ;-) At least to me, it's more interesting than following a TV series, thank you for that :-)

Have a nice week.

Ludovic Gasc (GMLudo) http://www.gmludo.eu/
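As a very rough illustration of that linter idea (a sketch, not an existing tool; the blocklist is a hypothetical stand-in for a real table of blocking calls, and a real checker would need to handle far more call shapes), a toy AST-based checker could flag known blocking calls inside ``async def`` bodies::

    import ast

    # Hypothetical, deliberately tiny blocklist of (module, function)
    # pairs that block the event loop.
    BLOCKING = {('time', 'sleep'), ('socket', 'create_connection')}

    class AsyncLinter(ast.NodeVisitor):
        def __init__(self):
            self.warnings = []

        def visit_AsyncFunctionDef(self, node):
            # Walk the whole body of the coroutine looking for
            # attribute calls such as ``time.sleep(...)``.
            for call in ast.walk(node):
                if (isinstance(call, ast.Call)
                        and isinstance(call.func, ast.Attribute)
                        and isinstance(call.func.value, ast.Name)
                        and (call.func.value.id, call.func.attr) in BLOCKING):
                    self.warnings.append(
                        'line %d: blocking call %s.%s() in async def %r'
                        % (call.lineno, call.func.value.id,
                           call.func.attr, node.name))
            self.generic_visit(node)

    source = '''
    import time

    async def handler():
        time.sleep(1)   # blocks the whole event loop
    '''

    linter = AsyncLinter()
    linter.visit(ast.parse(source))
    print('\n'.join(linter.warnings))

As Ludovic notes, Python's dynamism limits how far purely static analysis like this can go; it can only catch calls it can resolve by name.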
![](https://secure.gravatar.com/avatar/273890f5944345c84e255cdd6efcfb35.jpg?s=120&d=mm&r=g)
I still have to wonder, though, how an async REPL which, from the inside out, handles a single task by default (the synchronous equivalent) would be anything less than explicit, or would complicate much (if anything - I suspect a significant amount of the opposite). Regardless, I am grateful for the discussions. - Yarko
![](https://secure.gravatar.com/avatar/273890f5944345c84e255cdd6efcfb35.jpg?s=120&d=mm&r=g)
On Sunday, August 7, 2016, Ludovic Gasc <gmludo@gmail.com> wrote:
As discussed several times, the next level might be more code reuse like suggested by David Beazley with SansIO, split protocol and I/O handling: https://twitter.com/dabeaz/status/761599925444550656?lang=fr
Question: isn't SansIO / Cory's work just a specific instance of Bob Martin's "Clean Architecture"? It sounds familiar to me, when thinking of Brandon Rhodes's 2014 PyOhio talk, and his recast of the topic in Warsaw in 2015. It seems like it... If so, then perhaps async aspects are just a second aspect.
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 8 Aug 2016, at 07:56, Yarko Tymciurak <yarkot1@gmail.com> wrote:
Question: isn't SansIO / Cory's work just a specific instance of Bob Martin's "Clean Architecture"? It sounds familiar to me, when thinking of Brandon Rhodes's 2014 PyOhio talk, and his recast of the topic in Warsaw in 2015. It seems like it... If so, then perhaps async aspects are just a second aspect.
Yeah, so this is worth highlighting more clearly. =) I apologise to the list in advance for this, but this is going to be something of a digression, so I have re-titled the thread.

I have invented and pioneered *nothing* here. Designing protocol implementations this way is not new in the world of software development. No-one is going to give me a medal or name a design pattern after me, and I don’t even think that pushing for this design pattern will help me move up my employer’s technical career path the way spending time on writing RFCs would have.

However, I think we can safely say that the Python community has not effectively done this over our twenty-plus year lifetime. Brett’s “sans-io” page is a beautiful example of this: currently we can find maybe *three* libraries that tick the box of being I/O-independent. This, in my mind, points to a cultural problem in the Python community. For whatever reason, we have tricked ourselves into abandoning what I would call “good software design”.

I have my theories for why this is. A huge factor of this is the relative ease-of-use of sockets and other I/O tools in Python (relative to C, that is: languages that also have this relative ease of use have similar problems, such as Javascript and Ruby). Another factor is the Python community’s love of beautiful APIs. Libraries with beautiful APIs provide such an appeal to programmers that there is huge incentive to build such a library. Building such a library requires including your I/O, and in the absence of any I/O-free implementations of a protocol you are almost certainly going to take the path of least resistance, which means intermingling your I/O and event loop primitives with your protocol code.

And I’m not criticising programmers that do this. As I pointed out with the fake title of my talk, I am a programmer that does this! Almost all the best programmers our community has to offer have done this and continue to do this. It really is easier. And trust me, as someone who has written one: there is very little that is sexy about writing a protocol library that does no I/O. It doesn’t make you famous, and it doesn’t make anyone applaud your design sense. They’re extremely boring and prosaic, and they rarely solve the actual problem you have: they are a step on the road to solving the problem generally, but not specifically.

So am I talking about the same thing as Brandon? Yes, yes I am. And Brandon is a better speaker than I am, so I agree that it’s weird that I’m getting more credit for this than he is. But I suspect that I’ve had some advantages that Brandon didn’t have.

The first is that, by sheer good luck, I’ve managed to tap into a zeitgeist and be in the right time at the right place to deliver this message. Dave Beazley’s work on curio here is helping, because of curio’s sheer incompatibility with the other event loop approaches, which means that his work and mine have a nice symbiosis. Nathaniel and I have managed to give him the building blocks to demonstrate curio’s effectiveness without him needing to be an expert in HTTP.

The second is that, I think, Brandon was targeting a different audience. Brandon was trying to talk to the general case, but he stopped short of the case I made. If you go back to watch Brandon’s talk, he talks about hoisting your I/O to the top of your control flow, but never leaps forward to say “and this has a very specific application to any tool or library that talks a network protocol”. My argument is actually *more specific* than Brandon’s: I am saying that you can strip out the low-level stuff into an entirely separate logical codebase, and then not just change the *type* of I/O you do, but change the complete style in which you do it. Brandon’s argument isn’t so much about code reuse as it is about code maintenance.

That said, I think the real reason people are talking about my work and not Brandon’s is because, while we both said “You should hoist your I/O out of your logic”, I got to follow up with “and I have done it and released the code and it’s actually being used today in projects you know about”. That second argument really does help, because whenever you tell people to totally change the way they’re writing something, they have a tendency to go “Oh, yeah, but that doesn’t work for *my* project”. Being able to say “look, I did it, it really does work, and here are all the ways that it let me build a better thing”.

That said, I’d love to have Brandon weigh in on this too. I do feel a bit bad that I am re-treading ground that others have covered before, but…hell, we can’t all be top-tier programmers! If all I say is that I have repackaged an idea my intellectual betters have previously made such that it’s more appealing to the masses, I think I can call myself happy with that.

Cory
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Aug 8, 2016, at 2:45 AM, Cory Benfield <cory@lukasa.co.uk> wrote:
I have invented and pioneered *nothing* here. Designing protocol implementations this way is not new in the world of software development. No-one is going to give me a medal or name a design pattern after me, and I don’t even think that pushing for this design pattern will help me move up my employer’s technical career path the way spending time on writing RFCs would have.
I wouldn't be so sure about that - I have definitely been referring to the principle that one should design an I/O free layer in every system as "Benfield's Law" in recent conversations ;-).
That said, I’d love to have Brandon weigh in on this too. I do feel a bit bad that I am re-treading ground that others have covered before, but…hell, we can’t all be top-tier programmers! If all I say is that I have repackaged an idea my intellectual betters have previously made such that it’s more appealing to the masses, I think I can call myself happy with that.
I really don't think you should feel even a bit bad here. I'm one of the people who has been trying to tell people to do this for the last decade or so, and frankly, I'd been expressing myself poorly.

There's also a "do as I say, not as I do" problem here: I learned this lesson _after_ I had implemented a bunch of protocols the wrong way (stacks and stacks of inheritance, tight I/O coupling), and then decided I'd surely do the _next_ one right. But of course then I was buried under the crushing maintenance burden of the existing stuff and I still have not managed to crawl my way out. Certainly, none of the popular code I've written works this way.

As a result, I feel tremendously grateful that you have managed to noticeably increase the popularity of this concept, and as far as I'm concerned, you deserve quite a bit of credit. Given the number of people publicly associated with it now, clearly it's something we all thought was important to advocate for (and did successfully, to the extent that some people have learned those lessons!), but nevertheless nobody that I'm aware of, especially in the Python community, has managed as impactful a presentation as you have. This is not entirely an accident of history - it was a very good talk that was deceptively simple-looking but belied a lot of deep expertise.

So don't sell yourself short: you may not have invented the concept, but I know I could not have given that talk in quite that way; I'm not sure any of the giants whose shoulders you stand upon could have either. -glyph
![](https://secure.gravatar.com/avatar/53bcfa5bd120125e45d9207069dd764b.jpg?s=120&d=mm&r=g)
On Aug 8, 2016, at 4:45 AM, Cory Benfield <cory@lukasa.co.uk> wrote:
The first is that, by sheer good luck, I’ve managed to tap into a zeitgeist and be in the right time at the right place to deliver this message. Dave Beazley’s work on curio here is helping, because of curio’s sheer incompatibility with the other event loop approaches, which means that his work and mine have a nice symbiosis. Nathaniel and I have managed to give him the building blocks to demonstrate curio’s effectiveness without him needing to be an expert in HTTP.
Chiming in on the "zeitgeist" comment for a moment, I've wondered for a long time why Python can't reinvent itself in the area of I/O (and maybe systems programming generally). Honestly, I feel like a whole lot of time has been burned up thinking about Python 2/3 compatibility instead of looking forward with futuristic new projects and ideas. Perhaps "async/await" serves as a catalyst to rethink some of these things.

A lot of my work with async/await is really focused on exploring the API space with it--well, at least seeing how much I can twist that part of the language in diabolical ways.

The protocol issue is real though. Sure, I could probably bang out a passable HTTP/0.9 protocol in an afternoon, but try to tackle something modern like HTTP/2? No way. I'm totally out of my element with something like that. Having an I/O-free implementation of it is cool. It would be pretty neat to have something like that for various other things too (Redis, MySQL, postgres, ZeroMQ, etc.).

Cheers, Dave
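As a toy example of the kind of I/O-free piece Dave means (a sketch, not an existing library), the wire encoding of a Redis command in the RESP format can live in a pure function that a sync client, an asyncio client, or a curio client could all share::

    def encode_command(*args):
        """Encode a Redis command as a RESP array of bulk strings."""
        parts = [b'*%d\r\n' % len(args)]
        for arg in args:
            data = arg if isinstance(arg, bytes) else str(arg).encode()
            parts.append(b'$%d\r\n%s\r\n' % (len(data), data))
        return b''.join(parts)

    # Bytes in, bytes out -- no sockets anywhere in sight.
    assert encode_command('SET', 'key', 'value') == \
        b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n'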
![](https://secure.gravatar.com/avatar/273890f5944345c84e255cdd6efcfb35.jpg?s=120&d=mm&r=g)
Thanks Cory - I wasn't so much saying "isn't Cory doing kinda what Brandon said" as I was saying "isn't this kinda the more general stuff that Bob Martin talks about (tries to talk about?) with Clean Architecture / Clean Coding" (which Brandon - the first I saw do it explicitly - shows a bit of in the Python context in his PyOhio talk, and how that helps testability... and I suppose maintainability). So, less than intending to point to / assign credit (credit is due _everywhere_; honestly, great stuff abounds!), I was hoping to say "hey - we've just scratched the surface - we can take this more broadly!"

For me, the interesting question (in looking at what coding errors I've seen which opened my eyes to all this, as entangled as they've been in several kinds all mixed in) is: isn't async in general one of these elements with potential for "clean" - for thinking about taking it further in Python, in effect doing with the language a similar thing to what Cory has demonstrated with protocols? I don't know, but I am curios (pun not intended). Dave Beazley's curio (where I had the pleasure of saying "quit talking, show me running code!" early on - and then "wow!"), and other things, are leading me to think our / the Python community's perspective on async might (?) need a bigger shift, a bigger kick-out-of-current-ways-of-thinking. Yury's comment "you mean put async into the core, like Go?" was probably spot on as far as understanding my thinking - not to mimic, but rather to honestly explore the bigger library implications - maintenance and simplicity and performance - and the "keeping-you-out-of-trouble" effect that (hopefully) might exist for application coders.

At this year's PyOhio, Dave mentioned a project moving from Python to Go because of this. I saw in enough (?) detail the "technical" reasons why (I think errors in managing cooperative multi-tasking, which async is - and never spotting the real problem), and I gasped, and so started some side discussions with several of you.

So - to everyone - thanks for all the discussion, and thinking, and presenting, and break-away libraries. I hope we keep kicking this ball around a bit more, into new, deeper space! - Yarko
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
OK, never mind Impostor Syndrome... How can we move this forward in the community through actual applications? Should we have the complete I/O-free HTTP parser in the stdlib, or is that a bad idea? Do we want a PEP? Is there something we could add to PEP 8?
--Guido van Rossum (python.org/~guido)
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 8 Aug 2016, at 16:47, Guido van Rossum <guido@python.org> wrote:
OK, never mind Impostor Syndrome... How can we move this forward in the community through actual applications? Should we have the complete I/O-free HTTP parser in the stdlib, or is that a bad idea? Do we want a PEP? Is there something we could add to PEP 8?
Addressing those in turn:

Yes, we could put the I/O free HTTP parser in the stdlib. That’s really Nathaniel’s call, of course, as it’s his parser, but there’s no reason we couldn’t. Of course, all the regular caveats apply: we’ll want to give it a while longer on PyPI to bake, and of course we have to discuss where the logical end of this. How obscure does a protocol have to be for having a parser in the stdlib to no longer be sensible?

Having a parser in the stdlib also raises another question: does the http module get rewritten in terms of this parser? The obvious answer is “yes”, but someone would have to do the work, and I’m not sure who is volunteering (though there’s a risk that it’d be me).

As to having a PEP or putting something in PEP 8, I feel pretty lukewarm to those ideas. If the core Python team was able to legislate good coding practices via PEPs I think the world would be a very different place. Instead, it might be a better idea to focus our writing efforts on the SansIO page in the first instance. If we get to a place where we feel like we have a really good handle on how to explain what to do and what not to do, we could reconsider the PEP at that point.

Cory
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
On Aug 8, 2016, at 11:56 AM, Cory Benfield <cory@lukasa.co.uk> wrote:
Addressing those in turn:
Yes, we could put the I/O free HTTP parser in the stdlib. That’s really Nathaniel’s call, of course, as it’s his parser, but there’s no reason we couldn’t. Of course, all the regular caveats apply: we’ll want to give it a while longer on PyPI to bake, and of course we have to discuss where the logical end of this. How obscure does a protocol have to be for having a parser in the stdlib to no longer be sensible?
Having a parser in the stdlib also raises another question: does the http module get rewritten in terms of this parser? The obvious answer is “yes”,
This — we have to validate that h11 has the correct design *before* including it in the stdlib. So the answer is “yes”, indeed.
As to having a PEP or putting something in PEP 8, I feel pretty lukewarm to those ideas. If the core Python team was able to legislate good coding practices via PEPs I think the world would be a very different place.
What does this mean? A lot of people, in any serious project at least, follow PEP 8 and coding with linters is a wide-spread practice. Yury
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 8 Aug 2016, at 17:06, Yury Selivanov <yselivanov@gmail.com> wrote:
As to having a PEP or putting something in PEP 8, I feel pretty lukewarm to those ideas. If the core Python team was able to legislate good coding practices via PEPs I think the world would be a very different place.
What does this mean? A lot of people, in any serious project at least, follow PEP 8 and coding with linters is a wide-spread practice.
Sure, but that only works for things that linters can enforce, and I sincerely doubt that this is one of them.

How would you code a linter in order to confirm that protocol code and I/O code are separated?

More importantly though, people follow the bits of PEP 8 that are easy to enforce and that don’t require substantive architectural changes. If PEP 8 said something like “wherever possible use dependency injection as a design pattern”, that guideline would be ignored as both entirely unenforceable and totally subjective. Where is the line for “wherever possible”? How does one enforce the use of dependency injection? Can you programmatically determine dependency injection? What dependencies do not need to be injected?

Dependency injection is a great design pattern that produces lots of useful side effects, and I use it often. But if I saw PEP 8 mandating it I’d be extremely perplexed. Realistically, at a certain point it’s the equivalent of writing “You should write good, maintainable code, obeying all relevant best practices” into PEP 8. *Of course* you should. This goes without saying. But that makes the advice not that helpful.

Now, PEP 8 could *recommend* design patterns to follow, and that’s a whole other kettle of fish. But then we’re just asking a different question: how universally praised does a design pattern have to be to become part of PEP 8?

Cory
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
On Aug 8, 2016, at 12:16 PM, Cory Benfield <cory@lukasa.co.uk> wrote:
Sure, but that only works for things that linters can enforce, and I sincerely doubt that this is one of them.
Correct. I don’t think it should be PEP 8 either. I think Guido’s idea on including h11 in the stdlib is cool, and that’s the better way to send the message. We can also add a *new* informational PEP instructing people on why (and how) they should write IO-free protocol implementations. It would be great if you have time to champion such a PEP (I’d be glad to help if needed.)

The other problem with the sans-IO approach is that it takes longer to implement properly. You need at least a synchronous and an asynchronous version to make sure that you got the design parts “right”. For instance, just a few days ago we open sourced a new PostgreSQL driver [1], and right now it’s bound to asyncio. While the foundational layer of the protocol is IO free, it would still take me a lot of time to (a) document it, and (b) build and test anything but asyncio on top of it. And, since our driver is far more feature-packed (and faster!) than psycopg2, I’d love to make a synchronous version of it (but don’t have time atm).

So what I was thinking about is a library that would provide a set of base Protocol (meta?) classes, one for asyncio, one for sync IO, etc. Then the only thing you would need is to mix your sans-IO protocol into them and write some glue code. Timeouts, cancellations, etc. will already be taken care of for you. This is something that is super hard to make for protocols like HTTP, but shouldn’t be a big problem for DB protocols (redis, postgres, etc). Something to think about :)

[1] http://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/

Thanks, Yury
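A rough sketch of that glue idea (all names hypothetical): a sans-IO core that only turns bytes into events, plus a thin asyncio shell that owns the transport and is the only place event-loop knowledge lives::

    import asyncio

    class LineParser:
        """Sans-IO core: consumes bytes, yields complete lines as events."""

        def __init__(self):
            self._buf = b''

        def feed_data(self, data):
            self._buf += data

        def events(self):
            while b'\n' in self._buf:
                line, _, self._buf = self._buf.partition(b'\n')
                yield line

    class AsyncioGlue(asyncio.Protocol):
        """Asyncio-specific shell; a sync shell would reuse the same parser."""

        def __init__(self, parser, on_event):
            self._parser = parser
            self._on_event = on_event

        def data_received(self, data):
            # All asyncio knowledge lives here; the parser stays IO free.
            self._parser.feed_data(data)
            for event in self._parser.events():
                self._on_event(event)

The base classes Yury describes would bundle the ``AsyncioGlue`` half (plus timeouts and cancellation), so that a protocol author would only write the ``LineParser`` half.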
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 8 Aug 2016, at 17:33, Yury Selivanov <yselivanov@gmail.com> wrote:
Correct. I don’t think it should be PEP 8 either. I think Guido’s idea on including h11 in the stdlib is cool, and that’s the better way to send the message. We can also add a *new* informational PEP instructing people on why (and how) they should write IO-free protocol implementations. It would be great if you have time to champion such a PEP (I’d be glad to help if needed.)
I am certainly happy to take the lead on a PEP like that if we believe there is value in it. I suspect I’d want to run it past this group quite a few times, because there are many others in this SIG (Brett, Nathaniel, Dave, and Glyph all jump to mind) that would have lots of useful things to say.
The other problem with the sans-IO approach is that it takes longer to implement properly. You need at least a synchronous and an asynchronous version to make sure that you got the design parts “right”.
Nah, that’s not necessary, at least for just the protocol part. You just need a test suite that tests it entirely in memory.

Necessarily, if you can test it with tests that don’t involve writing mocks for anything in the socket, select, selectors, or asyncio modules, you’re probably in a pretty good place to be arguing that you’re I/O free. If your tests only use the public interfaces, then you’re totally set.

The *outer* layers are where you need to duplicate work, but you don’t need both of those right away. If you have the low-level no-I/O layer in place, you can start with just writing (say) the asyncio implementation and leave the sync implementation for later in the day.

I strongly, strongly advocate enforcing this distinction by splitting the no-I/O layer out into an entirely separate Python package that is separately versioned, and that you treat as third-party code from the perspective of your other modules. This has a nice tendency of getting into the headspace of thinking of it as an entirely separate module. In the case of OSS code, it also lets you push out the no-I/O version early: if your implementation really is faster than psycopg2, you might be able to convince someone else to come along and write the sync code instead!

Anyway, I’ll stop now, because at a certain point I’ll just start writing that PEP in this email.

Cory
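For instance (a made-up example in the same spirit as the parser sketched earlier in the thread), a test like this exercises the protocol logic with plain bytes and never touches a socket, a mock, or an event loop::

    class LineParser:
        """Minimal sans-IO parser: bytes in, complete lines out."""

        def __init__(self):
            self._buf = b''

        def feed_data(self, data):
            self._buf += data

        def events(self):
            while b'\n' in self._buf:
                line, _, self._buf = self._buf.partition(b'\n')
                yield line

    def test_reassembles_split_lines():
        parser = LineParser()
        parser.feed_data(b'GET / HT')                 # a partial line...
        assert list(parser.events()) == []            # ...yields nothing yet
        parser.feed_data(b'TP/1.1\nHost: example\n')  # completing it flushes
        assert list(parser.events()) == [b'GET / HTTP/1.1', b'Host: example']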
![](https://secure.gravatar.com/avatar/e8600d16ba667cc8d7f00ddc9f254340.jpg?s=120&d=mm&r=g)
On Mon, 8 Aug 2016 at 11:12 Cory Benfield <cory@lukasa.co.uk> wrote:
On 8 Aug 2016, at 17:33, Yury Selivanov <yselivanov@gmail.com> wrote:
On Aug 8, 2016, at 12:16 PM, Cory Benfield <cory@lukasa.co.uk> wrote:
On 8 Aug 2016, at 17:06, Yury Selivanov <yselivanov@gmail.com> wrote:
As to having a PEP or putting something in PEP 8, I feel pretty lukewarm to those ideas. If the core Python team was able to legislate good coding practices via PEPs I think the world would be a very different place.
What does this mean? A lot of people, in any serious project at least, follow PEP 8, and coding with linters is a widespread practice.
Sure, but that only works for things that linters can enforce, and I sincerely doubt that this is one of them.
Correct. I don’t think it should be PEP 8 either. I think Guido’s idea of including h11 in the stdlib is cool, and that’s the better way to send the message. We can also add a *new* informational PEP instructing people on why (and how) they should write I/O-free protocol implementations. It would be great if you have time to champion such a PEP (I’d be glad to help if needed.)
I am certainly happy to take the lead on a PEP like that if we believe there is value in it. I suspect I’d want to run it past this group quite a few times, because there are many others in this SIG (Brett, Nathaniel, Dave, and Glyph all jump to mind) that would have lots of useful things to say.
The lighter option is we put the equivalent of a PEP on sans-io.rtfd.io instead of my dinky paragraph or three quickly explaining why this is an important concept. That way there's no python-dev discussion and we can update it w/o issue (I'm happy to add more contributors to the GH repo).
The other problem with the sans-IO approach is that it takes longer to implement properly. You need at least a synchronous and an asynchronous version to make sure that you got the design parts “right”.
Nah, that’s not necessary, at least for just the protocol part. You just need a test suite that tests it entirely in memory.
Right, so in the case of asyncpg it would be pulling out the binary protocol parser and having that stand on its own. Then the asyncio part that uses the binary protocol parser you wrote would be what asyncpg is, w/ the protocol parser becoming pgparser or something.
Necessarily, if you can test it with tests that don’t involve writing mocks for anything in the socket, select, selectors, or asyncio modules, you’re probably in a pretty good place to be arguing that you’re I/O free. If your tests only use the public interfaces, then you’re totally set.
Which "public interfaces" are you referring to? For me, any I/O-free library shouldn't be driving the I/O, just a producer/consumer of data. That means even if something follows e.g. the public interface of a socket that it wouldn't qualify as that suggests the library gets to make the call on when the I/O occurs and that you expose a socket for it to use.
The *outer* layers are where you need to duplicate work, but you don’t need both of those right away. If you have the low-level no-I/O layer in place, you can start with just writing (say) the asyncio implementation and leave the sync implementation for later in the day.
The way I think of it is you make the I/O-free library do as much as possible while still allowing it to be used in synchronous code, asyncio, and curio. Then it's up to people who are interested in supporting those protocols using those frameworks to wrap it as appropriate (if ever; e.g. maybe no one ever cares about SMTP on curio). -Brett
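As a rough sketch of this layering (all names hypothetical: a ``pgparser``-style sans-IO core with a ``feed_data()``/``events()`` API, invented for illustration), the same core can be wrapped once for blocking sockets and once for asyncio, and only the wrappers ever touch the network::

    import asyncio
    import socket

    from pgparser import ParserConnection  # hypothetical sans-IO core

    class SyncClient:
        def __init__(self, host, port):
            self._sock = socket.create_connection((host, port))
            self._conn = ParserConnection()

        def next_message(self):
            # Pump bytes into the parser until it produces an event.
            events = self._conn.events()
            while not events:
                self._conn.feed_data(self._sock.recv(65536))
                events = self._conn.events()
            return events[0]

    class AsyncClient:
        def __init__(self, reader):
            self._reader = reader            # an asyncio.StreamReader
            self._conn = ParserConnection()  # the exact same core

        async def next_message(self):
            events = self._conn.events()
            while not events:
                self._conn.feed_data(await self._reader.read(65536))
                events = self._conn.events()
            return events[0]

A curio wrapper would be another dozen lines around the same ``ParserConnection``.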
I strongly, strongly advocate enforcing this distinction by splitting the no-I/O layer out into an entirely separate Python package that is separately versioned, and that you treat as third-party code from the perspective of your other modules. This has a nice tendency of getting into the headspace of thinking of it as an entirely separate module. In the case of OSS code, it also lets you push out the no-I/O version early: if your implementation really is faster than psycopg2, you might be able to convince someone else to come along and write the sync code instead!
Anyway, I’ll stop now, because at a certain point I’ll just start writing that PEP in this email.
Cory
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 9 Aug 2016, at 18:35, Brett Cannon <brett@python.org> wrote:
Necessarily, if you can test it with tests that don’t involve writing mocks for anything in the socket, select, selectors, or asyncio modules, you’re probably in a pretty good place to be arguing that you’re I/O free. If your tests only use the public interfaces, then you’re totally set.
Which "public interfaces" are you referring to? For me, any I/O-free library shouldn't be driving the I/O, just a producer/consumer of data. That means even if something follows e.g. the public interface of a socket that it wouldn't qualify as that suggests the library gets to make the call on when the I/O occurs and that you expose a socket for it to use.
Sorry, I just meant the public API of the no-I/O library. Cory
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 9 Aug 2016, at 18:35, Brett Cannon <brett@python.org> wrote:
The lighter option is we put the equivalent of a PEP on sans-io.rtfd.io instead of my dinky paragraph or three quickly explaining why this is an important concept. That way there's no python-dev discussion and we can update it w/o issue (I'm happy to add more contributors to the GH repo).
Alright, enough talking, let’s start with the doing. Here is a GitHub pull request that begins to shape a document like an informational proto-PEP on sans-io.rtfd.org: https://github.com/brettcannon/sans-io/pull/4. This document is absolutely early work, and I’d love for this SIG to provide as much feedback as possible on it, as well as extra text, or elaborations, or requests for clarification, or anything else. Cory
![](https://secure.gravatar.com/avatar/1fee087d7a1ca17c8ad348271819a8d5.jpg?s=120&d=mm&r=g)
Hi,
However, I think we can safely say that the Python community has not effectively done this over our twenty-plus year lifetime.
I'd like to offer a couple remarks here:

1) implementing a protocol usually goes beyond parsing (which, it's true, can easily be done "sans IO").

2) many non-trivial protocols are stateful, at least at the level of a single connection; the statefulness may require doing I/O spontaneously (example: sending a keepalive packet). You can partly solve this by having a lower layer implementing the stateless parts ("sans IO") and an upper layer implementing the rest above it, but depending on the protocol it may be impossible to offer an *entire* implementation that doesn't depend on at least some notion of I/O.

3) the Protocol abstraction in asyncio (massively inspired from Twisted, of course) is a pragmatic way to minimize the I/O coupling of protocol implementations (and one of the reasons why I pushed for it during the PEP discussion): it still has some I/O-related elements to it (a couple callbacks on Protocol, and a couple methods on Transport), but in a way that makes ignoring them much easier than when using "streams", sockets or similar concepts.

Regards

Antoine.
![](https://secure.gravatar.com/avatar/e8600d16ba667cc8d7f00ddc9f254340.jpg?s=120&d=mm&r=g)
On Thu, 11 Aug 2016 at 10:23 Antoine Pitrou <antoine@python.org> wrote:
Hi,
However, I think we can safely say that the Python community has not effectively done this over our twenty-plus year lifetime.
I'd like to offer a couple remarks here:
1) implementing a protocol usually goes beyond parsing (which, it's true, can easily be done "sans IO").
Yes, but parsing the data is at least one of the steps that people have historically not factored out into a stand-alone library.
2) many non-trivial protocols are stateful, at least at the level of a single connection; the statefulness may require doing I/O spontaneously (example: sending a keepalive packet). You can partly solve this by having a lower layer implementing the stateless parts ("sans IO") and an upper layer implementing the rest above it, but depending on the protocol it may be impossible to offer an *entire* implementation that doesn't depend on at least some notion of I/O.
Also true. While this can be handled either by a state machine emitting keepalive events or by simply telling people they may need to emit something, this doesn't detract from the fact that at least parsing the data off the wire can be done as a standalone project (making a state machine that works for the protocol w/o any I/O will vary from protocol to protocol).
3) the Protocol abstraction in asyncio (massively inspired from Twisted, of course) is a pragmatic way to minimize the I/O coupling of protocol implementations (and one of the reasons why I pushed for it during the PEP discussion): it still has some I/O-related elements to it (a couple callbacks on Protocol, and a couple methods on Transport), but in a way that makes ignoring them much easier than when using "streams", sockets or similar concepts.
Yep. Once again, no one is saying that an I/O-free approach to protocols will work in all situations, but it should be considered, and in cases where it does work it's good to have and can be used with asyncio's ABCs. IOW you're totally right that I/O-free protocol libraries will not always work, but sometimes they do, and people should thus consider structuring their libraries that way when it makes sense.
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 11 Aug 2016, at 17:56, Antoine Pitrou <antoine@python.org> wrote:
1) implementing a protocol usually goes beyond parsing (which, it's true, can easily be done "sans IO").
Yes, no question. And in fact, the hyper-h2 docs are very clear on this point: http://python-hyper.org/projects/h2/en/stable/basic-usage.html You cannot just drop in hyper-h2 and expect anything sensible to happen. It needs to be hooked up to I/O, and the user needs to make some decisions themselves.

However, I’m totally happy to stand by my original point, which was: regardless of whether or not writing a protocol parser and state machine is “easy” to do without I/O, the Python community has not done that for the last twenty years. The fact that the current list of sans-IO implementations is three entries long, two of which are less than a year old, is a good indication of that point.

I don’t believe anyone is saying that sans-IO protocol implementations will remove *all* work from writing full-fledged implementations. Such a dream is impossible. But ideally they will remove a large chunk of the work.
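For reference, the "hooking up" step looks roughly like this with hyper-h2. The method names are from h2's public API, but the blocking-socket plumbing and the plaintext connection are simplifications made for this sketch::

    import socket

    import h2.connection
    import h2.events

    sock = socket.create_connection(("example.com", 80))
    conn = h2.connection.H2Connection()

    conn.initiate_connection()
    sock.sendall(conn.data_to_send())   # h2 produced bytes; *we* send them

    while True:
        data = sock.recv(65536)
        if not data:
            break
        # h2 consumes raw bytes and hands back protocol events.
        for event in conn.receive_data(data):
            if isinstance(event, h2.events.DataReceived):
                # Flow control is the caller's decision: tell h2 when
                # the received data has actually been dealt with.
                conn.acknowledge_received_data(
                    event.flow_controlled_length, event.stream_id)
        sock.sendall(conn.data_to_send())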
2) many non-trivial protocols are stateful, at least at the level of a single connection; the statefulness may require doing I/O spontaneously (example: sending a keepalive packet). You can partly solve this by having a lower layer implementing the stateless parts ("sans IO") and an upper layer implementing the rest above it, but depending on the protocol it may be impossible to offer an *entire* implementation that doesn't depend on at least some notion of I/O.
So the spontaneous I/O question is an interesting one, not because it involves doing I/O so much as because your example fundamentally involves access to a *clock*. I haven’t had to deal with this yet, but I’ve been thinking about it a bit.

My current conclusion is that a clock is basically an I/O tool: it’s a thing that needs to be controlled from outside the implementation. This is largely because when we want to use clocks what we really want to do is use *timers*, and timers are a flow control tool. This means that they fit into the category of thing I alluded to briefly in my talk: they’re a thing that the sans-IO implementation can provide help and guidance with, but not ultimately do itself.

Another example of this has been flow control management in HTTP/2: while the sans-IO implementation can do a lot of the work, fundamentally it still needs you to tell it “I just dealt with 100kB of data, please free up that much space in the window”.

There is no getting around this for almost any protocol, but that’s ok. Once again, the goal is not to subsume *everything* about a protocol: as you point out, that’s simply not possible. Instead, the goal is to subsume as much as possible.
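One way to keep the clock outside the protocol object, in the spirit of the above (a sketch with invented names, not any real library's API): the sans-IO core only records timestamps it is handed and reports when a keepalive is due, while the I/O wrapper owns the actual timer::

    import time

    class Connection:
        """Hypothetical sans-IO core with keepalive bookkeeping."""

        KEEPALIVE_INTERVAL = 30.0

        def __init__(self):
            self._last_activity = 0.0

        def data_received(self, data, now):
            # The caller passes the current time in; the core never
            # reads a clock itself.
            self._last_activity = now
            # ... parse ``data`` and return events ...

        def next_keepalive_at(self):
            return self._last_activity + self.KEEPALIVE_INTERVAL

        def keepalive_frame(self, now):
            self._last_activity = now
            return b"\x00keepalive"  # placeholder wire format

    # The I/O wrapper drives the clock however it likes:
    conn = Connection()
    conn.data_received(b"", time.monotonic())
    delay = conn.next_keepalive_at() - time.monotonic()
    # ... sleep/select for ``delay``, then:
    frame = conn.keepalive_frame(time.monotonic())
    # ... and send ``frame`` over whatever transport is in use.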
3) the Protocol abstraction in asyncio (massively inspired from Twisted, of course) is a pragmatic way to minimize the I/O coupling of protocol implementations (and one of the reasons why I pushed for it during the PEP discussion): it still has some I/O-related elements to it (a couple callbacks on Protocol, and a couple methods on Transport), but in a way that makes ignoring them much easier than when using "streams", sockets or similar concepts.
I agree: this is why I used Twisted Protocols in my discussion with Nick in python-ideas. However, they do only *minimize* it. Most asyncio/Twisted Protocols quite happily issue writes to their transports willy-nilly, and also a great many of them create Futures (which makes sense: the abstraction into the coroutine world has to happen somewhere!). Once you create a Future, you are no longer “sans-IO”: a Future is an event loop construct.

(On a side note, this is why Twisted has a slight hypothetical edge in the “sans-IO” race: an asyncio.Future is an event-loop construct, but a twisted.internet.defer.Deferred is not. Deferreds work perfectly without an event loop, but a Future always requires one.)

The biggest problem, though, is that an asyncio Protocol is written like this::

    class MyProtocol(asyncio.Protocol):
        pass

This provides substantial programmer baggage. Because asyncio has a blessed I/O model, it is very, very difficult for most programmers to think about writing a Protocol that isn’t going to be used that way. Even though, as I demonstrated with Twisted Protocols in python-ideas, they absolutely do not require an event loop if written carefully.

This is part of why divorcing your protocol library from asyncio *entirely* (don’t even import it!) is helpful: it forces a clean, clear line in the sand that says “I do not care how you do I/O”. Twisted has been fighting this battle for years, and asyncio isn’t going to have a better time of it.

Cory
![](https://secure.gravatar.com/avatar/3041a99ff2b84bc3dc10805020d35516.jpg?s=120&d=mm&r=g)
2016-08-08 8:56 GMT+02:00 Yarko Tymciurak <yarkot1@gmail.com>:
On Monday, August 8, 2016, Yarko Tymciurak <yarkot1@gmail.com> wrote:
I still have to wonder, though, how an async REPL, from the inside out, which handles a single task by default (the synchronous equivalent), would be anything less than explicit, or would complicate much (if anything - I suspect a significant amount of the opposite).
Regardless, I am grateful for the discussions.
- Yarko
On Sunday, August 7, 2016, Ludovic Gasc <gmludo@gmail.com> wrote:
+1 for the PEP, nothing more to add from a technical point of view. An extra step in the right direction, at least to me. Thank you Yury for that :-)
About the side conversation on the sync/async split world: short of forcing coroutine-pattern usage like in Go, I don't see how we can become more implicit. Even if the Zen of Python recommends preferring an explicit approach, I see explicit/implicit more as a balance to adjust between simplicity and flexibility than as a binary choice.
To me, the success of Python as a language is also because you have a good balance between these approaches, and the last move from "yield from" to "await" illustrates that: hide the internal mechanisms of the implementation, but keep the explicit way to declare it.
Like Andrew Svetlov, I don't believe much in the implicit approach of Gevent, because very quickly you need to understand the extra tools, like synchronization primitives. Knowing whether or not to prefix a function call with "await" is the tree that hides the forest.
With the async pattern, it's impossible to hide everything and have it all work automagically: you must understand a little bit of what's happening, or it will be very complicated to debug.
To me, you can hide everything only if you are really sure it will work 100% of the time without human intervention, like with autonomous Google cars.
However, it might be interesting to have an async "linter" that would list all blocking I/O calls in async coroutines, to help newcomers find this type of bug. But with the dynamic nature of Python, I don't know if it's realistic to try to implement that. To me, it would be a better answer than trying to remove all sync/async code differences.
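For the easy cases such a linter is actually straightforward: walk the AST and flag calls to known-blocking functions inside ``async def`` bodies. The sketch below is a toy; the blocking list is illustrative only, and a real tool would have to resolve imports and aliases, which is where the dynamic nature of Python bites::

    import ast

    BLOCKING = {("time", "sleep"), ("socket", "create_connection"),
                ("requests", "get"), ("requests", "post")}

    class BlockingCallFinder(ast.NodeVisitor):
        def __init__(self):
            self.in_async = 0
            self.problems = []

        def visit_AsyncFunctionDef(self, node):
            # Track whether we are inside an `async def` body.
            self.in_async += 1
            self.generic_visit(node)
            self.in_async -= 1

        def visit_Call(self, node):
            func = node.func
            if (self.in_async and isinstance(func, ast.Attribute)
                    and isinstance(func.value, ast.Name)
                    and (func.value.id, func.attr) in BLOCKING):
                self.problems.append(
                    "line %d: blocking call %s.%s() inside a coroutine"
                    % (node.lineno, func.value.id, func.attr))
            self.generic_visit(node)

    finder = BlockingCallFinder()
    finder.visit(ast.parse(
        "async def handler():\n"
        "    import time\n"
        "    time.sleep(1)\n"))
    print("\n".join(finder.problems))  # flags the time.sleep() call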
Moreover, I see the need for async libs as an extra opportunity to challenge and simplify the Python toolbox.
For now, with aiohttp, you have a unified API for HTTP in general, contrary to the sync world with requests and flask, for example. At least to me, a client and a server are only two sides of the same piece. Even more true with p2p protocols.
As discussed several times, the next level might be more code reuse like suggested by David Beazley with SansIO, split protocol and I/O handling: https://twitter.com/dabeaz/status/761599925444550656?lang=fr
Question: isn't SansIO / Cory's work just a specific instance of Bob Martin's "Clean Architecture"? It sounds familiar to me, when thinking of Brandon Rhodes's 2014 PyOhio talk, and his recast of the topic in Warsaw in 2015. It seems like it... If so, then perhaps async is just a second aspect.
Maybe the sync/async dichotomy is a concrete use case that justifies this split. Nevertheless, it wouldn't be the first case where, theoretically, it's better to split layers, but in practice it turns out to be counter-productive during implementation. Only one way to know: try to code it and see what happens. Certainly some protocols/transports should lend themselves to this split more easily than others: it would be interesting to know if somebody has already tried to support QUIC and HTTP/2 at the same time in Python.
I don't know yet if the benefit of sharing more code between implementations will outweigh the potential increase in code complexity.
The only point I'm sure of for now: I'm preparing the popcorn to watch the next episodes, curious to see what ideas/implementations will emerge next ;-) At least to me, it's more interesting than following a TV series, thank you for that :-)
Have a nice week.
Ludovic Gasc (GMLudo) http://www.gmludo.eu/
On 29 Jul 2016 20:50, "Yarko Tymciurak" <yarkot1@gmail.com> wrote:
On Friday, July 29, 2016, Yury Selivanov <yselivanov@gmail.com> wrote:
Comments inlined:
On Jul 29, 2016, at 2:20 PM, Yarko Tymciurak <yarkot1@gmail.com> wrote:
Hmm... I think we need to think about a future where, programmatically, there's little-to-no distinction between async and synchronous functions. Pushing this down deeper in the system is the way to go. For one, it will serve simple multi-core use once gilectomy is completed (it, or something effectively equivalent, will complete). For another, this is the path to reducing the functionally "useless" rewrite efforts of libraries (e.g. github.com/aio-libs), which somehow resemble all the efforts of migrating libraries from 2 to 3 (loosely). The resistance and unexpected time that the 2-to-3 migration experienced won't readily be mimicked in async tasks - too much effort to get compute- and I/O-bound benefits? Maintaining two versions of needed libraries, or jumping languages, is what will increasingly happen in the distributed (and more so IOT) world.
When and *if* gilectomy is completed (or another project to remove the GIL), we will be able to do this:
1) Run existing async/await applications as is, but instead of running a process per core, we will be able to run a single process with many threads. Likely one asyncio (or other) event loop per thread. This is very speculative, but possible in theory.
2) Run existing blocking IO applications in several threads in one process. This is something that only sounds like an easy thing to do; I suspect that a lot of code will break (or deadlock) when the GIL is removed. Even if everything works perfectly well, threads aren’t the answer to all problems — try to manage 1000s of them.
Long story short, even if we had no GIL at all, having async/await (and non-blocking IO) would make sense. And if you have async/await, with GIL or without, you will inevitably have different APIs and different IO low-level libs that drive them.
There are ways to lessen the pain. For instance, I like Cory’s approach with hyper - implement protocols separately from IO, so that it’s easy to port them to various sync and async frameworks.
Time to think about paving the way to an async-as-first-class-citizen world.
That's probably too much for this PEP, but the topic (a- prefixing) is a good canary for the bigger picture we need to start mulling over.
So in this context (and in general w/ async) asking the question "can we make it so it doesn't matter?" is a good one to always be asking - it will get us there.
Unfortunately there is no way to use the same APIs for both the async/await and synchronous worlds. At least for CPython builtin types, they have to have different names.
I’m fine to discuss the ‘a’ prefix, but I’m a bit afraid that focusing on it too much will distract us from the PEP and details of it that really matter.
Yury
To keep it simple, try thinking like this (and yes, Yury, apologies - this is now a side discussion, and not about this PEP): everything in CPython is async, and if you don't want async, you don't need to know about it; you run a single async task and don't need to know more...
Can we get there? That would be cool...
- Yarko
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 8 Aug 2016, at 11:16, Ludovic Gasc <gmludo@gmail.com> wrote:
Certainly some protocols/transports should lend themselves to this split more easily than others: it would be interesting to know if somebody has already tried to support QUIC and HTTP/2 at the same time in Python.
AFAIK they haven’t. This is partly because there’s no good QUIC implementation to bind from Python at this time. Chromium’s QUIC library requires a giant pool of custom C++ to bind it appropriately, and Go’s implementation includes a gigantic runtime and is quite large. As and when a good OSS QUIC library starts to surface, I’ll be able to answer this question more effectively. But I’m not expecting a huge issue. =) Cory
![](https://secure.gravatar.com/avatar/3041a99ff2b84bc3dc10805020d35516.jpg?s=120&d=mm&r=g)
2016-08-08 13:24 GMT+02:00 Cory Benfield <cory@lukasa.co.uk>:
On 8 Aug 2016, at 11:16, Ludovic Gasc <gmludo@gmail.com> wrote:
Certainly some protocols/transports should lend themselves to this split more easily than others: it would be interesting to know if somebody has already tried to support QUIC and HTTP/2 at the same time in Python.
AFAIK they haven’t. This is partly because there’s no good QUIC implementation to bind from Python at this time. Chromium’s QUIC library requires a giant pool of custom C++ to bind it appropriately, and Go’s implementation includes a gigantic runtime and is quite large.
I had the same conclusion. For now, I don't know which is more complex: trying to do a Python binding or reimplementing QUIC in Python ;-)
As and when a good OSS QUIC library starts to surface, I’ll be able to answer this question more effectively. But I’m not expecting a huge issue. =)
We'll see when it happens ;-) Implemented in 2012, pushed to production by Google in 2013, and 3 years later, only one Web browser and one programming language support it, to my knowledge. Does nobody use it except Google, or has everybody already migrated to Go? ;-) Or is it simply too complicated to use/debug/...?
![](https://secure.gravatar.com/avatar/214c694acb154321379cbc58dc91528c.jpg?s=120&d=mm&r=g)
On 8 Aug 2016, at 22:11, Ludovic Gasc <gmludo@gmail.com> wrote:
We'll see when it happens ;-) Implemented in 2012, pushed to production by Google in 2013, and 3 years later, only one Web browser and one programming language support it, to my knowledge. Does nobody use it except Google, or has everybody already migrated to Go? ;-) Or is it simply too complicated to use/debug/...?
Well, that’s not entirely true. Akamai have an implementation, as I understand it, though it’s based off the Chromium one. And I expect that others have stacks built in similar ways. My understanding is also that Microsoft are working on an implementation as well.

The main reason is just that it’s not ready yet. Google have been taking their time with it, and so it’s been very changeable. In particular, QUIC is moving to use TLS 1.3 as its crypto solution, which is extremely tricky for those of us in Python-land, as OpenSSL currently does not have TLS 1.3 on their roadmap for anytime in the near future. That’ll mean using a different TLS library, which adds an extra wrinkle that is quite inconvenient. So for the medium term I expect this to remain true.

FWIW, I’m following the QUIC working group, so I’ll be keeping a very close eye on this over the next few years.

Cory
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Aug 8, 2016, at 2:11 PM, Ludovic Gasc <gmludo@gmail.com> wrote:
2016-08-08 13:24 GMT+02:00 Cory Benfield <cory@lukasa.co.uk>:
On 8 Aug 2016, at 11:16, Ludovic Gasc <gmludo@gmail.com> wrote:
Certainly some protocols/transports should lend themselves to this split more easily than others: it would be interesting to know if somebody has already tried to support QUIC and HTTP/2 at the same time in Python.
AFAIK they haven’t. This is partly because there’s no good QUIC implementation to bind from Python at this time. Chromium’s QUIC library requires a giant pool of custom C++ to bind it appropriately, and Go’s implementation includes a gigantic runtime and is quite large.
I had the same conclusion. For now, I don't know which is more complex: trying to do a Python binding or reimplementing QUIC in Python ;-)
As and when a good OSS QUIC library starts to surface, I’ll be able to answer this question more effectively. But I’m not expecting a huge issue. =)
We'll see when it happens ;-) Implemented in 2012, pushed to production by Google in 2013, and 3 years later, only one Web browser and one programming language support it, to my knowledge. Does nobody use it except Google, or has everybody already migrated to Go? ;-) Or is it simply too complicated to use/debug/...?
My understanding is that QUIC was always intended to be an experimental thing mostly (although not entirely) internal to Google. The output of QUIC experimentation has been funneled into standards efforts like TLS 1.3 and HTTP/2.

Although I think the protocol as a whole may survive in some form, statements like this one: <https://docs.google.com/document/d/1g5nIXAIkN_Y-7XJW5K45IblHd_L2f5LTaDUDwvZ5L6g/edit?pref=2&pli=1>

“The QUIC crypto protocol is the part of QUIC that provides transport security to a connection. The QUIC crypto protocol is destined to die. It will be replaced by TLS 1.3 in the future, but QUIC needed a crypto protocol before TLS 1.3 was even started.”

make it difficult to get excited about implementing the protocol as it stands today.

-glyph
![](https://secure.gravatar.com/avatar/4e332fe1cf22e027fd875b467a835ae4.jpg?s=120&d=mm&r=g)
When an asynchronous generator is about to be garbage collected, it calls its cached finalizer. The assumption is that the finalizer will schedule an ``aclose()`` call with the loop that was active when the iteration started.
For instance, here is how asyncio can be modified to allow safe finalization of asynchronous generators::
    # asyncio/base_events.py

    class BaseEventLoop:

        def run_forever(self):
            ...
            old_finalizer = sys.get_asyncgen_finalizer()
            sys.set_asyncgen_finalizer(self._finalize_asyncgen)
            try:
                ...
            finally:
                sys.set_asyncgen_finalizer(old_finalizer)
                ...

        def _finalize_asyncgen(self, gen):
            self.create_task(gen.aclose())
``sys.set_asyncgen_finalizer`` is thread-specific, so several event loops running in parallel threads can use it safely.
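A contrived sketch of what that thread-specificity means, using the draft hook named in the patch above (note that this is the PEP draft's API; the hook that eventually shipped in CPython 3.6 is ``sys.set_asyncgen_hooks()``): each thread sees only its own finalizer::

    import sys
    import threading

    def loop_thread():
        name = threading.current_thread().name

        def finalizer(agen):
            # A real loop would schedule agen.aclose() on itself here.
            print("finalizing an async generator on", name)

        sys.set_asyncgen_finalizer(finalizer)             # draft API from the PEP
        assert sys.get_asyncgen_finalizer() is finalizer  # per-thread value

    threading.Thread(target=loop_thread, name="loop-1").start()
    threading.Thread(target=loop_thread, name="loop-2").start()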
When an asynchronous generator is about to be garbage collected, the event loop starts a task to execute aclose(), right? Can't a situation arise where this task is not finished at the moment the event loop is closed? Something like:

    async def gen():
        try:
            yield 1
            yield 2
            yield 3
        except GeneratorExit as exc:
            await asyncio.sleep(100)
            raise exc

    async def main():
        async for i in gen():
            if i == 1:
                break
        # main() and the event loop are about to finish here,
        # while the task to aclose will last much longer.

    if __name__ == "__main__":
        loop = asyncio.get_event_loop()
        loop.run_until_complete(main())
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
On Jul 30, 2016, at 3:32 AM, Герасимов Михаил <Gerasimov-M-N@yandex.ru> wrote:
When an asynchronous generator is about to be garbage collected, the event loop starts a task to execute aclose(), right? Can't a situation arise where this task is not finished at the moment the event loop is closed? Something like:
Yes, such situations can happen. BUT they can happen for regular coroutines too! Try running the following:

    async def coro1():
        try:
            print('try')
            await asyncio.sleep(1)
        finally:
            await asyncio.sleep(0)
            print('finally')

    async def coro2():
        await asyncio.sleep(0)

    loop = asyncio.get_event_loop()
    loop.create_task(coro1())
    loop.run_until_complete(coro2())
    loop.close()

In the above script, coro1() enters its `try` block and prints “try”; the loop ends the execution because coro2() is completed; and coro1() will never execute its `finally` block.

Long story short - you have to be extra careful when you’re closing the event loop. I believe that not executing “finally” statements when the process is about to complete will be more common for coroutines than async generators.

Also, in real life programs you don’t run and close event loops more than once per process lifetime. So it should be fine if you have a few non-exhausted generators and coroutines being GCed without proper finalization just before the process exits. And, BTW, the Python interpreter will issue a ResourceWarning that it cannot finalize coroutines and/or async generators.

Yury
![](https://secure.gravatar.com/avatar/4e332fe1cf22e027fd875b467a835ae4.jpg?s=120&d=mm&r=g)
Looks like, to be sure everything finishes correctly, we'll need to do something like this every time:

    loop.run_until_complete(main())
    pending = asyncio.Task.all_tasks()
    loop.run_until_complete(asyncio.gather(*pending))
    loop.close()

The bad thing here, I think, is that we can't separate manually created tasks (for which we might want warnings, so we can complete those tasks in other places) from the AGs' close tasks, which we can't complete anywhere other than here.

Don't you like the idea of awaiting all AGs' close tasks right before the event loop is closed? I mean, modifying the event loop this way:

    # asyncio/base_events.py

    class BaseEventLoop:

        def _finalize_asyncgen(self, gen):
            task = self.create_task(gen.aclose())
            self._close_tasks.append(task)

        def close(self):
            self.run_until_complete(
                asyncio.gather(*self._close_tasks))
            ...

In this case the user would be able to use async generators without worrying about getting a warning.
![](https://secure.gravatar.com/avatar/7775d42d960a69e98fecf270bdeb6f57.jpg?s=120&d=mm&r=g)
It may wait forever in case of a malformed generator implementation. On Sat, Jul 30, 2016 at 2:04 PM Герасимов Михаил <Gerasimov-M-N@yandex.ru> wrote:
Looks like, to be sure everything finishes correctly, we'll need to do something like this every time:
loop.run_until_complete(main())
pending = asyncio.Task.all_tasks()
loop.run_until_complete(asyncio.gather(*pending))
loop.close()
The bad thing here, I think, is that we can't separate manually created tasks (for which we might want warnings, so we can complete those tasks in other places) from the AGs' close tasks, which we can't complete anywhere other than here.
Don't you like the idea of awaiting all AGs' close tasks right before the event loop is closed? I mean, modifying the event loop this way:
# asyncio/base_events.py
class BaseEventLoop:

    def _finalize_asyncgen(self, gen):
        task = self.create_task(gen.aclose())
        self._close_tasks.append(task)

    def close(self):
        self.run_until_complete(
            asyncio.gather(*self._close_tasks))
        ...
In this case the user would be able to use async generators without worrying about getting a warning.
-- Thanks, Andrew Svetlov
![](https://secure.gravatar.com/avatar/7775d42d960a69e98fecf270bdeb6f57.jpg?s=120&d=mm&r=g)
When you have an `await task` line you probably know what task it is; you can debug and fix it. But hanging on `loop.close()` because some unknown task hangs looks very confusing. The approach gives no clue as to which task is malformed. Users will complain that `loop.close()` hangs forever without any reason or additional information. On Sat, Jul 30, 2016 at 2:15 PM Герасимов Михаил <Gerasimov-M-N@yandex.ru> wrote:
It may wait forever in case of a malformed generator implementation.
Yes, but every "await task" can wait forever in case of a malformed task implementation. We can't do anything in case the user writes:
while True: pass
--
Thanks, Andrew Svetlov
![](https://secure.gravatar.com/avatar/4e332fe1cf22e027fd875b467a835ae4.jpg?s=120&d=mm&r=g)
When you have an `await task` line you probably know what task it is; you can debug and fix it.
But hanging on `loop.close()` because some unknown task hangs looks very confusing. The approach gives no clue as to which task is malformed.
Users will complain that `loop.close()` hangs forever without any reason or additional information.
OK, regardless of placing this code inside `loop.close()`, is there any alternative way to make sure async generators inside `main()` are closed?

    loop.run_until_complete(main())
    pending = asyncio.Task.all_tasks()
    loop.run_until_complete(asyncio.gather(*pending))
    loop.close()

Currently, I just don't see any other way to use an async generator (that needs some time to be closed) without getting a warning. In case a malformed generator is unwanted behaviour, maybe it would be OK to show a warning in debug mode if some close task isn't finished after some timeout?
![](https://secure.gravatar.com/avatar/7775d42d960a69e98fecf270bdeb6f57.jpg?s=120&d=mm&r=g)
Adding the feature under the PYTHONASYNCIODEBUG flag sounds perfectly reasonable to me. But not in production mode. On Sat, Jul 30, 2016 at 6:08 PM Герасимов Михаил <Gerasimov-M-N@yandex.ru> wrote:
When you have an `await task` line you probably know what task it is; you can debug and fix it.
But hanging on `loop.close()` because some unknown task hangs looks very confusing. The approach gives no clue as to which task is malformed.
Users will complain that `loop.close()` hangs forever without any reason or additional information.
OK, regardless of placing this code inside `loop.close()`, is there any alternative way to make sure async generators inside `main()` are closed?
loop.run_until_complete(main())
pending = asyncio.Task.all_tasks()
loop.run_until_complete(asyncio.gather(*pending))
loop.close()
Currently, I just don't see any other way to use an async generator (that needs some time to be closed) without getting a warning.
In case a malformed generator is unwanted behaviour, maybe it would be OK to show a warning in debug mode if some close task isn't finished after some timeout?
-- Thanks, Andrew Svetlov
![](https://secure.gravatar.com/avatar/60a2f1855ca0d8aac3fa75a57233a3f1.jpg?s=120&d=mm&r=g)
[..]
Currently, I just don't see any other way to use an async generator (that needs some time to be closed) without getting a warning.
It will be a very rare case — you need a generator that produces an infinite series of values with a try..finally block. For those kinds of generators you will have a standard asyncio warning that a task was GCed in pending state.

We can maintain a weak set of ‘aclose’ tasks in the event loop, and make it a public and documented property. That way you’ll be able to call gather() with a timeout on that list before closing the loop.

The PEP doesn’t specify how exactly asyncio should be modified. This is something we can discuss when the PEP patch is reviewed.

Yury
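A sketch of that suggestion with invented names (``asyncgen_close_tasks`` is hypothetical; no such property exists in asyncio): the loop keeps a ``WeakSet`` of pending ``aclose()`` tasks, and shutdown code bounds the wait before closing. ``asyncio.wait()`` is used because ``gather()`` itself takes no timeout::

    import asyncio
    import weakref

    class MyEventLoop(asyncio.SelectorEventLoop):

        def __init__(self):
            super().__init__()
            # Hypothetical public, documented property; a WeakSet so
            # the property itself never keeps finished tasks alive.
            self.asyncgen_close_tasks = weakref.WeakSet()

        def _finalize_asyncgen(self, gen):
            task = self.create_task(gen.aclose())
            self.asyncgen_close_tasks.add(task)

    loop = MyEventLoop()
    # ... run the application ...
    pending = list(loop.asyncgen_close_tasks)
    if pending:
        # Wait for outstanding aclose() tasks, but never forever.
        loop.run_until_complete(asyncio.wait(pending, timeout=5))
    loop.close()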
![](https://secure.gravatar.com/avatar/4e332fe1cf22e027fd875b467a835ae4.jpg?s=120&d=mm&r=g)
It will be a very rare case — you need a generator that produces an infinite series of values with a try..finally block.
If I'm not mistaken, it'll be any generator with a `break` before it yielded its last value (and with a try...finally block, which is usually a context manager).

    async def gen():
        async with cm_long_aexit():
            yield 1
            yield 2
            yield 3

    async for i in gen():
        if i == 2:
            break
We can maintain a weak set of ‘aclose’ tasks in the event loop, and make it a public and documented property. That way you’ll be able to call gather() with a timeout on that list before closing the loop.
It solves the issue.
participants (11)

- Andrew Svetlov
- Antoine Pitrou
- Brett Cannon
- Cory Benfield
- David Beazley
- Glyph Lefkowitz
- Guido van Rossum
- Ludovic Gasc
- Yarko Tymciurak
- Yury Selivanov
- Герасимов Михаил