[Python-Dev] PEP 492: What is the real goal?

Yury Selivanov yselivanov.ml at gmail.com
Wed Apr 29 20:06:23 CEST 2015


Hi Jim,

On 2015-04-29 1:43 PM, Jim J. Jewett wrote:
> On Tue Apr 28 23:49:56 CEST 2015, Guido van Rossum quoted PEP 492:
>
>> Rationale and Goals
>> ===================
>>
>> Current Python supports implementing coroutines via generators (PEP
>> 342), further enhanced by the ``yield from`` syntax introduced in PEP
>> 380. This approach has a number of shortcomings:
>>
>> * it is easy to confuse coroutines with regular generators, since they
>>    share the same syntax; async libraries often attempt to alleviate
>>    this by using decorators (e.g. ``@asyncio.coroutine`` [1]_);
> So?  PEP 492 never says what coroutines *are* in a way that explains
> why it matters that they are different from generators.
>
> Do you really mean "coroutines that can be suspended while they wait
> for something slow"?
>
> As best I can guess, the difference seems to be that a "normal"
> generator is using yield primarily to say:
>
>      "I'm not done; I have more values when you want them",
>
> but an asynchronous (PEP492) coroutine is primarily saying:
>
>      "This might take a while, go ahead and do something else meanwhile."

Correct.
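
A rough sketch of the contrast (the 'db' object and its
'fetch_one' coroutine here are hypothetical):

    def count_up(n):
        # regular generator: 'yield' hands values to the consumer
        for i in range(n):
            yield i

    async def get_user(db, user_id):
        # PEP 492 coroutine: 'await' suspends this task until the
        # slow call completes, so the event loop can run other
        # tasks in the meantime
        return await db.fetch_one('SELECT ...', user_id)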

>
>> As shown later in this proposal, the new ``async
>> with`` statement lets Python programs perform asynchronous calls when
>> entering and exiting a runtime context, and the new ``async for``
>> statement makes it possible to perform asynchronous calls in iterators.
> Does it really permit *making* them, or does it just signal that you
> will be waiting for them to finish processing anyhow, and it doesn't
> need to be a busy-wait?

It does.

>
> As nearly as I can tell, "async with" doesn't start processing the
> managed block until the "asynchronous" call finishes its work -- the
> only point of the async is to signal a scheduler that the task is
> blocked.

Right.

>
> Similarly, "async for" is still linearized, with each step waiting
> until the previous "asynchronous" step was not merely launched, but
> fully processed.  If anything, it *prevents* within-task parallelism.

It enables cooperative parallelism between tasks: within a single
task the steps are indeed sequential, but while a step is suspended
at an 'await', the event loop can run other tasks.
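
A minimal sketch with asyncio: two tasks share one thread, and
whenever one of them is suspended at an 'await', the loop runs
the other.

    import asyncio

    async def ticker(name):
        for i in range(3):
            # suspension point: the other task runs while we sleep
            await asyncio.sleep(0.1)
            print(name, i)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.gather(ticker('a'), ticker('b')))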

>
>> It uses the ``yield from`` implementation with an extra step of
>> validating its argument.  ``await`` only accepts an *awaitable*, which
>> can be one of:
> What justifies this limitation?

We want to prevent people from passing regular generators and
random objects to 'await', because doing so is a bug.

>
> Is there anything wrong awaiting something that eventually uses
> "return" instead of "yield", if the "this might take a while" signal
> is still true?

If it's defined with 'async def', then sure, you can use it with 'await'.
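
For example (a sketch; note that a coroutine is not required to
await anything):

    async def compute():
        return 42                 # no awaits at all -- still awaitable

    async def main():
        result = await compute()  # fine, result == 42
        # await (x for x in ())  # TypeError: not an awaitable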

> Is the problem just that the current implementation
> might not take proper advantage of task-switching?
>
>>    Objects with ``__await__`` method are called *Future-like* objects in
>>    the rest of this PEP.
>>
>>    Also, please note that ``__aiter__`` method (see its definition
>>    below) cannot be used for this purpose.  It is a different protocol,
>>    and would be like using ``__iter__`` instead of ``__call__`` for
>>    regular callables.
>>
>>    It is a ``TypeError`` if ``__await__`` returns anything but an
>>    iterator.
> What would be wrong if a class just did __await__ = __anext__  ?
> If the problem is that the result of __await__ should be iterable,
> then why isn't __await__ = __aiter__ OK?

For coroutines in PEP 492:

__await__ = __anext__ is the same as __call__ = __next__
__await__ = __aiter__ is the same as __call__ = __iter__
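
A Future-like object, by contrast, implements __await__ directly
and returns an iterator for the coroutine machinery to drive.
A sketch:

    class FutureLike:
        def __await__(self):
            # __await__ must return an iterator; a generator works:
            # one bare yield to suspend, then a result
            yield
            return 42

    async def main():
        value = await FutureLike()   # value == 42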

>
>> ``await`` keyword is defined differently from ``yield`` and ``yield
>> from``.  The main difference is that *await expressions* do not require
>> parentheses around them most of the time.
> Does that mean
>
> "The ``await`` keyword has slightly higher precedence than ``yield``,
> so that fewer expressions require parentheses"?
>
>>      class AsyncContextManager:
>>          async def __aenter__(self):
>>              await log('entering context')
> Other than the arbitrary "keyword must be there" limitations imposed
> by this PEP, how is that different from:
>
>       class AsyncContextManager:
>           async def __aenter__(self):
>               log('entering context')

This is OK. The point is that you *can* use 'await log(...)' in
__aenter__.  If you don't need awaits in __aenter__, you can still
use them in __aexit__. If you don't need them there either,
then just define a regular context manager.

>
> or even:
>
>       class AsyncContextManager:
>           def __aenter__(self):
>               log('entering context')
>
> Will anything different happen when calling __aenter__ or log?
> Is it that log itself now has more freedom to let other tasks run
> in the middle?

__aenter__ must return an awaitable.
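
With 'async def' that happens automatically: calling the method
produces a coroutine object, which is awaitable.  A fuller sketch
('log' is assumed to be a coroutine):

    class AsyncContextManager:
        async def __aenter__(self):
            await log('entering context')   # assumed coroutine
            return self

        async def __aexit__(self, exc_type, exc, tb):
            await log('exiting context')
            return False                    # don't suppress exceptions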

>
>
>> It is an error to pass a regular context manager without ``__aenter__``
>> and ``__aexit__`` methods to ``async with``.  It is a ``SyntaxError``
>> to use ``async with`` outside of a coroutine.
> Why?  Does that just mean they won't take advantage of the freedom
> you offered them?

Not sure I understand the question.

It doesn't make any sense to use 'async with' outside of a
coroutine.  The interpreter won't know what to do with it:
you need an event loop for that.
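
Concretely, with asyncio the coroutine is driven by the loop --
a sketch, reusing the AsyncContextManager above:

    import asyncio

    async def main():
        async with AsyncContextManager():
            ...   # runs only after __aenter__'s awaitable completes

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())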

> Or are you concerned that they are more likely to
> cooperate badly with the scheduler in practice?
>
>> It is a ``TypeError`` to pass a regular iterable without ``__aiter__``
>> method to ``async for``.  It is a ``SyntaxError`` to use ``async for``
>> outside of a coroutine.
> The same questions about why -- what is the harm?
>
>> The following code illustrates new asynchronous iteration protocol::
>>
>>      class Cursor:
>>          def __init__(self):
>>              self.buffer = collections.deque()
>>
>>          def _prefetch(self):
>>              ...
>>
>>          async def __aiter__(self):
>>              return self
>>
>>          async def __anext__(self):
>>              if not self.buffer:
>>                  self.buffer = await self._prefetch()
>>                  if not self.buffer:
>>                      raise StopAsyncIteration
>>              return self.buffer.popleft()
>>
>> then the ``Cursor`` class can be used as follows::
>>
>>      async for row in Cursor():
>>          print(row)
> Again, I don't see what this buys you except that a scheduler has
> been signaled that it is OK to pre-empt between rows.  That is worth
> signaling, but I don't see why a regular iterator should be
> forbidden.

It's not about signaling. It's about allowing cooperative
scheduling of long-running operations.
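
Concretely: whenever 'await self._prefetch()' in the Cursor above
is waiting on I/O, other coroutines scheduled on the same loop get
to run.  A sketch of a consumer:

    async def consume():
        async for row in Cursor():
            # each time __anext__ awaits _prefetch, the event
            # loop is free to run other tasks
            print(row)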

>
>> For debugging this kind of mistake there is a special debug mode in
>> asyncio, in which ``@coroutine`` decorator wraps all functions with a
>> special object with a destructor logging a warning.
> ...
>> The only problem is how to enable these debug capabilities.  Since
>> debug facilities should be a no-op in production mode, ``@coroutine``
>> decorator makes the decision of whether to wrap or not to wrap based on
>> an OS environment variable ``PYTHONASYNCIODEBUG``.
> So the decision is made at compile-time, and can't be turned on later?
> Then what is wrong with just offering an alternative @coroutine that
> can be used to override the builtin?
>
> Or why not just rely on set_coroutine_wrapper entirely, and simply
> set it to None (so no wasted wrappings) by default?

It is set to None by default. Will clarify that in the PEP.
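
For reference, a sketch of how the hook proposed in the PEP could
be used (the wrapper body here is hypothetical):

    import sys

    def debug_wrapper(coro):
        # hypothetical: record a creation traceback, warn if the
        # coroutine is garbage-collected without being awaited, etc.
        print('coroutine created:', coro)
        return coro

    sys.set_coroutine_wrapper(debug_wrapper)   # enable wrapping
    sys.set_coroutine_wrapper(None)            # default: no wrapping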

Thanks,
Yury

