[Python-ideas] More general "for" loop handling

Guido van Rossum guido at python.org
Fri May 1 05:29:22 CEST 2015


On Thu, Apr 30, 2015 at 6:07 PM, Yury Selivanov <yselivanov.ml at gmail.com>
wrote:

> On 2015-04-30 9:02 PM, Ethan Furman wrote:
>
>> On 04/30, Yury Selivanov wrote:
>>
>>> On 2015-04-30 8:35 PM, Steven D'Aprano wrote:
>>>
>>>> I don't think it guarantees ordering in the sense I'm referring to. It
>>>> guarantees that the returned result will be [f(a), f(b), f(c), ...] in
>>>> that order, but not that f(a) will be calculated before f(b), which is
>>>> calculated before f(c), ... and so on. That's the point of parallelism:
>>>> if f(a) takes a long time to complete, another worker may have completed
>>>> f(b) in the meantime.
>>>>
>>> This is an *excellent* point.
>>>
>> So, PEP 492 asynch for also guarantees that the loop runs in order, one at
>> a time, with one loop finishing before the next one starts?
>>
>> *sigh*
>>
>> How disappointing.
>>
>>
>
> No.  Nothing prevents you from scheduling asynchronous
> parallel computation, or prefetching more data.  Since
> __anext__ is an awaitable you can do that.
>

That's not Ethan's point. The 'async for' statement indeed is a sequential
loop: e.g. if you write

  async for rec in db_cursor:
      print(rec)

you are guaranteed that the records are printed in the order in which they
are produced by the database cursor. There is no implicit parallelism in
the execution of the loop bodies. Of course you can introduce parallelism,
but you have to be explicit about it, e.g. by calling some async function
for each record *without* awaiting the result, collecting the awaitables
in a separate list, and then using the gather() operation from the asyncio
package:

  async def process_record(rec):
      print(rec)

  fs = []
  async for rec in db_cursor:
      fs.append(process_record(rec))  # coroutine created, not awaited yet
  await asyncio.gather(*fs)

This may print the records in arbitrary order. Note that, unlike with threads,
you don't need locks, since there is no worry about parallel access to
sys.stdout by print(). The print() function does not guarantee atomicity
when it writes to sys.stdout, and in a threaded version of the above code
you might occasionally see two records followed by two \n characters,
because threads can be arbitrarily interleaved. Task switching between
coroutines only happens at await (or yield [from] :-) and at the await
points specified by PEP 492 in the 'async for' and 'async with' statements.
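
For concreteness, here is a minimal self-contained sketch of the difference.
FakeCursor, the integer records, and the `done` list are made up for
illustration (they stand in for a real async database cursor), and
asyncio.run() assumes Python 3.7 or later. The sequential version awaits each
record inside the 'async for' body; the concurrent version only collects the
coroutines and hands them to gather().

```python
import asyncio


class FakeCursor:
    """Hypothetical stand-in for an async database cursor."""

    def __init__(self, records):
        self._records = iter(records)

    def __aiter__(self):
        return self

    async def __anext__(self):
        await asyncio.sleep(0)  # pretend to fetch the next row
        try:
            return next(self._records)
        except StopIteration:
            raise StopAsyncIteration


async def process_record(rec, done):
    # Yield to the event loop a different number of times per record,
    # so records need not finish in the order they were started.
    for _ in range(3 - rec):
        await asyncio.sleep(0)
    done.append(rec)
    return rec * 10


async def sequential():
    done = []
    async for rec in FakeCursor([0, 1, 2]):
        # Each body finishes before the next record is fetched.
        await process_record(rec, done)
    return done


async def concurrent():
    done = []
    fs = []
    async for rec in FakeCursor([0, 1, 2]):
        fs.append(process_record(rec, done))  # not awaited here
    # gather() runs the coroutines concurrently; its result list
    # follows the order of its arguments, not the completion order.
    results = await asyncio.gather(*fs)
    return done, results


seq_order = asyncio.run(sequential())
conc_order, conc_results = asyncio.run(concurrent())
print("sequential completion order:", seq_order)
print("concurrent completion order:", conc_order)
print("gather() results:", conc_results)
```

The completion order in the concurrent version depends on how the event loop
interleaves the coroutines at their await points, while the list returned by
gather() always matches the order of its arguments.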

-- 
--Guido van Rossum (python.org/~guido)