[Python-ideas] More general "for" loop handling

Steven D'Aprano steve at pearwood.info
Thu Apr 30 13:36:45 CEST 2015


On Thu, Apr 30, 2015 at 11:48:21AM +0200, Todd wrote:
> Looking at pep 492, it seems to me the handling of "for" loops has use
> outside of just asyncio.  The primary use-case I can think of is
> multiprocessing and multithreading.
> 
> For example, you could create a multiprocessing pool, and let the pool
> handle the items in a "for" loop, like so:
> 
>     from multiprocessing import Pool
> 
>     mypool = Pool(10, maxtasksperchild=2)
> 
>     mypool for item in items:
>         do_something_here
>         do_something_else
>         do_yet_another_thing

That's a very pretty piece of pseudo-code (actually, I lie, I don't 
think it is pretty at all, but for the sake of the argument let's 
pretend it is) but what does it do? How does it do it?

Let's be concrete:

    mypool = Pool(10, maxtasksperchild=2)
    items = range(1000)
    mypool for item in items:
        print(item)
        if item == 30:
            break
        x = item + 1

    print(x)


What gets printed?

A parallel version of map makes sense, because the semantics of map are 
well defined: given a function f and a sequence [a, b, c, ...] it 
creates a new sequence [f(a), f(b), f(c), ...]. The assumption is that f 
is a pure function, free of side effects (if it isn't, you're going 
to have a bad time). The specific order in which a, b, c etc. are 
processed doesn't matter. If it does matter, then map is the wrong way 
to process it.
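
As a rough illustration of why a parallel map is well defined, here is a 
minimal sketch. It assumes a made-up pure function f and uses a thread 
pool from concurrent.futures rather than the multiprocessing Pool in the 
quoted example, only to keep the snippet self-contained. The calls may 
run in any order, yet the results come back in input order:

```python
from concurrent.futures import ThreadPoolExecutor


def f(n):
    # Hypothetical pure function: the result depends only on the
    # argument, and calling it has no side effects.
    return n * n


items = [1, 2, 3, 4, 5]

# Ordinary sequential map.
sequential = list(map(f, items))

# Executor.map may run the calls concurrently, in any order, but it
# yields the results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(f, items))

print(sequential == parallel)
```

Because f is pure, nothing observable depends on which call happened to 
run first, so both versions build the same sequence.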

But a parallel version of for does not make sense to me. (I must admit, 
I'm having trouble understanding what the "async for" will do too.) By 
definition, a for-loop is supposed to be sequential. Loop the first 
time, *then* the second time, *then* the third time. There's no 
presumption of the body of the for-block being side-effect free, and 
you're certainly not free to perform the loops in some other order.
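
For comparison, the ordinary sequential version of the concrete example 
above has an unambiguous answer. The sketch below wraps it in a helper 
(run_loop is a name invented here, and appending to a list stands in for 
print) so the result can be inspected. Both the break and the final 
value of x depend on the iterations running strictly in order:

```python
def run_loop(items):
    # Hypothetical helper wrapping the plain for-loop from the example.
    printed = []
    x = None
    for item in items:
        printed.append(item)  # stands in for print(item)
        if item == 30:
            break             # stops every later iteration
        x = item + 1          # carried over from one iteration to the next
    return printed, x


printed, x = run_loop(range(1000))
print(printed[-1], x)
```

Run sequentially, this records 0 through 30 and leaves x equal to 30, 
set on the iteration for item 29. It is not obvious what either value 
should be if a pool ran the iterations in some other order.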


> Of course this sort of thing is possible with iterators and maps today, but
> I think a lot of the same advantages that apply to asyncio also apply to
> these sorts of cases.  So I think that, rather than having a special
> keyword just for asyncio, I think it would be better to have a more
> flexible approach.  Perhaps something like a "__for__" magic method that
> lets a class implement "for" loop handling, along with the corresponding
> changes in how the language processes the "for" loop.

"async for" hasn't proven itself yet, and you are already looking to 
generalise it? Shouldn't it prove itself as not a mistake first?


-- 
Steve

