[Python-ideas] More general "for" loop handling

Andrew Barnert abarnert at yahoo.com
Fri May 1 13:19:16 CEST 2015


On Apr 30, 2015, at 17:35, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> On Thu, Apr 30, 2015 at 07:12:11PM +0200, Todd wrote:
>> On Thu, Apr 30, 2015 at 1:36 PM, Steven D'Aprano <steve at pearwood.info>
>> wrote:
> 
>>> A parallel version of map makes sense, because the semantics of map are
>>> well defined: given a function f and a sequence [a, b, c, ...] it
>>> creates a new sequence [f(a), f(b), f(c), ...]. The assumption is that f
>>> is a pure-function which is side-effect free (if it isn't, you're going
>>> to have a bad time). The specific order in which a, b, c etc. are
>>> processed doesn't matter. If it does matter, then map is the wrong way
>>> to process it.
>> multiprocessing.Pool.map guarantees ordering.  It is
>> multiprocessing.Pool.imap_unordered that doesn't.
> 
> I don't think it guarantees ordering in the sense I'm referring to. It 
> guarantees that the returned result will be [f(a), f(b), f(c), ...] in 
> that order, but not that f(a) will be calculated before f(b), which is 
> calculated before f(c), ... and so on. That's the point of parallelism: 
> if f(a) takes a long time to complete, another worker may have completed 
> f(b) in the meantime.
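
For what it's worth, that distinction is easy to see with a quick sketch
(the square() helper and the random sleeps are just there to simulate
uneven work per item; nothing here is proposed API):

    import random
    import time
    from multiprocessing import Pool

    def square(x):
        time.sleep(random.random() / 10)   # simulate uneven work per item
        return x * x

    if __name__ == '__main__':
        with Pool(4) as pool:
            # map returns results in argument order, no matter which
            # worker happened to finish first:
            print(pool.map(square, range(10)))
            # imap_unordered yields them in completion order instead:
            print(list(pool.imap_unordered(square, range(10))))
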
> 
> The point I am making is that map() doesn't have any connotations of the 
> order of execution, whereas for loops have a very strong connotation of 
> executing the block in a specific sequence. People don't tend to use map 
> with a function with side-effects:
> 
>    map(lambda i: print(i) or i, range(100))
> 
> will return [0, 1, 2, ..., 99] but it may not print 0 1 2 3 ... in that 
> order. But with a for-loop, it would be quite surprising if
> 
>   for i in range(100):
>       print(i)
> 
> printed the values out of order. In my opinion, sticking "mypool" in 
> front of the "for i" doesn't change the fact that adding parallelism to 
> a for loop would be surprising and hard to reason about.
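
The existing Pool.map already shows that surprise, without any new syntax:
the returned list is ordered, but the side effects can interleave across
workers, while a plain for loop always prints in sequence. (show() below is
just an illustrative helper, not anything from the proposal.)

    from multiprocessing import Pool

    def show(i):
        print(i)           # side effect: may interleave across processes
        return i

    if __name__ == '__main__':
        with Pool(4) as pool:
            result = pool.map(show, range(100))
        assert result == list(range(100))   # result order still guaranteed

        for i in range(100):
            print(i)       # the loop body runs strictly in sequence
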
> 
> If you still wish to argue for this, one thing which may help your case 
> is if you can identify other programming languages that have already 
> done something similar.

The obvious thing to look at here seems to be OpenMP's parallel for. I haven't used it in a long time, but IIRC, in the C bindings, you use it something like:

    #pragma omp parallel for
    for (int i=0; i!=100; ++i) {
        lots_of_work(i);
    }

... and it turns it into something like:

    for (int i=0; i!=100; ++i) {
        queue_put(current_team_queue, processed loop body thingy);
    }
    queue_wait(current_team_queue, 100);
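
In Python terms, that rewrite is roughly what you'd get by handing each
iteration to an executor and then waiting on the whole batch (this is only
a sketch of the idea using concurrent.futures, not a suggestion for how the
proposed syntax would actually be implemented):

    from concurrent.futures import ThreadPoolExecutor, wait

    def lots_of_work(i):
        ...   # stand-in for the original loop body

    with ThreadPoolExecutor() as executor:
        # queue one task per iteration, then block until all 100 finish
        futures = [executor.submit(lots_of_work, i) for i in range(100)]
        wait(futures)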

> 
> -- 
> Steve

