[Python-ideas] More general "for" loop handling
Andrew Barnert
abarnert at yahoo.com
Fri May 1 13:19:16 CEST 2015
On Apr 30, 2015, at 17:35, Steven D'Aprano <steve at pearwood.info> wrote:
>
>> On Thu, Apr 30, 2015 at 07:12:11PM +0200, Todd wrote:
>> On Thu, Apr 30, 2015 at 1:36 PM, Steven D'Aprano <steve at pearwood.info>
>> wrote:
>
>>> A parallel version of map makes sense, because the semantics of map are
>>> well defined: given a function f and a sequence [a, b, c, ...] it
>>> creates a new sequence [f(a), f(b), f(c), ...]. The assumption is that f
>>> is a pure-function which is side-effect free (if it isn't, you're going
>>> to have a bad time). The specific order in which a, b, c etc. are
>>> processed doesn't matter. If it does matter, then map is the wrong way
>>> to process it.
>> multiprocessing.Pool.map guarantees ordering. It is
>> multiprocessing.Pool.imap_unordered that doesn't.
>
> I don't think it guarantees ordering in the sense I'm referring to. It
> guarantees that the returned result will be [f(a), f(b), f(c), ...] in
> that order, but not that f(a) will be calculated before f(b), which is
> calculated before f(c), ... and so on. That's the point of parallelism:
> if f(a) takes a long time to complete, another worker may have completed
> f(b) in the meantime.
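A minimal sketch of that distinction, for illustration only. It assumes concurrent.futures.ThreadPoolExecutor as a stand-in for multiprocessing.Pool (threads keep the example self-contained), and the function f, the artificial sleep, and the finished list are all made up for the demo. The results come back in input order even though the first call is the last one to complete:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

finished = []            # order in which the calls actually complete
lock = threading.Lock()  # protects `finished` across worker threads

def f(x):
    # Hypothetical workload: make the first item slow so that
    # later items finish before it does.
    time.sleep(0.5 if x == 0 else 0.0)
    with lock:
        finished.append(x)
    return x * x

with ThreadPoolExecutor(max_workers=3) as pool:
    # Executor.map yields results in the order of the inputs,
    # regardless of which call finishes first.
    results = list(pool.map(f, [0, 1, 2]))

print(results)   # results follow the input order
print(finished)  # completion order; item 0 is expected to be last here
```

So "ordered" here describes the shape of the output, not the sequence in which side effects happen.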
>
> The point I am making is that map() doesn't have any connotations of the
> order of execution, whereas for loops have a very strong connotation of
> executing the block in a specific sequence. People don't tend to use map
> with a function with side-effects:
>
>     map(lambda i: print(i) or i, range(100))
>
> will return [0, 1, 2, ..., 99] but it may not print 0 1 2 3 ... in that
> order. But with a for-loop, it would be quite surprising if
>
>     for i in range(100):
>         print(i)
>
> printed the values out of order. In my opinion, sticking "mypool" in
> front of the "for i" doesn't change the fact that adding parallelism to
> a for loop would be surprising and hard to reason about.
>
> If you still wish to argue for this, one thing which may help your case
> is if you can identify other programming languages that have already
> done something similar.
The obvious thing to look at here seems to be OpenMP's parallel for. I haven't used it in a long time, but IIRC, in the C bindings, you use it something like:
    #pragma omp parallel for
    for (int i=0; i<100; ++i) {
        lots_of_work(i);
    }
... and it turns it into something like:
    for (int i=0; i!=100; ++i) {
        queue_put(current_team_queue, processed loop body thingy);
    }
    queue_wait(current_team_queue, 100);
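
For comparison, a rough Python analogue of that transformation, as a sketch rather than a proposal for how any new syntax would work. It assumes concurrent.futures.ThreadPoolExecutor plays the role of the "team", and parallel_for and lots_of_work are hypothetical names invented for the example:

```python
from concurrent.futures import ThreadPoolExecutor, wait

def lots_of_work(i):
    # Hypothetical stand-in for the real loop body.
    return i * i

def parallel_for(pool, body, iterable):
    # Queue one task per iteration, like the queue_put loop above...
    futures = [pool.submit(body, i) for i in iterable]
    # ...then block until every queued iteration has finished,
    # like the final queue_wait.
    wait(futures)
    return futures

with ThreadPoolExecutor(max_workers=4) as team:
    futures = parallel_for(team, lots_of_work, range(100))
    # Calling result() re-raises any exception from an iteration.
    results = [fut.result() for fut in futures]

print(len(results))
```

Nothing here says anything about the order in which the iterations run, only that all of them are done by the time parallel_for returns.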
>
> --
> Steve