Adding a Par construct to Python?

Sun May 17 12:53:02 EDT 2009

In article <0220260f$0$20645$c3e8da3 at news.astraweb.com>,
 Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> wrote:

> But reduce()? I can't see how you can parallelize reduce(). By its 
> nature, it has to run sequentially: it can't operate on the nth item 
> until it is operated on the (n-1)th item.

Well, if you're willing to impose the additional constraint that f() must 
be associative, then you could load the items into a tree, and work your 
way up from the bottom of the tree, applying f() pairwise to the left and 
right child of each node, propagating upward.

It would take k1 * O(n) to create the (unsorted) tree, and if all the pairs 
in each layer really could be done in parallel, k2 * O(lg n) to propagate 
the intermediate values.  As long as k2 is large compared to k1, you win.

Of course, if the items are already in some random-access container (such 
as a list), you don't even need to do the first step, but in the general 
case of generating the elements on the fly with an iterable, you do.  Even 
with an iterable, you could start processing the first elements while 
you're still generating the rest of them, but that gets a lot more 
complicated and assuming k2 >> k1, of limited value.

If k2 is about the same as k1, then the whole thing is pointless.

But, this would be something to put in a library function, or maybe a 
special-purpose Python derivative, such as numpy.  Not in the core language.