[Cython] prange CEP updated

mark florisson markflorisson88 at gmail.com
Mon Apr 18 13:06:19 CEST 2011


On 16 April 2011 18:42, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> (Moving discussion from http://markflorisson.wordpress.com/, where Mark
> said:)

Ok, sure, it was just an issue I was wondering about at that moment,
but it's a tricky issue, so thanks.

> """
> Started a new branch https://github.com/markflorisson88/cython/tree/openmp .
>
> Now the question is whether sharing attributes should be propagated
> outwards. e.g. if you do
>
> for i in prange(m):
>    for j in prange(n):
>        sum += i * j
>
> then ‘sum’ is a reduction for the inner parallel loop, but not for the outer
> one. So the user would currently have to rewrite this to
>
> for i in prange(m):
>    for j in prange(n):
>        sum += i * j
>    sum += 0
>
> which seems a bit silly. Of course, we could just disable nested
> parallelism, or tell the users to use a prange and a ‘for from’ in such
> cases.
> """
>
> Dag: Interesting. The first one is definitely the behaviour we want, as long
> as it doesn't cause unintended consequences.
>
> I don't really think it will -- the important thing is that the order
> of loop iteration evaluation must be unimportant. And that is still true
> (for the outer loop, as well as for the inner) in your first example.
>
> Question: When you have nested pranges, what will happen is that two nested
> OpenMP parallel blocks are used, right? And do you know if there is complete
> freedom/"reentrancy" in that variables that are thread-private in an outer
> parallel block can be shared in an inner one, and vice versa?

An OpenMP implementation may or may not support nested parallelism,
and if it is supported the behaviour can be configured through
omp_set_nested(). So we should consider the case where it is supported
and enabled.

If you have a lastprivate or reduction, then after the loop these are
(reduced and) assigned to the original variable. If that happens
inside a parallel construct which does not declare the variable
private to the construct, you actually have a race. So e.g. the nested
prange currently races in the outer parallel range.
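
To make the race concrete, here is a minimal pure-Python model of it.
This is neither Cython nor OpenMP code, and the function names
(inner_reduction, racy_outer, reduced_outer) are made up for
illustration. It assumes one particular interleaving, fixed by hand so
the lost update is reproducible: every outer iteration reads the shared
'sum' before any of them writes its inner reduction back.

```python
def inner_reduction(i, n):
    """Result of the inner prange for a fixed outer index i: sum of i * j."""
    return sum(i * j for j in range(n))


def racy_outer(m, n):
    """Model 'sum' as shared in the outer loop (hypothetical interleaving)."""
    shared_sum = 0
    # Step 1: every outer iteration snapshots the shared value first.
    snapshots = [shared_sum for _ in range(m)]
    # Step 2: each writes snapshot + its inner reduction; the last write wins.
    for i in range(m):
        shared_sum = snapshots[i] + inner_reduction(i, n)
    return shared_sum


def reduced_outer(m, n):
    """Model 'sum' as a reduction in the outer loop as well."""
    # Each outer iteration accumulates into its own private partial sum ...
    partials = [inner_reduction(i, n) for i in range(m)]
    # ... and the partial sums are combined once, after the loop.
    return sum(partials)


expected = sum(i * j for i in range(3) for j in range(4))
print(racy_outer(3, 4), reduced_outer(3, 4), expected)  # prints: 12 18 18
```

With real threads the racy variant may or may not lose updates on a
given run; the model only shows that a wrong result is possible.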

> If so I'd think that this algorithm should work and feel natural:
>
>  - In each prange, for the purposes of variable private/shared/reduction
> inference, consider all internal "prange" just as if they had been "range";
> no special treatment.
>
>  - Recurse to children pranges.

Right, that is most natural. Algorithmically, reductions and
lastprivates (as those can have races if placed in inner parallel
constructs) propagate outwards towards the outermost parallel block,
or up to the first parallel with block, or up to the first construct
that already determined the sharing attribute.

e.g.

with parallel:
    with parallel:
        for i in prange(n):
            for j in prange(n):
                sum += i * j
    # sum is well-defined here
# sum is undefined here

Here 'sum' is a reduction for the two innermost loops. 'sum' is not
private for the inner parallel with block, as a prange in a parallel
with block is a worksharing loop that binds to that parallel with
block. However, the outermost parallel with block declares sum (and i
and j) private, so after that block all those variables become
undefined.
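
The outward propagation rule can be sketched in plain Python. This is
only a model under stated assumptions, not the code in the openmp
branch: the Construct class and propagate_reduction function are
hypothetical, and the sketch covers only where propagation stops, not
how the with blocks themselves classify the variable.

```python
class Construct:
    """A parallel construct: kind is 'prange' or 'parallel' (a with block)."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent
        self.sharing = {}  # variable name -> sharing attribute


def propagate_reduction(name, innermost):
    """Mark `name` as a reduction on `innermost` and propagate it outwards.

    Propagation stops at the first 'parallel' with block, at the first
    construct that already has a sharing attribute for `name`, or once
    the outermost construct has been passed.
    Returns the construct where it stopped, or None if it ran off the top.
    """
    node = innermost
    while node is not None:
        if node.kind == "parallel" or name in node.sharing:
            return node
        node.sharing[name] = "reduction"
        node = node.parent
    return None


# The example above: two with-parallel blocks around two nested pranges.
outer_block = Construct("parallel")
inner_block = Construct("parallel", parent=outer_block)
loop_i = Construct("prange", parent=inner_block)
loop_j = Construct("prange", parent=loop_i)

stopped_at = propagate_reduction("sum", loop_j)
print(loop_j.sharing, loop_i.sharing, stopped_at is inner_block)
```

In this model both loops end up with 'sum' as a reduction and the walk
stops at the inner with block, matching the description above.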

However, in the outermost parallel with block, sum will have to be
initialized to 0 before anything else, or be declared firstprivate,
otherwise 'sum' is undefined to begin with. Do you think declaring it
firstprivate would be the way to go, or should we make it private and
issue a warning or perhaps even an error?
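
The difference between the two options can be modelled in a few lines
of Python. Again this is only an illustration: UNDEFINED, enter_parallel
and accumulate are hypothetical names, and raising an error on use of an
undefined copy is just one of the behaviours being asked about.

```python
UNDEFINED = object()  # stands in for an uninitialized private copy


def enter_parallel(original, attribute):
    """Return the value a thread sees for its own copy on block entry."""
    if attribute == "firstprivate":
        return original   # copy is initialized from the original variable
    if attribute == "private":
        return UNDEFINED  # copy exists but holds no defined value
    raise ValueError(attribute)


def accumulate(start, n):
    """Model the nested loops adding i * j onto the thread's copy."""
    if start is UNDEFINED:
        raise ValueError("sum used before initialization")
    return start + sum(i * j for i in range(n) for j in range(n))


print(accumulate(enter_parallel(5, "firstprivate"), 3))  # prints: 14
try:
    accumulate(enter_parallel(5, "private"), 3)
except ValueError as exc:
    print("private:", exc)
```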

> DS
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>

