[Cython] prange CEP updated

mark florisson markflorisson88 at gmail.com
Mon Apr 18 16:05:44 CEST 2011


On 18 April 2011 16:01, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> (apologies for top post)

No problem, it means I have to scroll less :)

> This all seems to scream 'disallow' to me, in particular since some OpenMP
> implementations may not support it etc.
>
> At any rate I feel 'parallel/parallel/prange/prange' is going too far; so
> next step could be to only allow 'parallel/prange/parallel/prange'.
>
> But really, my feeling is that if you really do need this then you can
> always write a separate function for the inner loop (I honestly can't think
> of a use case anyway...). So I'd really drop it; at least until the rest of
> the GSoC project is completed :)
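
To illustrate the "separate function for the inner loop" suggestion, here is a minimal sketch. It is not taken from the CEP: the names inner_sum and nested_sum are hypothetical, and prange is aliased to the built-in range so the example runs as plain Python. In Cython it would come from cython.parallel, and whether the called function must be nogil is left aside here.

```python
# Plain-Python stand-in so the sketch runs without compiling (assumption);
# in Cython this would be `from cython.parallel import prange`.
prange = range


def inner_sum(i, n):
    # The inner loop lives in its own function, so 'total' is a
    # reduction variable of this loop only.
    total = 0
    for j in prange(n):
        total += i * j
    return total


def nested_sum(m, n):
    # The outer loop has a single reduction variable and contains
    # no lexically nested prange.
    total = 0
    for i in prange(m):
        total += inner_sum(i, n)
    return total
```

Each function then has exactly one loop whose sharing attributes need to be inferred, so the question of propagating a reduction outwards does not arise.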

Ok, sure, I'll disallow it. Then the user won't be able to make
mistakes and I don't have to detect the case and issue a warning for
inner reductions or lastprivates :).
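
A rough sketch of what such a "disallow" check could look like, written against Python's standard ast module rather than the Cython compiler's own tree (which is what a real implementation would use). The function name find_nested_pranges is hypothetical, and the check only recognises loops written literally as `for ... in prange(...)`.

```python
import ast


def find_nested_pranges(source):
    """Return line numbers of prange loops nested inside another prange loop."""
    nested = []

    def is_prange_loop(node):
        # Matches `for <target> in prange(...):` with a bare `prange` name.
        return (isinstance(node, ast.For)
                and isinstance(node.iter, ast.Call)
                and isinstance(node.iter.func, ast.Name)
                and node.iter.func.id == "prange")

    def visit(node, inside_prange):
        for child in ast.iter_child_nodes(node):
            if is_prange_loop(child):
                if inside_prange:
                    nested.append(child.lineno)
                visit(child, True)
            else:
                visit(child, inside_prange)

    visit(ast.parse(source), False)
    return nested


example = """
for i in prange(m):
    for j in prange(n):
        sum += i * j
"""
print(find_nested_pranges(example))
```

A compiler pass along these lines could raise an error for every reported line instead of trying to infer sharing attributes for the inner loop.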

> DS
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>
> mark florisson <markflorisson88 at gmail.com> wrote:
>>
>> On 16 April 2011 18:42, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>> > (Moving discussion from http://markflorisson.wordpress.com/, where Mark
>> > said:)
>>
>> Ok, sure, it was just an issue I was wondering about at that moment,
>> but it's a tricky issue, so thanks.
>>
>> > """
>> > Started a new branch https://github.com/markflorisson88/cython/tree/openmp .
>> >
>> > Now the question is whether sharing attributes should be propagated
>> > outwards. e.g. if you do
>> >
>> > for i in prange(m):
>> >    for j in prange(n):
>> >        sum += i * j
>> >
>> > then ‘sum’ is a reduction for the inner parallel loop, but not for the
>> > outer one. So the user would currently have to rewrite this to
>> >
>> > for i in prange(m):
>> >    for j in prange(n):
>> >        sum += i * j
>> >    sum += 0
>> >
>> > which seems a bit silly. Of course, we could just disable nested
>> > parallelism, or tell the users to use a prange and a ‘for from’ in such
>> > cases.
>> > """
>> >
>> > Dag: Interesting. The first one is definitely the behaviour we want, as
>> > long as it doesn't cause unintended consequences.
>> >
>> > I don't really think it will -- the important thing is that the order
>> > of loop iteration evaluation must be unimportant. And that is still true
>> > (for the outer loop, as well as for the inner) in your first example.
>> >
>> > Question: When you have nested pranges, what will happen is that two
>> > nested OpenMP parallel blocks are used, right? And do you know if there
>> > is complete freedom/"reentrancy" in that variables that are
>> > thread-private in an outer parallel block can be shared in an inner one,
>> > and vice versa?
>>
>> An implementation may or may not support it, and if it is supported
>> the behaviour can be configured through omp_set_nested(). So we should
>> consider the case where it is supported and enabled.
>>
>> If you have a lastprivate or reduction, then after the loop these are
>> (reduced and) assigned to the original variable. So if that happens
>> inside a parallel construct which does not declare the variable
>> private to the construct, you actually have a race. So e.g. the nested
>> prange currently races in the outer parallel range.
>>
>> > If so I'd think that this algorithm should work and feel natural:
>> >
>> >  - In each prange, for the purposes of variable private/shared/reduction
>> >    inference, consider all internal "prange" just as if they had been
>> >    "range"; no special treatment.
>> >
>> >  - Recurse to children pranges.
>>
>> Right, that is most natural. Algorithmically, reductions and
>> lastprivates (as those can have races if placed in inner parallel
>> constructs) propagate outwards towards the outermost parallel block,
>> or up to the first parallel with block, or up to the first construct
>> that already determined the sharing attribute.
>>
>> e.g.
>>
>> with parallel:
>>     with parallel:
>>         for i in prange(n):
>>             for j in prange(n):
>>                 sum += i * j
>>     # sum is well-defined here
>> # sum is undefined here
>>
>> Here 'sum' is a reduction for the two innermost loops. 'sum' is not
>> private for the inner parallel with block, as a prange in a parallel
>> with block is a worksharing loop that binds to that parallel with
>> block. However, the outermost parallel with block declares sum (and i
>> and j) private, so after that block all those variables become
>> undefined.
>>
>> However, in the outermost parallel with block, sum will have to be
>> initialized to 0 before anything else, or be declared firstprivate,
>> otherwise 'sum' is undefined to begin with. Do you think declaring it
>> firstprivate would be the way to go, or should we make it private and
>> issue a warning or perhaps even an error?
>>
>> > DS
>> >
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>
>

