[Cython] prange CEP updated

mark florisson markflorisson88 at gmail.com
Tue Apr 26 16:25:38 CEST 2011

On 21 April 2011 20:13, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> On 04/21/2011 10:37 AM, Robert Bradshaw wrote:
>> On Mon, Apr 18, 2011 at 7:51 AM, mark florisson <markflorisson88 at gmail.com> wrote:
>>> On 18 April 2011 16:41, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>>>> Excellent! Sounds great! (As I won't have my laptop for some days I can't
>>>> have a look yet, but I will later.)
>>>> You're right about (the current) buffers and the GIL. A testcase explicitly
>>>> for them would be good.
>>>> Firstprivate etc.: I think it'd be nice myself, but it is probably better
>>>> to take a break from it at this point so that we can think more about that
>>>> and not do anything rash; perhaps open up a specific thread on them and ask
>>>> for more general input. Perhaps you want to take a break or task-switch to
>>>> something else (fused types?) until I can get around to reviewing and
>>>> merging what you have so far? You'll know best what works for you though.
>>>> If you decide to implement explicit threadprivate variables because you've
>>>> got the flow, I certainly won't object myself.
>>>  Ok, cool, I'll move on :) I already included a test with a prange and
>>> a numpy buffer with indexing.
>> Wow, you're just plowing away at this. Very cool.
>> +1 to disallowing nested prange, that seems to get really messy with
>> little benefit.
>> In terms of the CEP, I'm still unconvinced that firstprivate is not
>> safe to infer, but let's leave the initial values undefined rather than
>> specifying them to be NaNs (we can do that as an implementation if you
>> want), which will give us flexibility to change later once we've had a
>> chance to play around with it.
> I don't see any technical issues with inferring firstprivate, the question
> is whether we want to. I suggest not inferring it in order to make this
> safer: One should be able to just try to change a loop from "range" to
> "prange", and either a) have things fail very hard, or b) just work
> correctly and be able to trust the results.
> Note that when I suggest using NaN, it is as initial values for EACH
> ITERATION, not per-thread initialization. It is not about "firstprivate" or
> not, but about disabling thread-private variables entirely in favor of
> "per-iteration" variables.
> I believe that by talking about "readonly" and "per-iteration" variables,
> rather than "thread-shared" and "thread-private" variables, this can be used
> much more safely and with virtually no knowledge of the details of
> threading. Again, what's in my mind are scientific programmers with (too)
> little training.
> In the end it's a matter of taste and what is most convenient to more users.
> But I believe the case of needing real thread-private variables that
> preserve per-thread values across iterations (and thus can also possibly
> benefit from firstprivate) is rare enough that an explicit
> declaration is OK, in particular when it buys us so much in safety in the
> common case.
> To be very precise,
> cdef double x, z
> for i in prange(n):
>    x = f(x)
>    z = f(i)
>    ...
> goes to
> cdef double x, z
> for i in prange(n):
>    x = z = nan
>    x = f(x)
>    z = f(i)
>    ...
> and we leave it to the C compiler to (trivially) optimize away "z = nan".
> And, yes, it is a stopgap solution until we've got control flow analysis so
> that we can outright disallow such uses of x (without threadprivate
> declaration, which also gives firstprivate behaviour).
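
The per-iteration reset described above can be mimicked in plain Python. This is
only a sketch: f and run_loop are hypothetical names, and an ordinary sequential
loop stands in for prange. It does not show what Cython actually generates.

```python
import math


def f(v):
    # Hypothetical stand-in for the f() used in the example above.
    return v + 1.0


def run_loop(n):
    """Sequential stand-in for the prange loop with a per-iteration NaN reset."""
    results = []
    x = z = 0.0  # values from before the loop; never visible inside the body
    for i in range(n):
        x = z = math.nan  # inserted reset: every iteration starts from NaN
        x = f(x)          # x is read before it is assigned, so NaN propagates
        z = f(i)          # z is assigned before use, so its reset is dead code
        results.append((x, z))
    return results
```

With this reset, a variable that is read before being assigned in an iteration
yields NaN in every iteration instead of a value that depends on how iterations
were scheduled, while a variable that is always assigned first is unaffected.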
Ah, I see, sure, that sounds sensible. I'm currently working on fused
types, so when I finish that up I'll return to that.
>> The "cdef threadlocal(int) foo" declaration syntax feels odd to me...
>> We also probably want some way of explicitly marking a variable as
>> shared and still be able to assign to/flush/sync it. Perhaps the
>> parallel context could be used for these declarations, i.e.
>>     with parallel(threadlocal=a, shared=(b,c)):
>>         ...
>> which would be considered an "expert" usecase.
> I'm not set on the syntax for threadlocal variables; although your proposal
> feels funny/very unpythonic, almost like a C macro. For some inspiration,
> here's the Python solution (with no obvious place to put the type):
> import threading
> mydata = threading.local()
> mydata.myvar = ... # value is threadprivate
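
For reference, a small runnable sketch of how threading.local behaves. The names
worker and seen are made up for this illustration; only threading.local and
threading.Thread come from the standard library.

```python
import threading

mydata = threading.local()
seen = {}  # shared dict; each worker writes only under its own key


def worker(tag):
    mydata.myvar = tag        # stored separately for the current thread
    seen[tag] = mydata.myvar  # each thread reads back only its own value


threads = [threading.Thread(target=worker, args=(k,)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never assigned myvar, so the attribute does not exist here.
main_has_myvar = hasattr(mydata, "myvar")
```

Each thread sees its own myvar, and nothing leaks into the main thread, which is
the "threadprivate" behaviour referred to above.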
>> For all the discussion of threadsavailable/threadid, the most common
>> usecase I see is for allocating a large shared buffer and partitioning
>> it. This seems better handled by allocating separate thread-local
>> buffers, no? I still like the context idea, but everything in a
>> parallel block before and after the loop(s) also seems like a natural
>> place to put any setup/teardown code (though the context has the
>> advantage that __exit__ is always called, even if exceptions are
>> raised, which makes cleanup a lot easier to handle).
> I'd *really* like to have try/finally available in cython.parallel blocks for
> this, although I realize that may have to wait for a while. A big part of
> our discussions at the workshop was about how to handle exceptions; I guess
> there'll be a "phase 2" of this where break/continue/raise are dealt with.
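
The cleanup guarantee mentioned for contexts can be illustrated in plain Python.
This is a sketch under assumptions: ThreadBuffers and demo are hypothetical
names, and nothing here is part of cython.parallel.

```python
class ThreadBuffers:
    """Hypothetical context manager owning one scratch buffer per worker."""

    def __init__(self, nthreads, size):
        self.nthreads = nthreads
        self.size = size
        self.buffers = None
        self.released = False

    def __enter__(self):
        # Setup: allocate a separate buffer for each worker.
        self.buffers = [bytearray(self.size) for _ in range(self.nthreads)]
        return self.buffers

    def __exit__(self, exc_type, exc, tb):
        # Teardown: runs whether or not the block raised.
        self.buffers = None
        self.released = True
        return False  # do not swallow the exception


def demo():
    ctx = ThreadBuffers(nthreads=4, size=16)
    try:
        with ctx as bufs:
            bufs[0][0] = 1
            raise ValueError("failure inside the block")
    except ValueError:
        pass
    return ctx.released
```

Here demo() reports that the buffers were released even though the block raised,
which is the property that makes a context attractive for setup/teardown code.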

I'll leave that until I finish fused types and the typed memory views.
Before I start on that, I'd first review the "with gil" block and
ensure the tests pass in all Python versions. Perhaps that should
be merged first, before I pull it into the parallel branch? Otherwise
you're kind of forced to review both branches.

> Dag Sverre