[Cython] cython.parallel tasks, single, master, critical, barriers

Robert Bradshaw robertwb at math.washington.edu
Wed Oct 12 11:08:42 CEST 2011


On Wed, Oct 12, 2011 at 1:36 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 10/12/2011 09:55 AM, Robert Bradshaw wrote:
>>> I'm less sure about single, since making it a function indicates one could
>>> use it in other contexts and the whole thing becomes too magic (since it's
>>> tied to the position of invocation). I'm tempted to suggest
>>>
>>> for _ in prange(1):
>>>    ...
>>>
>>> as our syntax for single.
>
> Just to be clear: My point was that the above implements single behaviour
> even now, without any extra effort.
>
>>
>> The idea here is that you want a block of code executed once,
>> presumably by the first thread that gets here? I think this could also
>> be handled by an if statement, perhaps "if parallel.first()" or
>> something like that. Is there anything special about this construct
>> that couldn't simply be done by flushing/checking a variable?
>
> Good point. I think there's a problem with OpenMP that it has too many
> primitives for similar things.
>
> I'm -1 on single -- either using a for loop or flag+flush is more to type,
> but more readable to people who don't know cython.parallel (look: Python
> even makes "self." explicit -- the bias in language design is clearly on
> readability rather than writability).
>
> I thought of "if is_first()" as well, but my problem is again that it binds
> to the location of the call.
>
> if foo:
>    if parallel.is_first():
>        ...
> else:
>    if parallel.is_first():
>        ...
>
> can not be refactored to:
>
> if parallel.is_first():
>    if foo:
>        ...
>    else:
>        ...
>
> which I think is highly confusing for people who didn't write the code and
> don't know the details of cython.parallel. (Unlike is_master(), which works
> the same either way).
>
> I think we should aim for something that's as easy to read as possible for
> Python users with no cython.parallel knowledge.

Exactly. This is what's so beautiful about prange.
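
To make the flag-based alternative to single concrete, here is a minimal
sketch in plain Python threading (not cython.parallel, and it holds the GIL).
The names Once, first and run_single are hypothetical, made up for this
illustration. The point is that the "first" test is tied to an explicit
object rather than to the call site, so the refactoring Dag describes stays
valid.

```python
import threading


class Once:
    """Hypothetical helper: lets exactly one caller through."""

    def __init__(self):
        self._lock = threading.Lock()
        self._claimed = False

    def first(self):
        # The lock makes the check-and-set atomic, so only one thread
        # ever sees True, no matter where first() is called from.
        with self._lock:
            if self._claimed:
                return False
            self._claimed = True
            return True


def run_single(num_threads=4):
    once = Once()
    ran = []  # ids of the threads that executed the "single" block

    def worker(tid):
        if once.first():
            ran.append(tid)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ran
```

Calling run_single(8) should return a list with exactly one thread id; which
id it is depends on scheduling.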

>>>>> with parallel.barrier():
>>>>> all threads wait until everyone has reached the barrier
>>>>> either no one or everyone should encounter the barrier
>>>>> shared variables are flushed
>>>
>>> I have problems with requiring a noop with block...
>>>
>>> I'd much rather write
>>>
>>> parallel.barrier()
>>>
>>> However, that ties a function call to the place of invocation, and suggests
>>> that one could do
>>>
>>> if rand() > .5:
>>>    barrier()
>>> else:
>>>    i += 3
>>>    barrier()
>>>
>>> and have the same barrier in each case. Again,
>>>
>>> barrier(__file__, __line__)
>>>
>>> gets us purity at the cost of practicality. Another way is the pthreads
>>> approach (although one may have to use pthreads rather than OpenMP to get it,
>>> unless there are named barriers?):
>>>
>>> barrier_a = parallel.barrier()
>>> barrier_b = parallel.barrier()
>>> with parallel:
>>>    barrier_a.wait()
>>>    if rand() > .5:
>>>        barrier_b.wait()
>>>    else:
>>>        i += 3
>>>        barrier_b.wait()
>>>
>>>
>>> I'm really not sure here.
>>
>> I agree, the barrier doesn't seem like it belongs in a context. For
>> example, it's ambiguous whether the block is supposed to precede or
>> follow the barrier. I like the named barrier idea, but if that's not
>> feasible we could perhaps use control flow analysis to disallow conditionally
>> calling barriers (or require that every path calls the barrier (an equal
>> number of times?)).
>
> It is always an option to go beyond OpenMP. Pthread barriers are a lot more
> powerful in this way, and with pthread and Windows covered I think we should
> be good...
>
> IIUC, you can't have different paths calling the barrier the same number of
> times, it's merely
>
> #pragma omp barrier
>
> and a separate barrier statement gets another counter.

Makes sense, but this greatly restricts where we could use the OpenMP version.
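
For comparison, the named-barrier style can be sketched with Python's
threading.Barrier, which behaves like a pthread barrier object. This is
plain Python, not a proposed cython.parallel API; the function name
run_with_named_barriers and the up-front branch choice are assumptions made
for the illustration. Both branches wait on the same barrier_b object, so it
does not matter which call site a thread reaches.

```python
import random
import threading


def run_with_named_barriers(num_threads=4, seed=0):
    # Each barrier is an explicit object shared by all threads.
    barrier_a = threading.Barrier(num_threads)
    barrier_b = threading.Barrier(num_threads)

    # Decide the branch per thread up front so a run is reproducible.
    rng = random.Random(seed)
    take_first = [rng.random() > 0.5 for _ in range(num_threads)]

    counters = [0] * num_threads  # one slot per thread, no sharing
    passed = []
    lock = threading.Lock()

    def worker(tid):
        barrier_a.wait()
        if take_first[tid]:
            barrier_b.wait()
        else:
            counters[tid] += 3
            barrier_b.wait()  # same barrier object, different call site
        with lock:
            passed.append(tid)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(passed), counters
```

All threads get past barrier_b regardless of the branch taken, which is
exactly what a position-bound barrier statement cannot express.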

> Which is why I think
> it is not powerful enough and we should use pthreads.
>
>> +1. I like the idea of providing more parallelism constructs, but
>> rather than risk fixating on OpenMP's model, perhaps we should look at
>> the problem we're trying to solve (e.g., what can't one do well now)
>> and create (or more likely borrow) the right Pythonic API to do it.
>
> Also, quick and flexible message-passing between threads/processes through
> channels is becoming an increasingly popular concept. Go even has a separate
> syntax for channel communication, and zeromq is becoming popular for
> distributed work.
>
> This is a problem Cython may need to solve here, since one currently has to
> use very low-level C to do it quickly (either zeromq or pthreads in most
> cases -- I guess, an OpenMP critical section would help in implementing a
> queue though).
>
> I wouldn't resist a builtin "channel" type in Cython (since we don't have
> full templating/generics, it would be the only way of sending typed data
> conveniently?).

zeromq seems to be a nice level of abstraction--we could probably get
far with a zeromq "overlay" module that didn't require the GIL. Or is
the C API easy enough to use if we could provide convenient mechanisms
to initialize the tasks/threads? I think perhaps the communication
model could be solved by a library more easily than the threading
model.
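
As a rough picture of the channel idea, here is a sketch using the standard
library's queue.Queue as the channel between two threads. It is only an
analogy: queue.Queue needs the GIL and carries untyped Python objects,
whereas the discussion above is about typed, GIL-free communication. The
names producer, consumer and run_channel are hypothetical.

```python
import queue
import threading


def producer(channel, items):
    for item in items:
        channel.put(item)   # blocks while the channel is full
    channel.put(None)       # sentinel: no more data


def consumer(channel, results):
    while True:
        item = channel.get()  # blocks until something arrives
        if item is None:
            break
        results.append(item * item)


def run_channel(items):
    channel = queue.Queue(maxsize=2)  # small bound forces hand-off
    results = []
    threads = [
        threading.Thread(target=producer, args=(channel, items)),
        threading.Thread(target=consumer, args=(channel, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a single producer and a single consumer the results come back in the
order the items were sent.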

> I ultimately feel things like that are more important than 100% coverage of
> the OpenMP standard. Of course, OpenMP is a lot lower-hanging fruit.

+1 Prange handles the (coarse-grained) SIMD case nicely, and a
task/futures model based on closures would I think flesh this out to
the next level of generality (and complexity).
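
To give a feel for what a closure-based task/futures model looks like, here
is a sketch using Python's concurrent.futures. It is not a proposal for the
Cython API, only the existing Python analogue; make_task and run_tasks are
hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def make_task(offset):
    # The closure captures `offset`; each submitted call supplies x.
    def task(x):
        return x * x + offset
    return task


def run_tasks(values, offset=1, workers=2):
    task = make_task(offset)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # submit() returns a future immediately; result() waits for it.
        futures = [pool.submit(task, v) for v in values]
        return [f.result() for f in futures]
```

Results are collected in submission order, independent of the order in which
the tasks actually finish.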

- Robert
