[Cython] cython.parallel tasks, single, master, critical, barriers

mark florisson markflorisson88 at gmail.com
Wed Oct 12 16:07:21 CEST 2011


On 12 October 2011 09:36, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 10/12/2011 09:55 AM, Robert Bradshaw wrote:
>>
>> On Sun, Oct 9, 2011 at 5:57 AM, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no>  wrote:
>>>
>>> On 10/09/2011 02:18 PM, Dag Sverre Seljebotn wrote:
>>>>
>>>> On 10/09/2011 02:11 PM, mark florisson wrote:
>>>>>
>>>>> Hey,
>>>>>
>>>>> So far people have been enthusiastic about the cython.parallel
>>>>> features, so I think we should introduce some new ones.
>>
>> Excellent. I think this is going to become a killer feature like
>> buffer support.
>>
>>>>> I propose the following,
>>>>
>>>> Great!!
>>>>
>>>> I only have time for very brief feedback now; perhaps more will
>>>> follow.
>>>>
>>>>> assume parallel has been imported from cython:
>>>>>
>>>>> with parallel.master():
>>>>>     this is executed in the master thread in a parallel (non-prange)
>>>>>     section
>>>>>
>>>>> with parallel.single():
>>>>>     same as master, except any thread may do the execution
>>>>>
>>>>> An optional keyword argument 'nowait' specifies whether there will be a
>>>>> barrier at the end. The default is to wait.
>>>
>>> I like
>>>
>>> if parallel.is_master():
>>>    ...
>>> explicit_barrier_somehow() # see below
>>>
>>> better as a Pythonization. One could easily support using is_master in
>>> other contexts as well, simply by assigning a status flag in the master
>>> block.
>>
>> +1, the if statement feels a lot more natural.
>>
>>> Using an if-test flows much better with Python, I feel, but that naturally
>>> leads to making the barrier explicit. But I like the barrier always being
>>> explicit, rather than having it as a predicate on all the different
>>> constructs like in OpenMP....
>>>
>>> I'm less sure about single, since making it a function indicates one
>>> could
>>> use it in other contexts and the whole thing becomes too magic (since
>>> it's
>>> tied to the position of invocation). I'm tempted to suggest
>>>
>>> for _ in prange(1):
>>>    ...
>>>
>>> as our syntax for single.
>
> Just to be clear: My point was that the above implements single behaviour
> even now, without any extra effort.

Right, I got that. In the same way you could use

for _ in prange(0): pass

to get a barrier. I'm just saying that it looks pretty weird.

>>
>> The idea here is that you want a block of code executed once,
>> presumably by the first thread that gets here? I think this could also
>> be handled by an if statement, perhaps "if parallel.first()" or
>> something like that. Is there anything special about this construct
>> that couldn't simply be done by flushing/checking a variable?
>
> Good point. I think there's a problem with OpenMP that it has too many
> primitives for similar things.

Definitely.

> I'm -1 on single -- either using a for loop or flag+flush is more to type,
> but more readable to people who don't know cython.parallel (look: Python
> even makes "self." explicit -- the bias in language design is clearly on
> readability rather than writability).
>
> I thought of "if is_first()" as well, but my problem is again that it binds
> to the location of the call.
>
> if foo:
>    if parallel.is_first():
>        ...
> else:
>    if parallel.is_first():
>        ...
>
> can not be refactored to:
>
> if parallel.is_first():
>    if foo:
>        ...
>    else:
>        ...
>
> which I think is highly confusing for people who didn't write the code and
> don't know the details of cython.parallel. (Unlike is_master(), which works
> the same either way).
>
> I think we should aim for something that's as easy to read as possible for
> Python users with no cython.parallel knowledge.

That's a good point. I suppose we don't really need both single and master,
so just master ("is_master") could be sufficient there.
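
For concreteness, with the is_master()/barrier() spellings from this
thread (none of which exist yet; combine_partial_sums() is just a
made-up nogil helper), that would look roughly like:

# assume parallel has been imported from cython, as above
cdef double total = 0

with nogil, parallel.parallel():
    # ... each thread does its share of the work ...
    if parallel.is_master():             # proposed: true only in the master thread
        total = combine_partial_sums()   # made-up helper, just for the sketch
    parallel.barrier()                   # proposed: explicit barrier, everyone waits
    # every thread can now safely read 'total'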

>>
>>>>> with parallel.task():
>>>>>     create a task to be executed by some thread in the team;
>>>>>     once a thread takes up the task it shall only be executed by that
>>>>>     thread and no other thread (so the task will be tied to the thread)
>>>>>
>>>>>     C variables will be firstprivate
>>>>>     Python objects will be shared
>>>>>
>>>>> parallel.taskwait() # wait on any direct descendent tasks to finish
>>>>
>>>> Regarding tasks, I think this is mapping OpenMP too close to Python.
>>>> Closures are excellent for the notion of a task, so I think something
>>>> based on the futures API would work better. I realize that makes the
>>>> mapping to OpenMP and implementation a bit more difficult, but I think
>>>> it is worth it in the long run.
>>
>> It's almost as if you're reading my thoughts. There are much more
>> natural task APIs, e.g. futures or the way the Python
>> threading/multiprocessing modules do things.
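
For reference, the futures style being alluded to would look roughly
like the stdlib concurrent.futures module (new in Python 3.2, also
available as the 'futures' backport) -- just to show the shape of the
API, not a proposed Cython spelling:

from concurrent.futures import ThreadPoolExecutor

def work(i):
    return i * i

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(work, i) for i in range(10)]
    results = [f.result() for f in futures]   # blocking on result() plays the role of taskwait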
>>
>>>>> with parallel.critical():
>>>>>     this section of code is mutually exclusive with other critical
>>>>>     sections; an optional keyword argument 'name' specifies a name for
>>>>>     the critical section, which means all sections with that name will
>>>>>     exclude each other, but not critical sections with different names
>>>>>
>>>>> Note: all threads that encounter the section will execute it, just
>>>>> not at the same time
>>>
>>> Yes, this works well as a with-statement...
>>>
>>> ...except that it is slightly magic in that it binds to the call position
>>> (unlike
>>> anything in Python). I.e. this would be more "correct", or at least
>>> Pythonic:
>>>
>>> with parallel.critical(__file__, __line__):
>>>    ...
>
> Mark: I stand corrected on this point. +1 on your critical proposal.
>
>> This feels a lot like a lock, which of course fits well with the with
>> statement.
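
To make the lock analogy concrete, a usage sketch of the proposed
(hypothetical) named critical section -- printf is only there to have
something nogil-safe inside the block:

from libc.stdio cimport printf

with nogil, parallel.parallel():
    # ... per-thread work ...
    with parallel.critical(name="report"):   # proposed: sections sharing a name exclude each other
        printf("only one thread at a time gets here\n")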
>>
>>>>> with parallel.barrier():
>>>>>     all threads wait until everyone has reached the barrier
>>>>>     either no one or everyone should encounter the barrier
>>>>>     shared variables are flushed
>>>
>>> I have problems with requiring a no-op with block...
>>>
>>> I'd much rather write
>>>
>>> parallel.barrier()
>>>
>>> However, that ties a function call to the place of invocation, and
>>> suggests
>>> that one could do
>>>
>>> if rand() > .5:
>>>    barrier()
>>> else:
>>>    i += 3
>>>    barrier()
>>>
>>> and have the same barrier in each case. Again,
>>>
>>> barrier(__file__, __line__)
>>>
>>> gets us purity at the cost of practicality. Another way is the pthreads
>>> approach (although one may have to use pthreads rather than OpenMP to get
>>> it, unless there are named barriers?):
>>>
>>> barrier_a = parallel.barrier()
>>> barrier_b = parallel.barrier()
>>> with parallel:
>>>    barrier_a.wait()
>>>    if rand() > .5:
>>>        barrier_b.wait()
>>>    else:
>>>        i += 3
>>>        barrier_b.wait()
>>>
>>>
>>> I'm really not sure here.
>>
>> I agree, the barrier doesn't seem like it belongs in a context. For
>> example, it's ambiguous whether the block is supposed to precede or
>> succeed the barrier. I like the named barrier idea, but if that's not
>> feasible we could perhaps use control flow to disallow conditionally
>> calling barriers (or that every path calls the barrier (an equal
>> number of times?)).
>
> It is always an option to go beyond OpenMP. Pthread barriers are a lot more
> powerful in this way, and with pthread and Windows covered I think we should
> be good...
>
> IIUC, you can't have different paths calling the barrier the same number of
> times; it's merely
>
> #pragma omp barrier
>
> and a separate barrier statement gets another counter. Which is why I think
> it is not powerful enough and we should use pthreads.

I don't think we should quite jump to that conclusion. Indeed, OpenMP
barriers may not do what we want, but I think you could implement
barriers yourself (I haven't looked at an implementation, but I think
a condition lock + OpenMP flush can do what you need).
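
Roughly this shape -- a counting barrier built on a condition lock,
sketched here with plain Python threading just to show the idea; the
real implementation would live in the generated C, using an OpenMP
critical section plus flushes rather than a Python lock:

import threading

class CountingBarrier:
    """Reusable barrier: wait() blocks until nthreads threads have called it."""
    def __init__(self, nthreads):
        self.nthreads = nthreads
        self.count = 0
        self.generation = 0
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            gen = self.generation
            self.count += 1
            if self.count == self.nthreads:
                # last thread to arrive: reset and wake everyone for the next round
                self.count = 0
                self.generation += 1
                self.cond.notify_all()
            else:
                while gen == self.generation:
                    self.cond.wait()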

Implementing all this in pthreads wouldn't be trivial, and it would
also be hard to do portably on non-POSIX systems, considering that
most Cython developers don't know or care much about Windows, for
instance.

>> +1. I like the idea of providing more parallelism constructs, but
>> rather than risk fixating on OpenMP's model, perhaps we should look at
>> the problem we're trying to solve (e.g., what can't one do well now)
>> and create (or more likely borrow) the right Pythonic API to do it.
>
> Also, quick and flexible message-passing between threads/processes through
> channels is becoming an increasingly popular concept. Go even has a separate
> syntax for channel communication, and zeromq is becoming popular for
> distributed work.
>
> There is a problem Cython may need to solve here, since one currently has to
> use very low-level C to do it quickly (either zeromq or pthreads in most
> cases -- I guess an OpenMP critical section would help in implementing a
> queue though).
>
> I wouldn't resist a builtin "channel" type in Cython (since we don't have
> full templating/generics, it would be the only way of sending typed data
> conveniently?).

I'm not sure if we should introduce more syntax, but what about
reusing arrays or memoryview slices? If you assign to elements or
subslices, you send messages; if you read them but don't have the
data, you receive them (so the program which has the data will send
it, etc.).

But really, I think this is a different beast altogether. If you
want to do this, then you must be sure to cover all aspects; otherwise
people will just use the respective libraries. I think if you really
want this kind of thing on a cluster, you'd be using Fortran anyway
(maybe with co-arrays), and if you need to do distributed computing
you'd be using zeromq directly.

> I ultimately feel things like that are more important than 100% coverage of
> the OpenMP standard. Of course, OpenMP is a lot lower-hanging fruit.

Yeah, I never wanted full OpenMP coverage; it's just the first
(easiest) thing that comes to mind, it's easy to implement, and if
you're familiar with OpenMP it makes sense.

It would also be easier to support orphaned worksharing in the future,
if we wanted. But I think that might just be even more confusing for
people.

> Dag Sverre

