[Cython] cython.parallel tasks, single, master, critical, barriers

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Fri Oct 21 23:27:50 CEST 2011

On 10/21/2011 09:31 PM, mark florisson wrote:
> On 21 October 2011 18:43, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no>  wrote:
>> On 10/20/2011 02:51 PM, mark florisson wrote:
>>> On 20 October 2011 10:35, Dag Sverre Seljebotn
>>> <d.s.seljebotn at astro.uio.no>    wrote:
>>>> On 10/20/2011 11:13 AM, mark florisson wrote:
>>>>> On 20 October 2011 09:42, Dag Sverre Seljebotn
>>>>> <d.s.seljebotn at astro.uio.no>      wrote:
>>>>>> Meta: I've been meaning to respond to this thread, but can't find the
>>>>>> time.
>>>>>> What's the time-frame for implementing this? If it's hypothetical at
>>>>>> the
>>>>>> moment and just is a question of getting things spec-ed, one could
>>>>>> perhaps
>>>>>> look at discussing it at the next Cython workshop, or perhaps a Skype
>>>>>> call
>>>>>> with the three of us at some point...
>>>>> For me this is just about getting this spec-ed, so that when someone
>>>>> finds the time, we don't need to discuss it for weeks first. And the
>>>>> implementor won't necessarily have to support everything at once, e.g.
>>>>> just critical sections or barriers alone would be nice.
>>>>> Is there any plan for a new workshop then? Because if it's in two
>>>>> years I think we could be more time-efficient :)
>>>> At least in William's grant there are plans for 2-3 Cython workshops, so
>>>> hopefully there's funding for one next year if we want to. We should ask
>>>> him
>>>> before planning anything though.
>>>>>> Regarding the tasks: One of my biggest problems with Python is the lack
>>>>>> of
>>>>>> an elegant syntax for anonymous functions. But since Python has that
>>>>>> problem, I feel it is not necessarily something we should fix (by using
>>>>>> the
>>>>>> with statements to create tasks). Sometimes Pythonic-ness is more
>>>>>> important
>>>>>> than elegance (for Cython).
>>>>> I agree it's not something we should fix, I just think tasks are most
>>>>> useful in inline blocks and not in separate functions or closures.
>>>>> Although it could certainly work, I think it restricts more, leads to
>>>>> more verbose code and possibly questionable semantics, and on top of
>>>>> that it would be a pain to implement (although that should not be used
>>>>> as a persuasive argument). I'm not saying there is no elegant way
>>>>> other than with blocks, I'm just saying that I think closures are not
>>>>> the right thing for it.
>>>>>> In general I'm happy as long as there's a chance of getting things to
>>>>>> work
>>>>>> in pure Python mode as well (with serial execution). So if, e.g., with
>>>>>> statements creating tasks have the same effect when running the same
>>>>>> code
>>>>>> (serially) in pure Python, I'm less opposed (didn't look at it in
>>>>>> detail).
>>>>> Yes, it would have the same effect. The thing with tasks (and OpenMP
>>>>> constructs in general) is that usually if your compiler ignores all
>>>>> your pragmas, your code just runs serially in the same way. The same
>>>>> would be true for the tasks in with blocks.
>>>> Short note: I like the vision of Konrad Hinsen:
>>>> http://www.euroscipy.org/talk/2011
>>>> The core idea is that the "task-ness" of a block of code is orthogonal to
>>>> the place you actually write it. That is, a block of code may often
>>>> either
>>>> be fit for execution as a task, or not, depending on how heavy it is (=
>>>> values of arguments it takes in, not its contents).
>>>> He introduces the "async" expression to drive this point home.
>>>> I think "with task" is fine if used in this way, if you simply call a
>>>> function (which itself doesn't know whether it is a task or not). But
>>>> once
>>>> you start to implement an entire function within the with-statement
>>>> there's
>>>> a code-smell.
>>> Definitely, so you'd just call the function from the task.
>>>> Anyway, it's growing on me. But I think his "async" expression is more
>>>> Pythonic in the way that it forces you away from making your code smell.
>>>> We could simply have
>>>> async(func)(arg, arg2, somekwarg=4)
>>> That looks good. The question is, does this constitute an expression
>>> or a statement? If it's an expression, then you expect a meaningful
>>> return value, which means you're going to have to wait for the task to
>>> complete. That would be fine if you submit multiple tasks in one
>>> expression, from the slides:
>>>      max(async expr1, async expr2)
>>> or even
>>>      [async expr for ... in ...]
>>> I must say, it does look really elegant and it doesn't leave the user
>>> to question when the task is executed (and if you need a taskwait
>>> directive to wait for your variables to become defined). What I don't
>>> see is how to do the producer consumer trick, unless you regard using
>>> the result of async as a taskwait, and not using it as not having a
>>> taskwait, e.g.
>>> async func(...) # generate a task and don't wait for it
>>> result = async func(...) # generate a task and wait for it.
>>> The latter is not useful unless you have multiple expressions in one
>>> statement, so we should also allow result1, result2 = async
>>> func(data=a), async func(data=b).
>> I think the idea is that you have a transparent, implicit future. You block
>> when you use the result; you are allowed to pass the result back to the
>> caller without blocking, and the caller does not need to know whether it is
>> a future or not.
>> Implemented in Python itself, the protocol would be something like
>> INCREF/DECREF does not block, but all other operations do block.
>> Of course, this is rather hard to implement in present-day Cython. Options:
>>   a) Have async(func)(x) return a future, must call result().
>>   b) Make async part of the type spec, such as "cdef async int x". And coerce
>> it to Python using a proxy. Seems messy, and going beyond what current
>> Python semantics allow. But I do like it a bit better than explicit futures
>> everywhere.
> Interesting. However, what happens when I do
> cdef async int x
> x = async(func)(y)
> x = async(func)(z)
> print x
> ? You don't really know what x will be, as you don't know which task
> will complete first. This case could be solved by having multiple
> different future result storage locations, but what if I do this in a
> loop?
> You could just define that as a race condition though, but I would
> expect the value from the task last specified.

The only intuitive thing to me is that the first x is discarded and you 
block for the second. Yes, that means heap-allocation and reference 
counting (the async function holds a reference, which would be the only 
reference in the case above, so that when the first call returns the 
target heap-allocated int gets deallocated).
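A pure-Python sketch of these semantics, using a hypothetical `async_call` helper built on `concurrent.futures` (the name and shape are illustrative, not part of any proposed Cython API): rebinding x drops the only reference to the first future, and reading the second one blocks until its task completes.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper approximating the proposed async(func)(...) form:
# submit func to a thread pool and hand back a future for its result.
_pool = ThreadPoolExecutor(max_workers=2)

def async_call(func):
    def submit(*args, **kwargs):
        return _pool.submit(func, *args, **kwargs)
    return submit

def slow_square(n):
    time.sleep(0.05)
    return n * n

x = async_call(slow_square)(3)  # first task; its result is never read
x = async_call(slow_square)(4)  # rebinding discards the first future
print(x.result())               # blocks for the second task only; prints 16
```

In the implicit-future model discussed above, the `.result()` call would disappear: any use of x would block transparently.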

Really a better model for changing CPython than for Cython...

> What happens when you return an async value from the task? Do you get
> "cdef async async int x"? Or what if you pass in an async variable as
> async argument to a new task? Basically we have to restrict async
> value usage to "direct parents only". I think it also makes sense to
> restrict use to the parallel section/orphaned function only.

No, I imagined these to be heap-allocated things, so you just pass 
around these heap-allocated wrappers containing i) something you wait on 
(pthread semaphore?), ii) refcount, iii) value storage.
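As a rough pure-Python sketch of such a wrapper, where a `threading.Event` stands in for the pthread semaphore and CPython's own reference counting plays the role of (ii):

```python
import threading

class TaskResult:
    """Sketch of the heap-allocated wrapper described above:
    (i) something to wait on, (ii) a refcount (CPython's own, here),
    (iii) value storage filled in by the task on completion."""

    def __init__(self):
        self._done = threading.Event()  # stands in for a pthread semaphore
        self._value = None              # value storage

    def set(self, value):               # called by the task when it finishes
        self._value = value
        self._done.set()

    def get(self):                      # every read waits until the value exists
        self._done.wait()
        return self._value

result = TaskResult()
threading.Thread(target=lambda: result.set(42)).start()
print(result.get())                     # blocks until the task has run; prints 42
```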

(After all, the inspiration is Konrad's slides on "Python 4" (as he'd 
wish it)).

Yes, there's some performance penalty for every read, but there's a 
penalty with any task really. Also, control flow analysis will likely 
reduce this to one wait per function.

Though I'm still not convinced that channels in the way Go uses them 
aren't "better".
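For comparison, a Go-style channel can be approximated in pure Python with a bounded `queue.Queue`; this producer/consumer sketch is only an analogy to illustrate the alternative model, not a proposal:

```python
import queue
import threading

# A rough Python analogue of a Go channel: a bounded queue, with a
# sentinel object signalling that the producer is done.
DONE = object()
chan = queue.Queue(maxsize=4)   # the bound gives Go-style backpressure

def producer():
    for i in range(5):
        chan.put(i * i)         # blocks when the channel is full
    chan.put(DONE)

threading.Thread(target=producer).start()

received = []
while True:
    item = chan.get()           # blocks when the channel is empty
    if item is DONE:
        break
    received.append(item)
print(received)                 # [0, 1, 4, 9, 16]
```

Unlike the future model, neither side ever holds a reference to a not-yet-computed value; synchronization happens only at the channel.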

Dag Sverre
