[Cython] prange CEP updated

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Thu Apr 14 21:37:16 CEST 2011


On 04/14/2011 09:08 PM, mark florisson wrote:
> On 14 April 2011 20:58, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>> On 04/14/2011 08:42 PM, mark florisson wrote:
>>>
>>> On 14 April 2011 20:29, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>>>>
>>>> On 04/13/2011 11:13 PM, mark florisson wrote:
>>>>>
>>>>> Although there is omp_get_max_threads():
>>>>>
>>>>> "The omp_get_max_threads routine returns an upper bound on the number
>>>>> of threads that could be used to form a new team if a parallel region
>>>>> without a num_threads clause were encountered after execution returns
>>>>> from this routine."
>>>>>
>>>>> So we could have threadsavailable() evaluate to that if encountered
>>>>> outside a parallel region. Inside, it would evaluate to
>>>>> omp_get_num_threads(). At worst, people would over-allocate a bit.
>>>>
>>>> Well, over-allocating could well mean 1 GB, which could well mean getting
>>>> an unnecessary MemoryError (or, like in my case, if I'm not careful to set
>>>> ulimit, getting a SIGKILL sent to you 2 minutes after the fact by the
>>>> cluster patrol process...)
>>>>
>>>> But even ignoring this, we also have to plan for people misusing the
>>>> feature. If we put it in there, somebody somewhere *will* write code like
>>>> this:
>>>>
>>>> nthreads = threadsavailable()
>>>> with parallel:
>>>>     for i in prange(nthreads):
>>>>         for j in range(100*i, 100*(i+1)): [...]
>>>>
>>>> (Yes, they shouldn't. Yes, they will.)
>>>>
>>>> Combined with a race condition that will only very seldom trigger, this
>>>> starts to sound like a very bad idea indeed.
>>>>
>>>> So I agree with you that we should just leave it for now, and do
>>>> single/barrier later.
>>>
>>> omp_get_max_threads() doesn't have a race, as it returns the upper
>>> bound. So e.g. if between your call and your parallel section fewer
>>> OpenMP threads become available, then you might get fewer threads, but
>>> never more.
>>
>> Oh, now I'm following you.
>>
>> Well, my argument was that I think erring in that direction is pretty bad
>> as well.
>>
>> Also, even if we're not making it available in cython.parallel, we're not
>> stopping people from calling omp_get_max_threads directly themselves, which
>> should be OK for the people who know enough to do this safely...
>
> True, but it wouldn't be as easy to wrap in a #ifdef _OPENMP. In any
> event, we could just put a warning in the docs stating that using
> threadsavailable outside parallel sections returns an upper bound on
> the actual number of threads in a subsequent parallel section.

I don't think outside or within makes a difference -- what about nested
parallel sections? At least my intention in the CEP was that
threadsavailable always refers to the next parallel section to be
entered (so often it would be 1 after entering a section).

Perhaps just calling it "maxthreads" instead solves the issue.

(Still, I favour just dropping threadsavailable/maxthreads for the time
being. It is much simpler to add something later, when we've had some
time to use it and reflect on it, than to remove something that
shouldn't have been added.)
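
For anyone who does want to call omp_get_max_threads directly in the meantime, the #ifdef _OPENMP guard can live in a small C helper. What follows is only a rough sketch, not anything proposed for cython.parallel: the names maxthreads_upper_bound and alloc_per_thread are made up for illustration, and the only real OpenMP call used is omp_get_max_threads().

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not part of cython.parallel): an upper bound on the
   team size of the next parallel region, or 1 when built without OpenMP. */
static int maxthreads_upper_bound(void)
{
#ifdef _OPENMP
    int n = omp_get_max_threads();
#else
    int n = 1;
#endif
    assert(n >= 1);
    return n;
}

/* Hypothetical helper: allocate one zeroed scratch slot of slot_len doubles
   per possible thread. The count is only an upper bound, so this may
   over-allocate; inside the region, slots should be indexed by the thread
   id rather than assumed to all be in use. */
static double *alloc_per_thread(size_t slot_len, int *nslots)
{
    *nslots = maxthreads_upper_bound();
    return calloc((size_t)*nslots * slot_len, sizeof(double));
}
```

Note that this still has the over-allocation problem discussed above; it only keeps the code compiling when OpenMP is switched off.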

Dag Sverre

