[Cython] Calling gil-requiring function not allowed without gil

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Wed Aug 17 10:08:03 CEST 2011


On 08/17/2011 09:12 AM, Robert Bradshaw wrote:
> On Fri, Aug 12, 2011 at 6:13 AM, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no>  wrote:
>> On 08/12/2011 02:45 PM, Stefan Behnel wrote:
>>> [second try in moving this discussion to cython-devel]
>>>
>>> Dag Sverre Seljebotn, 12.08.2011 08:50:
>>>> On 08/12/2011 06:44 AM, Robert Bradshaw wrote:
>>>>> On Thu, Aug 11, 2011 at 5:53 AM, Dag Sverre Seljebotn
>>>>> <d.s.seljebotn at astro.uio.no>  wrote:
>>>>>> Are you still against this mini-CEP?:
>>>>>>
>>>>>> with cython.global_lock():
>>>>>>     ...
>>>>>>
>>>>>> Where global_lock() is a GIL-requiring no-op.
>>>>>
>>>>> Just reading this, it's not immediately obvious what this means. (I
>>>>> thought at first this was different syntax for "with gil"...)
>>>>
>>>> True. "cython.synchronized"? The synchronized keyword in Java does not
>>>> do quite the same thing, but almost.
>>>>
>>>>>> This
>>>>>>
>>>>>> a) Improves code readability vastly. Having a critical section take
>>>>>> effect
>>>>>> because of the *lack* of a keyword is just very odd to anyone who's not
>>>>>> shoulder deep in CPython internals
>>>>>
>>>>> I'm not following you here. The only way to run into this is if you
>>>>> have explicitly released it. Presumably you can learn both keywords at
>>>>> the same time. Perhaps extern functions could be nogil by default.
>>>>
>>>> I'll try to explain again. Consider code like this:
>>>>
>>>> cdef call_c():
>>>>     magic_c_function(3)
>>>>
>>>> This code is perfectly fine even if magic_c_function is not reentrant --
>>>> but that's hardly well documented! So it's conceivable that another
>>>> programmer comes along (who doesn't know the C library well) and decides
>>>> that "somebody has just been too lazy to add the nogil specifier" and
>>>> slaps
>>>> it on -- at which point you have a bug.
>>>>
>>>> This is more to the point in creating readable code IMO:
>>>>
>>>> cdef call_c():
>>>>     with cython.synchronized():
>>>>         magic_c_function(3)
>>>
>>> I think I'm as confused as Robert here. Is that the GIL or some separate
>>> lock?
>>>
>>> If the latter, where would that lock be stored? Module-wide? While there
>>> are certainly use cases for that, I think it's much more common to
>>> either a) use the GIL or b) use an explicit lock at some well defined
>>> point, e.g. at an object or class level.
>>
>> I intended it to be the GIL in current CPython. In GIL-less environments
>> (whether Java, .NET or some future CPython) one would manually ensure a fully
>> global lock (inserting a _cython module in sys.path etc.).
>>
>> But:
>>
>>> I don't think we disagree when I say that locking should be explicit,
>>> but that includes the "thing" that keeps the lock during its lifetime.
>>>
>>>
>>>> Also, consider your classical race conditions:
>>>>
>>>> cdef init():
>>>>     global resource
>>>>     if resource == NULL:
>>>>         resource = malloc(sizeof(resource_t)) ...
>>>>
>>>> This is safe code.
>>>
>>> Well, it's safe code as long as there is no Python code interaction in
>>> between. Calling back into the interpreter may trigger a thread switch.
>>>
>>> Admittedly, this is not obvious and tends to introduce bugs. I learned
>>> that the hard way in lxml, actually twice.
>>>
>>> If your "synchronised" directive refers to the GIL, then it would suffer
>>> from that problem as well. I think that's very undesirable for an
>>> explicit critical section statement.
>>
>> This is a good point.
>>
>> Hmm. This all started with your comment "Even a trivial function may require
>> an exclusive global lock for some reason, and it's common to use the GIL for
>> that. So the programmer must be explicit here."
>>
>> And my problem is that it's too implicit -- when you /require/ the global
>> lock, you /leave out/ the nogil modifier.
>
> That's a good point.
>
>> I just wanted to find some way of being explicit about what you already
>> do. So it may be a question of naming. As you mention, using the GIL has
>> drawbacks, but it performs better than a separate lock when you know there
>> won't be callbacks into Python...
>
> So your proposal is that "with cython.synchronized" has the same
> effect as a Python operation, in that it's a compile-time error to do
> it while not holding the gil? (As opposed to "with gil" which actively
> acquires the lock (which is more similar to java synchronized)?) Would

Yes, the former. It does not acquire a lock under current CPython.

But the intention is very much to prepare the ground for
IronPython/Jython/GIL-less CPython/autogil-in-Cython too. Basically a
tool to protect & document logic that depends on the GIL to avoid races,
so that it is future-safe.

So in Cython-for-Jython it would have to acquire a lock, and be more 
like Java synchronized.
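
As a rough illustration of that fallback, here is a minimal sketch in
plain Python (not Cython). The names `_global_lock`, `synchronized`,
`bump` and `run` are hypothetical and exist only for this example; none
of them is an existing Cython API. It assumes a single process-wide
re-entrant lock standing in for the GIL:

```python
import threading
from contextlib import contextmanager

# Hypothetical process-wide lock standing in for the GIL on a GIL-less
# runtime. Re-entrant, so nested synchronized() blocks in one thread
# do not deadlock.
_global_lock = threading.RLock()


@contextmanager
def synchronized():
    """Hold the single global lock for the duration of the block."""
    with _global_lock:
        yield


counter = 0


def bump(n):
    """Increment the shared counter n times under the global lock."""
    global counter
    for _ in range(n):
        with synchronized():
            counter += 1  # read-modify-write protected by the lock


def run(n_threads=4, n=1000):
    """Run bump() in several threads and return the final counter."""
    threads = [threading.Thread(target=bump, args=(n,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Under current CPython the block would need no lock at all, since the
GIL already serializes it; the sketch only shows the behaviour the
statement would have to emulate elsewhere.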

> this cause the function to have a GIL requiring/acquiring signature? I

Not in isolation.

But, IF (after a long deprecation cycle) we start to infer "nogil" on 
function signatures, then the presence of this Python statement would 
stop nogil from getting inferred.

The intention is simply: If you have a cdef function which uses no 
Python operations, but still relies on holding the GIL to avoid race 
conditions (exactly the situation Stefan described initially), then you 
can use cython.synchronized to prevent any future inference from
removing the GIL.

> suppose the reason that one implicitly holds the gil is that it allows
> you to do arbitrary Python operations by default. Having to explicitly
> mark when one wants to hold the GIL would be a big regression for
> applications that do a lot of Python interaction (e.g. wrappers).

Yes, I'm not proposing that at all.

> I'm still thinking it could be reasonable for extern functions to be
> nogil by default, as a user of C libraries is expected to manually and
> externally respect its threading constraints. (This would require a
> long deprecation period.)

+1.

Perhaps this would be safer if we introduce "expects gil" (an explicit
declaration for neither "nogil" nor "with gil"), and then start to emit
warnings if a GIL mode isn't declared on cdef functions which have their
address taken.

Thus, if you pass a callback to a C library, you'll get a reminder to 
declare a GIL mode.

Dag Sverre

