[Cython] [cython-users] Calling gil-requiring function not allowed without gil

Robert Bradshaw robertwb at math.washington.edu
Wed Aug 17 09:12:10 CEST 2011


On Fri, Aug 12, 2011 at 6:13 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 08/12/2011 02:45 PM, Stefan Behnel wrote:
>> [second try in moving this discussion to cython-devel]
>>
>> Dag Sverre Seljebotn, 12.08.2011 08:50:
>>> On 08/12/2011 06:44 AM, Robert Bradshaw wrote:
>>>> On Thu, Aug 11, 2011 at 5:53 AM, Dag Sverre Seljebotn
>>>> <d.s.seljebotn at astro.uio.no> wrote:
>>>>> Are you still against this mini-CEP?:
>>>>>
>>>>> with cython.global_lock():
>>>>>     ...
>>>>>
>>>>> Where global_lock() is a GIL-requiring no-op.
>>>>
>>>> Just reading this, it's not immediately obvious what this means. (I
>>>> thought at first this was different syntax for "with gil"...)
>>>
>>> True. "cython.synchronized"? The synchronized keyword in Java does not
>>> do quite the same thing, but almost.
>>>
>>>>> This
>>>>>
>>>>> a) Improves code readability vastly. Having a critical section take
>>>>> effect because of the *lack* of a keyword is just very odd to anyone
>>>>> who's not shoulder deep in CPython internals
>>>>
>>>> I'm not following you here. The only way to run into this is if you
>>>> have explicitly released it. Presumably you can learn both keywords at
>>>> the same time. Perhaps extern functions could be nogil by default.
>>>
>>> I'll try to explain again. Consider code like this:
>>>
>>> cdef call_c():
>>>     magic_c_function(3)
>>>
>>> This code is perfectly fine even if magic_c_function is not reentrant --
>>> but that's hardly well documented! So it's conceivable that another
>>> programmer comes along (who doesn't know the C library well) and decides
>>> that "somebody has just been too lazy to add the nogil specifier" and
>>> slaps it on -- at which point you have a bug.
>>>
>>> This is more to the point in creating readable code IMO:
>>>
>>> cdef call_c():
>>>     with cython.synchronized():
>>>         magic_c_function(3)
>>
>> I think I'm as confused as Robert here. Is that the GIL or some separate
>> lock?
>>
>> If the latter, where would that lock be stored? Module-wide? While there
>> are certainly use cases for that, I think it's much more common to
>> either a) use the GIL or b) use an explicit lock at some well defined
>> point, e.g. at an object or class level.
>
> I intended it to be the GIL in current CPython. In GIL-less environments
> (whether Java, .NET or some future CPython) one would manually ensure a fully
> global lock (inserting a _cython module in sys.path etc.).
>
> But:
>
>> I don't think we disagree when I say that locking should be explicit,
>> but that includes the "thing" that keeps the lock during its lifetime.
>>
>>
>>> Also, consider your classical race conditions:
>>>
>>> cdef init():
>>>     global resource
>>>     if resource == NULL:
>>>         resource = malloc(sizeof(resource_t)) ...
>>>
>>> This is safe code.
>>
>> Well, it's safe code as long as there is no Python code interaction in
>> between. Calling back into the interpreter may trigger a thread switch.
>>
>> Admittedly, this is not obvious and tends to introduce bugs. I learned
>> that the hard way in lxml, actually twice.
>>
>> If your "synchronised" directive refers to the GIL, then it would suffer
>> from that problem as well. I think that's very undesirable for an
>> explicit critical section statement.
>
> This is a good point.
>
> Hmm. This all started with your comment "Even a trivial function may require
> an exclusive global lock for some reason, and it's common to use the GIL for
> that. So the programmer must be explicit here."
>
> And my problem is that it's too implicit -- when you /require/ the global
> lock, you /leave out/ the nogil modifier.

That's a good point.

> I just wanted to find some way of being explicit about what you already
> do. So it may be a question of naming. As you mention, using the GIL has
> drawbacks, but it performs better than a separate lock when you know there
> won't be callbacks into Python...

So your proposal is that "with cython.synchronized" has the same
effect as a Python operation, in that it's a compile-time error to do
it while not holding the GIL? (As opposed to "with gil", which actively
acquires the lock and is more similar to Java's synchronized?) Would
this cause the function to have a GIL-requiring/acquiring signature? I
suppose the reason that one implicitly holds the GIL is that it allows
you to do arbitrary Python operations by default. Having to explicitly
mark when one wants to hold the GIL would be a big regression for
applications that do a lot of Python interaction (e.g. wrappers).

I'm still thinking it could be reasonable for extern functions to be
nogil by default, as a user of C libraries is expected to manually and
externally respect their threading constraints. (This would require a
long deprecation period.)

- Robert

