[Cython] Acquisition counted cdef classes

Stefan Behnel stefan_ml at behnel.de
Tue Oct 25 13:22:08 CEST 2011

mark florisson, 25.10.2011 11:11:
> On 25 October 2011 08:33, Stefan Behnel wrote:
>> mark florisson, 24.10.2011 21:50:
>>> This is in response to
>>> http://groups.google.com/group/cython-users/browse_thread/thread/bcbc5fe0e329224f
>>> and http://trac.cython.org/cython_trac/ticket/498 , and some of the
>>> previous discussion on cython.parallel.
>>> Basically I think we should have something more powerful than 'cdef
>>> borrowed CdefClass obj', something that also doesn't rely on new
>>> syntax.
>> We will still need borrowed reference support in the compiler eventually,
>> whether we make it a language feature or not.
> I'm not sure I understand why, acquisition counting could solve these
> problems for cdef classes, and general objects may not be used without
> the GIL. Do you want this as an optimization?

Yes. Think of type(x), for example, or PyDict_GetItem(). They return 
borrowed references, and in many cases, Cython wouldn't have to INCREF and 
DECREF them when they are only being used as part of some specific kinds of 
expressions. The same applies to some utility functions in Cython that 
currently must INCREF their return value unconditionally, simply because 
they can't tell Cython that they could also return a borrowed reference 
instead. If there was a way to do that, we could optimise the reference 
counting away in a couple more places, which would get us another bit 
closer to hand-tuned code.
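
To make the difference concrete, here is a minimal sketch in plain C. It uses a toy object model, not CPython's PyObject, and all names (toy_obj, toy_slot, slot_get_borrowed, slot_get_owned) are made up for illustration. It shows the same read done once through a borrowed lookup and once through an owned one.

```c
#include <assert.h>

/* Toy stand-in for a reference-counted object; not CPython's PyObject. */
typedef struct {
    long refcnt;
    int value;
} toy_obj;

void toy_incref(toy_obj *o) { o->refcnt++; }

void toy_decref(toy_obj *o)
{
    assert(o->refcnt > 0);
    o->refcnt--;            /* a real DECREF would free the object at zero */
}

/* A one-slot "container" that owns a reference to its item. */
typedef struct { toy_obj *item; } toy_slot;

/* Borrowed lookup: the caller gets the pointer, the count is untouched. */
toy_obj *slot_get_borrowed(const toy_slot *s) { return s->item; }

/* Owned lookup: the caller gets a new reference and must release it. */
toy_obj *slot_get_owned(const toy_slot *s)
{
    toy_incref(s->item);
    return s->item;
}

/* Read through a borrowed reference: no reference count traffic. */
int read_value_borrowed(const toy_slot *s)
{
    return slot_get_borrowed(s)->value;
}

/* Same result through an owned reference: one INCREF/DECREF pair. */
int read_value_owned(const toy_slot *s)
{
    toy_obj *o = slot_get_owned(s);
    int v = o->value;
    toy_decref(o);
    return v;
}
```

Both readers return the same value and leave the count where it was. The borrowed variant is only safe because nothing can drop the container's reference while the pointer is in use.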

However, note that this doesn't necessarily have an impact on nogil code. 
If you took a borrowed reference in one nogil thread, and a gil-holding 
thread deletes the object at the same time or during the lifetime of the 
borrowed reference (e.g. by updating a dict or assigning to a cdef 
attribute), the nogil thread would end up with a dead pointer in its hands. 
That's why the usage of borrowed references needs to be explicit in the 
code ("I know what I'm doing"), and the optimisations require the GIL to be held.

>>> What if we support acquisition counting for every instance of a cdef
>>> class? In Python and Cython GIL mode you use reference counting, and
>>> in Cython nogil mode and for struct attributes, array dtypes etc you
>>> use acquisition counting. This allows you to pass around cdef objects
>>> without the GIL and use their nogil methods. If the acquisition count
>>> is greater than 1, the acquisition count owns a reference to the
>>> object. If it reaches 0 you discard your owned reference (you can
>>> simply acquire the GIL if you don't have it) and when you increment
>>> from zero you obtain it. Perhaps something like libatomic could be
>>> used to efficiently implement this.
>> Where would you store that count? In the object struct? That would increase
>> the size of each instance.
> Yes, not just the count, also the lock. This feature would be optional
> and may be very useful for people (I think).

Well, as long as it's an optional feature that requires a class decorator, 
the only obvious drawback is that it'll bloat the compiler even more than 
it is already.
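
For reference, this is how I read the acquisition counting scheme quoted above, as a rough C sketch. Everything here is hypothetical (cy_obj, cy_acquire, cy_release, ref_take, ref_drop are not Cython names), I'm assuming that a nonzero acquisition count owns exactly one reference, and the GIL round trip is only counted, not performed. It also ignores the race around the 0 <-> 1 transitions, which the proposed per-object lock would have to close.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical layout: a GIL-protected refcount plus an atomic
   acquisition count. */
typedef struct {
    long refcnt;            /* only touched while holding the GIL */
    atomic_long acq_count;  /* may be touched from nogil code */
} cy_obj;

void cy_init(cy_obj *o)
{
    o->refcnt = 1;
    atomic_init(&o->acq_count, 0);
}

/* Placeholders for "grab the GIL, change the refcount, release it";
   here they only record how often the GIL would be needed. */
long gil_round_trips = 0;

void ref_take(cy_obj *o) { gil_round_trips++; o->refcnt++; }
void ref_drop(cy_obj *o) { gil_round_trips++; o->refcnt--; }

void cy_acquire(cy_obj *o)
{
    /* 0 -> 1: the acquisition count starts owning one reference */
    if (atomic_fetch_add(&o->acq_count, 1) == 0)
        ref_take(o);
}

void cy_release(cy_obj *o)
{
    long prev = atomic_fetch_sub(&o->acq_count, 1);
    assert(prev > 0);
    /* 1 -> 0: hand the owned reference back */
    if (prev == 1)
        ref_drop(o);
}
```

In this model, only the first acquire and the last release need the GIL; everything in between is a single atomic operation on the per-instance counter.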

>>> The advantages are:
>>> 1) allow users to pass around cdef typed objects in nogil mode
>>> 2) allow cdef typed objects as struct attributes or array elements
>>> 3) make it easy to implement things like memoryviews (already done but
>>> would have been a lot easier), cython.parallel.async/future objects,
>>> cython.parallel.mutex objects and possibly other things in the future
>> Would it really be easier? You can already call cdef methods in nogil mode,
> Sure, but you cannot store cdef objects as struct attributes, array
> elements (you could implement it with reference counting, but not for
> nogil mode)

You could do that with borrowed references, though, assuming that you keep 
another reference around (or do your own ref-counting). However, I do see 
that keeping a real reference around may be hard to do in some cases.

> and you cannot pass them around without the GIL.

Yes, you can, as long as you only go through cdef functions. Obviously, you 
can't pass them into a Python function call, but you can (and could, if it 
was implemented) do loads of useful things with existing references even in 
nogil sections. The GIL checker is quite fine grained already but could do 
even better.

> This
> proposal is about making your life easier without the GIL, and
> currently it's kind of a pain.

The nogil sections I use are usually quite short, so I can't tell. It's 
certainly a pain to work without the GIL, because it means you have to take 
a lot more care when writing your code. But that won't change just by 
dropping reference counting. And nogil code will definitely become another 
bit harder to get right when using borrowed references.

> Ah I assumed cpdef nogil was invalid, I see it isn't, cool.

It makes perfect sense. Just because a function *can* be called without the 
GIL doesn't mean it can't be called from Python. So the Python wrapper 
requires the GIL, but the underlying cdef function doesn't.

> This breaks terribly for special methods though.

Why? It's just a matter of properly separating out their Python wrapper. 
That's why I was referring to the DefNode refactoring.

>>> All of this functionality should also get a sane C API (to be provided
>>> by cython.h). You'd get a Cy_INCREF(obj, have_gil)/Cy_DECREF() etc.
>>> Every class using this functionality is a subclass of CythonObject
>>> (that contains a PyObject + an acquisition count + a lock). Perhaps if
>>> the user is subclassing something other than object we could allow the
>>> user to specify custom __cython_(un)lock__ and
>>> __cython_acquisition_count__ methods and fields.
>>> Now, building on top of this functionality, Cython could provide
>>> built-in nogil-compatible types, like lists, dicts and maybe tuples
>>> (as a start). These will by default not lock for operations to allow
>>> e.g. one thread to iterate over the list and another thread to index
>>> it without lock contention and other general overhead. If one thread
>>> is somehow changing the size of the list, or writing to indices that
>>> another thread is reading from/writing to, the results will of course
>>> be undefined unless the user synchronizes on the object. So it would
>>> be the user's responsibility. The acquisition counting itself will
>>> always be thread-safe (i.e., it will be atomic if possible, otherwise
>>> it will lock).
>>> It's probably best to not enable this functionality by default as it
>>> would be more expensive to instantiate objects, but it could be
>>> supported through a cdef class decorator and a general directive.
>> It's well known that this would be expensive. One of the approaches that
>> tried to get rid of the GIL in CPython introduced fine grained locking, and
>> it turned out to be substantially slower, AFAIR by a factor of two.
> Sure, I am aware of that. Often you can just keep the GIL, in which
> case you wouldn't use these types. But when you want to leave the
> shiny world of the GIL you still want these goodies. Acquiring the GIL
> is too expensive as there is pretty much always contention.

Acquiring a more fine grained lock is more likely to reduce the contention, 
but is not necessarily less expensive. The lock still needs to get acquired 
and released. GIL protected reference counting is a lot cheaper than that, 
as is manual locking in a more coarse grained fashion.
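
A small C sketch of the three variants, to show where the extra work comes from. The names are invented for illustration, and the spin lock built on atomic_flag is just the simplest lock I can write portably in C11, not a suggestion for what Cython should use.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    long refcnt;        /* plain counter */
    atomic_flag lock;   /* hypothetical per-object lock */
} locked_obj;

void locked_obj_init(locked_obj *o)
{
    o->refcnt = 1;
    atomic_flag_clear(&o->lock);
}

void spin_lock(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ;   /* spin until the lock is free */
}

void spin_unlock(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

/* GIL-protected style: the caller already holds the GIL, so this is
   a plain increment. */
void incref_under_gil(locked_obj *o) { o->refcnt++; }

/* Fine-grained style: lock, increment, unlock for every single change. */
void incref_fine_grained(locked_obj *o)
{
    spin_lock(&o->lock);
    o->refcnt++;
    spin_unlock(&o->lock);
}

/* Coarse-grained manual locking: one lock taken once per batch. */
atomic_flag batch_lock = ATOMIC_FLAG_INIT;

void incref_all_coarse(locked_obj **objs, int n)
{
    spin_lock(&batch_lock);
    for (int i = 0; i < n; i++)
        objs[i]->refcnt++;
    spin_unlock(&batch_lock);
}
```

Even when the lock is never contended, the fine-grained variant pays for an atomic read-modify-write and a release store on every count change, while the other two pay once or not at all.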

>> You could potentially drop the locking for local variables, but you'd lose
>> that ability as soon as the 'object' is passed into a function.
> Definitely, but you cannot use them with the GIL anyway :)

Yes you can. For cdef functions, it's the responsibility of the caller to 
own the references of object arguments it passes. The called function 
doesn't have to do reference counting for them, as long as it doesn't try 
to reassign the variable. And even that could be fixed with borrowed 
references, and also partly by better control flow analysis.
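
Roughly, in a toy C model (invented names, and not what Cython actually generates): a callee that only reads its argument needs no counting at all, while a callee that rebinds the argument variable has to make that variable own its value first.

```c
#include <assert.h>

/* Toy stand-in for a reference-counted object. */
typedef struct {
    long refcnt;
    int value;
} toy_obj;

void toy_incref(toy_obj *o) { o->refcnt++; }
void toy_decref(toy_obj *o) { assert(o->refcnt > 0); o->refcnt--; }

/* The argument is only read: the caller's reference keeps it alive,
   so the callee does no reference counting. */
int callee_reads(toy_obj *arg)
{
    return arg->value * 2;
}

/* The argument variable is rebound: from then on the local variable
   has to own whatever it points to. */
int callee_reassigns(toy_obj *arg, toy_obj *other)
{
    toy_incref(arg);            /* the argument becomes an owned local */

    toy_obj *old = arg;
    toy_incref(other);          /* own the new value first ... */
    arg = other;
    toy_decref(old);            /* ... then release the old one */

    int v = arg->value;
    toy_decref(arg);            /* release the local on exit */
    return v;
}
```

The first incref in callee_reassigns is the one that borrowed references or better control flow analysis could make unnecessary.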

>> Basically, what you are trying to do here is to duplicate the complete
>> ref-counting infrastructure of CPython, but without using CPython.
>>> Of course one may still use non-cdef borrowed objects, by simply
>>> casting to a PyObject *.
>> That's very ugly, though, because you lose all access to methods and
>> attributes of the object. Basically, it becomes useless that way, except for
>> storing away a pointer to it somewhere. You could just as well use a void*.
> Indeed, and that's really all you can do without the GIL.

I think you're underestimating what can (or could) be done without holding 
the GIL. There are still some open features waiting to be implemented, 
even without adding new syntax (and thus further increasing the complexity 
of the language).

> I think
> we're talking about different things, I'm talking about supporting
> nogil, and you're talking about borrowed references in general.

Both are related, though. It's certainly a lot easier and cleaner to 
support borrowed references in the compiler than to implement a whole new 
scheme for handling extension type instances in addition to the normal 
object handling, which we need anyway.

> I'm
> not sure why you'd not just take a reference instead in GIL mode,
> unless you were worried about incrementing a counter.

Decrementing it, not incrementing. :)

The problem is not so much the INCREF (which is just an indirect add), it's 
the DECREF, which contains a conditional jump based on an unknown external 
value, that may trigger external code. That can kill several C compiler 
optimisations for the surrounding code. (And that would only get worse by 
using a dedicated locking mechanism.)
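
In simplified C, with a toy object type instead of PyObject and made-up names, the asymmetry looks about like this:

```c
#include <assert.h>

typedef struct toy_obj toy_obj;
struct toy_obj {
    long refcnt;
    void (*dealloc)(toy_obj *);  /* external code, opaque to the optimiser */
};

/* INCREF: a single add through the pointer. */
void toy_incref(toy_obj *o)
{
    o->refcnt++;
}

/* DECREF: decrement, test the result, and possibly call out. */
void toy_decref(toy_obj *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0)
        o->dealloc(o);  /* may run arbitrary code */
}

/* Illustration only: a dealloc hook that records how often it ran. */
int dealloc_calls = 0;

void count_dealloc(toy_obj *o)
{
    (void)o;
    dealloc_calls++;
}
```

Because the C compiler cannot know whether the branch is taken or what the callback touches, it has to assume that any memory reachable from outside may have changed after each decref.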

