[Cython] Acquisition counted cdef classes

Tue Oct 25 20:10:51 CEST 2011

mark florisson, 25.10.2011 18:58:
> On 25 October 2011 12:22, Stefan Behnel wrote:
>> mark florisson, 25.10.2011 11:11:
>>> On 25 October 2011 08:33, Stefan Behnel wrote:
>>>> mark florisson, 24.10.2011 21:50:
>>>>> What if we support acquisition counting for every instance of a cdef
>>>>> class? In Python and Cython GIL mode you use reference counting, and
>>>>> in Cython nogil mode and for struct attributes, array dtypes, etc. you
>>>>> use acquisition counting. This allows you to pass around cdef objects
>>>>> without the GIL and use their nogil methods. If the acquisition count
>>>>> is at least 1, the acquisition count owns a reference to the
>>>>> object. If it reaches 0 you discard your owned reference (you can
>>>>> simply acquire the GIL if you don't have it) and when you increment
>>>>> from zero you obtain it. Perhaps something like libatomic could be
>>>>> used to efficiently implement this.
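
The counting scheme described above can be modelled in a few lines of plain Python. This is only a sketch under stated assumptions, not Cython's implementation: the class `AcquisitionCount`, its fields, and `demo()` are hypothetical names, and a `threading.Lock` stands in for the atomic operations the proposal mentions. It shows that the single owned reference is obtained only on the 0 -> 1 transition and discarded only on the 1 -> 0 transition, however many threads acquire and release in between.

```python
import threading


class AcquisitionCount:
    """Hypothetical model of an acquisition count guarding one owned reference."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for an atomic counter
        self.count = 0
        self.refs_obtained = 0   # times the owned reference was taken (0 -> 1)
        self.refs_discarded = 0  # times the owned reference was dropped (1 -> 0)

    def acquire(self):
        with self._lock:
            self.count += 1
            if self.count == 1:
                # In the proposal, this is the point where a real reference
                # to the object would be obtained.
                self.refs_obtained += 1

    def release(self):
        with self._lock:
            if self.count == 0:
                raise RuntimeError("release() without matching acquire()")
            self.count -= 1
            if self.count == 0:
                # In the proposal, the GIL would be acquired here if needed
                # and the owned reference discarded.
                self.refs_discarded += 1


def worker(ac, rounds):
    # Each worker acquires and releases repeatedly, like nogil code
    # passing the object around.
    for _ in range(rounds):
        ac.acquire()
        ac.release()


def demo(n_threads=4, rounds=1000):
    ac = AcquisitionCount()
    ac.acquire()  # the creator holds one acquisition throughout
    threads = [
        threading.Thread(target=worker, args=(ac, rounds))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    during = (ac.count, ac.refs_obtained, ac.refs_discarded)
    ac.release()
    after = (ac.count, ac.refs_obtained, ac.refs_discarded)
    return during, after
```

Because the creator keeps one acquisition for the whole run, the worker threads never drive the count to zero, so the owned reference is taken once and dropped once.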
>>>>
>>>> Where would you store that count? In the object struct? That would
>>>> increase the size of each instance.
>>>
>>> Yes, not just the count, also the lock. This feature would be optional
>>> and may be very useful for people (I think).
>>
>> Well, as long as it's an optional feature that requires a class decorator,
>> the only obvious drawback is that it'll bloat the compiler even more than it
>> is already.
>
> Actually, I think it will help the implementation of mutexes and async
> objects if we want those, and possibly other stuff in the future.

If all you want is to support the regular with statement in nogil blocks, 
part of that is implemented already. I recently added support for 
implementing the context manager's __enter__() method as c(p)def method. 
However, __exit__() isn't there yet, as it's a bit more tricky. It could
mean taking a C pointer to the cdef method and calling that, or calling the
cdef method directly instead (not sure). Either way, we must always make
sure that there is still a reference to the context manager itself, and
eventually free it. I'm sure it can be done, though, maybe with some
restrictions in nogil mode. If we additionally fix it up to use the
exception propagation and try-finally support that you wrote for the
with-gil feature, we're basically there.

> The
> acquisition counting is basically already there (for memoryviews), so
> it's easy to track down where and when to apply this. However one
> major problem would be circular acquisition counts, so you'd also have
> to implement a garbage collector like CPython has (e.g. if you have a
> cdef class with a cython.parallel.dict). We should just have a real
> garbage collector instead of all the counting crap. Or we could make
> it a burden for the user...

Right, these things can grow endlessly. It took CPython something like a 
dozen years to a) recognise the need for and b) implement a garbage 
collector. Let's hope that Cython will never get one.
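
To make the cycle problem concrete, here is a small CPython-only illustration. `Node`, `make_cycle()` and `demo()` are hypothetical names; the sketch assumes CPython's reference counting plus its cyclic collector. Two objects that refer to each other never reach a count of zero on their own, and any acquisition counting scheme would hit the same issue.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a  # a and b now keep each other alive
    return weakref.ref(a)    # a weak reference does not add to the count


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # rule out an automatic collection during the demo
    try:
        ref = make_cycle()
        # The local names are gone, but the counts never dropped to zero.
        alive_after_scope = ref() is not None
        gc.collect()  # the cyclic collector finds and frees the pair
        alive_after_gc = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_scope, alive_after_gc
```

On CPython, `demo()` reports that the object survives leaving the function and disappears only after the explicit `gc.collect()`.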

> I agree that this is really not as feasible as I first thought. It
> actually shows me a problem where I can have a memoryview object in a
> memoryview with dtype 'object', although the problem here is that the
> memoryview object doesn't traverse the object in the Py_buffer, or
> when coerced from a memoryview slice to a memoryview object, the
> memoryview slice struct object... I suppose I need to fix that (but
> I'm not sure how, as you can't provide a manual traverse function in
> Cython).

No, you may have to descend into C here. Or, you could disable a Python 
object dtype for the time being?

> But I really believe that these are much-wanted features. If you're
> using threads in Python you can only get concurrency, not parallelism,
> unless you release the GIL. Even if there is some performance overhead,
> it will still be a lot better than sequential execution. Perhaps when
> cython.parallel is more mature, we may get functionality to
> specify data distribution schemes and message passing, in which case
> the GIL won't be a problem. But many things would be harder or much
> more expensive, e.g. transposing, sending objects etc.

See? That's what I mean by language complexity. These things quickly turn 
into an open can of worms. I don't think the language should handle any of 
these. Message passing is up to libraries, for example. If you want 
language support, use Erlang.

>>>>> The advantages are:
>>>>>
>>>>> 1) allow users to pass around cdef typed objects in nogil mode
>>>>> 2) allow cdef typed objects as struct attributes or array elements
>>>>> 3) make it easy to implement things like memoryviews (already done but
>>>>> would have been a lot easier), cython.parallel.async/future objects,
>>>>> cython.parallel.mutex objects and possibly other things in the future
>>>>
>>>> Would it really be easier? You can already call cdef methods in nogil
>>>> mode, AFAIR.
>>>
>>> Sure, but you cannot store cdef objects as struct attributes or array
>>> elements (you could implement it with reference counting, but not for
>>> nogil mode)
>>
>> You could do that with borrowed references, though, assuming that you keep
>> another reference around (or do your own ref-counting). However, I do see
>> that keeping a real reference around may be hard to do in some cases.
>>
>>
>>> and you cannot pass them around without the GIL.
>>
>> Yes, you can, as long as you only go through cdef functions. Obviously, you
>> can't pass them into a Python function call, but you can (and could, if it
>> was implemented) do loads of useful things with existing references even in
>> nogil sections. The GIL checker is quite fine grained already but could do
>> even better.
>>
>
> Ok, so cdef arguments are borrowed, which gets you somewhere but not
> very far. It's rather baffling that f(x) is fine in nogil mode, but y
> = x isn't.

"y = x" could work if it's using borrowed references, though. The 
"borrowed" flag could be inferred automatically in nogil mode. Then it 
would only be an error if the user explicitly declared it as owned.

>>> This
>>> proposal is about making your life easier without the GIL, and
>>> currently it's kind of a pain.
>>
>> The nogil sections I use are usually quite short, so I can't tell. It's
>> certainly a pain to work without the GIL, because it means you have to take
>> a lot more care when writing your code. But that won't change just by
>> dropping reference counting. And nogil code will definitely become another
>> bit harder to get right when using borrowed references.
>>
>>
>>> Ah I assumed cpdef nogil was invalid, I see it isn't, cool.
>>
>> It makes perfect sense. Just because a function *can* be called without the
>> GIL doesn't mean it can't be called from Python. So the Python wrapper
>> requires the GIL, but the underlying cdef function doesn't.
>>
>>
>>> This breaks terribly for special methods though.
>>
>> Why? It's just a matter of properly separating out their Python wrapper.
>> That's why I was referring to the DefNode refactoring.
>
> I see, ok. All I meant was that it currently gives you compile errors.

I know. I've given ticket #3 enough (smaller) tries to know basically all 
problems by now.

> Anyway, sorry for the long mail. I agree this is likely not feasible
> to implement, although I would like the functionality to be there.
> Perhaps I'm trying to solve problems which don't really need to be
> solved. Maybe we should just use multiprocessing, or MPI and numpy
> with global arrays and pickling. Maybe memoryviews could help out with
> that as well.

In any case, I think we should let the existing features settle for a 
while, and see what users come up with. Not every feature that *can* be 
done is worth making a language feature.

Stefan