[Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions

Stefan Behnel stefan_ml at behnel.de
Mon Apr 22 14:28:01 CEST 2013


Torsten Landschoff, 22.04.2013 13:56:
> On 04/22/2013 01:50 PM, Stefan Behnel wrote:
>>> a good idea would be to warn if __dealloc__ actually references a
>>> Python attribute that tp_clear could have cleared, with a pointer to the
>>> class decorator that exempts the attribute/instance from tp_clear.
>>
>> That's a good idea.
>>
>> Any help with this is appreciated. :)
>
> How can I help? If you want, I can attempt to create a patch and ask you
> if I don't make any progress.

Please do. Just ask back on this list if there's anything that's not clear
to you.


> I do not have a good grip on the Cython source code yet, so if you can give
> me a head start by pointing out the locations where I can inject the new
> behaviour this would be most welcome.

The implementations of tp_clear() etc. are in ModuleNode.py. For example,
switching off the clearing of known Python builtin types can be done
directly in generate_clear_function() by testing the type of the "entry" (a
symbol table entry, specifically the name of a type attribute in this case). See
the end of Builtin.py (and its usage) for known builtins. You might want to
add a list of safe builtin types to Builtin.py and use it in other places.
There's precedent in "types_that_construct_their_instance".
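
To make the idea of a "safe builtin types" list concrete, here is a minimal
standalone sketch. It does not use Cython's real classes: Entry,
safe_builtin_type_names, needs_clearing() and attributes_to_clear() are all
hypothetical names, and the set of types listed is only an assumption about
what might count as safe.

```python
# Hypothetical sketch; none of these names are taken from Cython's sources.

# Assumed list of builtin types whose instances hold no references to other
# Python objects and therefore cannot be part of a reference cycle.
safe_builtin_type_names = frozenset([
    "bytes", "str", "unicode", "int", "long", "float", "complex", "bool",
])


class Entry(object):
    """Stand-in for a symbol table entry describing one type attribute."""

    def __init__(self, name, type_name, is_builtin_type):
        self.name = name
        self.type_name = type_name
        self.is_builtin_type = is_builtin_type


def needs_clearing(entry):
    """Return True if tp_clear should reset this attribute."""
    if entry.is_builtin_type and entry.type_name in safe_builtin_type_names:
        return False
    return True


def attributes_to_clear(entries):
    """Names of the attributes that a generated tp_clear would still reset."""
    return [entry.name for entry in entries if needs_clearing(entry)]
```

In the real code, a check along these lines would sit where
generate_clear_function() looks at each entry, with the list itself living
in Builtin.py.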

I'd definitely start by writing a test. Test suite and test runner are
explained here:

http://wiki.cython.org/HackerGuide#Thetestsuite

Regarding the decision about GC participation, there's a method needs_gc()
in the CClassScope class in Symtab.py. Making that smarter should be enough
to disable GC support in safe cases.

You can clone Cython on github and give us changes to review in your own
repo there.


> About the object graph analysis I do not know if that is easily added. I
> thought it might be because Cython already has some type inference in
> place!?

No type inference needed. Object attributes are either explicitly typed as
a specific extension (or builtin) type, or they are just plain objects, in
which case we must assume that they can introduce refcycles.
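
The rule can be sketched as a small classification function. This is only an
illustration under assumptions: AttrType, SAFE_BUILTINS, classify() and the
three result strings are hypothetical and do not exist in Cython.

```python
# Hypothetical sketch; the class, the set and the result strings are
# assumptions made for illustration only.

SAFE_BUILTINS = frozenset(["bytes", "str", "int", "float"])


class AttrType(object):
    """Stand-in for the declared type of an object attribute."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind  # one of "object", "builtin", "extension"


def classify(attr_type):
    """Decide how an attribute type affects the refcycle question."""
    if attr_type.kind == "object":
        # Untyped attribute: it can hold anything, so assume a cycle.
        return "assume-cycle"
    if attr_type.kind == "builtin":
        if attr_type.name in SAFE_BUILTINS:
            return "safe"
        # Containers such as list or dict can hold arbitrary objects.
        return "assume-cycle"
    # Extension type: the answer depends on that type's own attributes.
    return "inspect"
```

No inference is involved here; the decision uses only what the declaration
says.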

You can start by walking the attribute type graph in needs_gc(). (I don't
think it's called all that often, but in the worst case, the result should
be cacheable.) Just let a debugger stop inside of it and take a close look
at what you see around you. Basically, the "scope" of an extension type
knows what is defined inside of that type. That's how you get at the
reference graph.

For a bonus, walking type graphs might be a generally usable feature, so it
may (or may not) be a good idea to implement a general way of doing this,
maybe as some kind of generator.
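
As a rough illustration of the generator idea, the sketch below walks a toy
type graph and builds a needs_gc() decision on top of it. ExtType and
walk_type_graph() are hypothetical stand-ins, not Cython's CClassScope or
type classes, and the sketch ignores subclassing (an attribute typed as an
extension type may hold an instance of a subtype with further attributes),
which a real implementation would have to consider.

```python
# Hypothetical sketch; ExtType and walk_type_graph are illustrative names.

class ExtType(object):
    """Stand-in for an extension type and the attribute types in its scope."""

    def __init__(self, name, is_plain_object=False):
        self.name = name
        self.is_plain_object = is_plain_object
        self.attribute_types = []  # types of its Python object attributes


def walk_type_graph(root):
    """Yield each type reachable from root's attributes exactly once."""
    seen = set()
    stack = list(root.attribute_types)
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        yield current
        stack.extend(current.attribute_types)


def needs_gc(root):
    """True if root can reach a plain object or can reach itself again."""
    for reachable in walk_type_graph(root):
        if reachable.is_plain_object or reachable is root:
            return True
    return False
```

A usage example under the same assumptions:

```python
leaf = ExtType("Leaf")
holder = ExtType("Holder")
holder.attribute_types.append(leaf)
node = ExtType("Node")
node.attribute_types.append(node)   # a Node refers to another Node
print(needs_gc(holder), needs_gc(node))
```

Keeping the walk in a generator separates the traversal from the decision, so
other analyses could reuse it.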

Oh, and your code needs to be compatible with Py2.4, as well as 2to3-able
to run in Py3.x. But these things are usually quite easily fixable after
the fact.

Stefan


