[Cython] PEP 3135 -- New Super

Stefan Behnel stefan_ml at behnel.de
Tue Jul 5 10:04:14 CEST 2011


Vitja Makarov, 05.07.2011 09:17:
> 2011/7/5 Stefan Behnel:
>> Vitja Makarov, 05.07.2011 08:21:
>>> I was thinking about implementing new super() with no arguments.
>> http://trac.cython.org/cython_trac/ticket/696
>>
>>> The problem is where to store __class__. I see two options here:
>>>
>>> 1. Add a func_class member to CyFunction; this way __class__ will be
>>> private and not visible to inner functions.
>>> 2. Put it into the closure.
>>
>> The second option has the advantage of requiring the field only when super()
>> is used, whereas the first impacts all functions.
>>
>> I would expect that programs commonly have a lot more functions than
>> specifically methods that use a no-argument call to super(), so this may
>> make a difference.
>>
>
> So, classes are currently created the following way:
>
> class_dict = {}
> class_dict['foo'] = foo_func
> cls = CreateClass(class_dict)
>
> So after the class is created, I should check its dict for CyFunction
> members (maybe only the ones that actually require __class__)
> and set __class__:
>
> for value in cls.__dict__.values():
>     if isinstance(value, CyFunction) and value.func_class is WantClass:
>         value.func_class = cls

Remember that no-args super() can only be used in functions that are 
literally written inside of a class body, so we actually know at compile 
time which functions need this field. We can thus do better than a generic 
loop over all fields. We even have the function object pointers directly 
available in the module init function where we create the class body.
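
A rough sketch of that idea in plain Python, not Cython's actual generated 
code: FakeCyFunction and module_init are hypothetical names made up for 
illustration, and which function uses super() is simply assumed. The only 
point is that the init code already holds the function objects and can 
assign the class to exactly the ones that need it.

```python
class FakeCyFunction:
    """Hypothetical stand-in for a CyFunction that has a func_class slot."""

    def __init__(self, name):
        self.name = name
        self.func_class = None  # only filled in for functions that use super()


def module_init():
    # The function objects are created first, so the init code still holds
    # direct references to them when the class is built.
    foo = FakeCyFunction("foo")  # assumption: the compiler saw super() in foo
    bar = FakeCyFunction("bar")  # assumption: bar never calls super()

    class_dict = {"foo": foo, "bar": bar}
    cls = type("Example", (object,), class_dict)

    # Known at compile time: only foo needs the class, so there is no need
    # for a generic loop over the class dict.
    foo.func_class = cls
    return cls, foo, bar


cls, foo, bar = module_init()
```

Here bar.func_class stays None, so only functions that actually use super() 
end up holding a reference to the class.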

BTW, we also need a way to make this work for cdef classes. No idea how 
different that would be.


>> OTOH, not all methods have a closure, so creating one just to store the
>> "__class__" field is very wasteful, in terms of both runtime and memory
>> overhead. A lot more wasteful than paying 8 bytes of memory for each
>> function, with no additional time overhead.
>
> Going this way, it only requires initializing the closure:

Yes, and that's costly.


> Btw, the first way requires a cyfunction signature change: the function
> would accept the cyfunction object as its first argument.

We currently pass the binding (i.e. owning) object, right?


> This could also help to solve the default args problem.

And potentially other problems, too. Think of heap allocated modules, for 
example.

http://trac.cython.org/cython_trac/ticket/173
http://trac.cython.org/cython_trac/ticket/218

Seeing this, I'm all for using a field in CyFunction.
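
To illustrate how a field on the function object could stand in for the 
no-argument form, here is a minimal Python sketch. It assumes the function 
object is passed as the first argument of the implementation, as discussed 
above. SketchCyFunction and _greet_impl are hypothetical names, and this is 
not how Cython actually implements it.

```python
import functools


class SketchCyFunction:
    """Hypothetical function object with a func_class field."""

    def __init__(self, impl):
        self.impl = impl
        self.func_class = None  # set once the owning class exists

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Pass the function object itself as the first argument, followed
        # by the instance the method is bound to.
        return functools.partial(self.impl, self, obj)


class Base:
    def greet(self):
        return "Base"


def _greet_impl(cyfunc, self):
    # What a no-args super() call would expand to: the class comes from
    # the function object instead of from a closure cell.
    return super(cyfunc.func_class, self).greet() + " and Child"


greet = SketchCyFunction(_greet_impl)
Child = type("Child", (Base,), {"greet": greet})
greet.func_class = Child  # done right after the class has been created

result = Child().greet()
```

Note that setting func_class makes the function refer back to its class, 
which is exactly the kind of reference that should only exist where super() 
is really used.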


>>> And I don't think that __class__ should be used anywhere outside super()
>>
>> Agreed. CPython simply uses a compile time heuristic ("is there a function
>> call to something global named 'super'?") when creating this field, so it's
>> strictly reserved for this use case.
>>
>> BTW, I like the irony in the fact that CPython essentially gives Cython
>> semantics to the "super" builtin here, by (partially) evaluating it at
>> compile time.
>
> Yeah, I think Cython super with no args should be a little bit faster
> than the classic one.

I think speed isn't really all that important here. Calling Python methods 
is costly enough anyway.

IMO, the main reason for the heuristic is to prevent class objects from 
being kept alive by their methods, except for the single case where super() 
is used. Keeping a class alive just because one of its methods is still 
used somewhere can be very costly, depending on the content of the class 
dict. It also creates a reference cycle, which is another costly thing in 
CPython as it requires a GC run over the whole class dict to get cleaned up.
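
The effect can be observed from plain Python code. The following is a small 
sketch that assumes CPython 3, where a method using no-args super() gets a 
"__class__" closure cell and other methods do not. The names make_methods, 
plain and greet are made up for the example.

```python
import gc
import weakref


def make_methods():
    class Base:
        def greet(self):
            return "Base"

    class Child(Base):
        def plain(self):
            return "plain"

        def greet(self):
            # The no-args call makes the compiler add a __class__ cell.
            return super().greet() + " and Child"

    # Hand out the two functions and only a weak reference to the class.
    return Child.plain, Child.greet, weakref.ref(Child)


plain, greet, child_ref = make_methods()

# Only the method that uses super() carries the __class__ cell.
uses_cell = "__class__" in greet.__code__.co_freevars
plain_has_cell = "__class__" in plain.__code__.co_freevars

# While greet is alive, its closure cell keeps the class alive.
gc.collect()
alive_with_greet = child_ref() is not None

# Once greet is dropped, only the class-internal cycle is left, and a GC
# run can clean it up even though plain is still referenced.
del greet
gc.collect()
alive_without_greet = child_ref() is not None
```

Holding on to plain alone does not keep Child alive, whereas holding on to 
greet does.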

The situation for modules is at least slightly different, as modules do not 
tend to get unloaded, so there's always a reference to them from 
sys.modules. If that ever gets removed, it's really ok if it takes time to 
clean things up.

Stefan

