[Cython] [Python-Dev] C-level duck typing

Stefan Behnel stefan_ml at behnel.de
Thu May 17 08:09:08 CEST 2012


mark florisson, 16.05.2012 21:49:
> On 16 May 2012 20:15, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> "Martin v. Löwis", 16.05.2012 20:33:
>>>> Does this use case make sense to everyone?
>>>>
>>>> The reason why we are discussing this on python-dev is that we are looking
>>>> for a general way to expose these C level signatures within the Python
>>>> ecosystem. And Dag's idea was to expose them as part of the type object,
>>>> basically as an addition to the current Python level tp_call() slot.
>>>
>>> The use case makes sense, yet there is also a long-standing solution
>>> already to expose APIs and function pointers: the capsule objects.
>>>
>>> If you want to avoid dictionary lookups on the server side, implement
>>> tp_getattro, comparing addresses of interned strings.
>>
>> I think Martin has a point there. Why not just use a custom attribute on
>> callables that holds a PyCapsule? Whenever we see inside of a
>> Cython-implemented function that an object variable that was retrieved from the
>> outside, either as a function argument or as the result of a function call,
>> is being called, we try to unpack a C function pointer from it on all
>> assignments to the variable. If that works, we can scan for a suitable
>> signature (either right away or lazily on first access) and cache that. On
>> each subsequent call through that variable, the cached C function will be used.
>>
>> That means we'd replace Python variables that are being called by multiple
>> local variables, one that holds the object and one for each C function with
>> a different signature that it is being called with. We set the C function
>> variables to NULL when the Python function variable is being assigned to.
>> When the C function variable is NULL on call, we scan for a matching
>> signature and assign it to the variable.  When no matching signature can be
>> found, we set it to (void*)-1.
>>
>> Additionally, we allow explicit user casts of Python objects to C function
>> types, which would then try to unpack the C function, raising a TypeError
>> on mismatch.
>>
>> Assignments to callable variables can be expected to occur much less
>> frequently than calls to them, so this will give us a good trade-off in
>> most cases. I don't see why this kind of caching would be any slower inside
>> of loops than what we were discussing so far.
> 
> This works really well for local variables, but for globals, def
> methods or callbacks as attributes, this won't work so well, as they
> may be rebound at any time outside of the module scope.

Only half true for globals, which can be declared "cdef object", e.g. for
imported names. That would allow Cython to see all possible reassignments
in a module, so it could then apply the above scheme to them as well.
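
To make the scheme concrete, here is a rough sketch in plain C of what
the per-variable cache could look like. This is not actual Cython output.
The names (sig_entry, callable, call_site, call_dd), the "d(d)" signature
string and the py_call fallback are all made up for illustration, and the
signature table stands in for whatever a PyCapsule on the callable would
carry. The sketch uses a typed function pointer rather than a bare void*
for the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef double (*dd_func)(double);

/* Hypothetical payload of the capsule: (signature, C function) pairs. */
typedef struct {
    const char *signature;      /* e.g. "d(d)"; the encoding is an assumption */
    void (*func)(void);         /* generic function pointer, cast before use */
} sig_entry;

/* Hypothetical stand-in for a callable Python object. */
typedef struct {
    const sig_entry *entries;
    size_t n_entries;
    dd_func py_call;            /* stand-in for the generic (slow) call path */
} callable;

/* Sentinel for "looked it up, nothing matched"; NULL means "not looked up". */
#define NO_MATCH ((dd_func)(intptr_t)-1)

/* One object variable plus one cached C function per signature used. */
typedef struct {
    const callable *obj;
    dd_func cached_dd;
} call_site;

/* Every assignment to the object variable resets the cached pointer. */
static void assign(call_site *site, const callable *obj) {
    site->obj = obj;
    site->cached_dd = NULL;
}

static dd_func find_signature(const callable *obj, const char *sig) {
    for (size_t i = 0; i < obj->n_entries; i++)
        if (strcmp(obj->entries[i].signature, sig) == 0)
            return (dd_func)obj->entries[i].func;
    return NO_MATCH;
}

/* Call through the variable: scan lazily on first call, then reuse. */
static double call_dd(call_site *site, double x) {
    assert(site->obj != NULL);
    if (site->cached_dd == NULL)
        site->cached_dd = find_signature(site->obj, "d(d)");
    if (site->cached_dd != NO_MATCH)
        return site->cached_dd(x);      /* fast path: direct C call */
    return site->obj->py_call(x);       /* no match: generic call */
}

/* Example callables, purely for demonstration. */
static int slow_calls = 0;
static double c_twice(double x) { return 2.0 * x; }
static double generic_twice(double x) { slow_calls++; return 2.0 * x; }
static double generic_negate(double x) { slow_calls++; return -x; }

static const sig_entry fast_entries[] = {
    { "d(d)", (void (*)(void))c_twice },
};
static const callable fast_callable = { fast_entries, 1, generic_twice };
static const callable slow_callable = { NULL, 0, generic_negate };
```

Assigning fast_callable to a call_site and calling call_dd() scans once
and then goes straight to c_twice. Assigning slow_callable resets the
cache, the scan fails once, and every later call takes the generic path
without scanning again.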

I don't think def methods are a use case for this because you'd either
cpdef them or even cdef them if you want speed. If you want them to be
overridable, you'll have to live with the speed penalty that that implies.

For object attributes, you have to pay the penalty of a lookup anyway, no
way around that. We can't even cache anything here (e.g. with a borrowed
reference) because the attribute may be rebound to another object that
happens to live at the same address as the previous one. However, if you
want speed, you'd do it as in CPython and assign the object to a local
variable to pay for the lookup only once. Problem solved.
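
A toy illustration in plain C of why a cache keyed on a borrowed
reference is unsafe, and of the local variable alternative. Everything
here is hypothetical (func_object, holder, borrowed_cache, the one-slot
pool). The pool only exists to make the address reuse deterministic,
which a real allocator does not guarantee. In real code the local
variable would hold an owned reference, so the object could not go away
during the loop.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*dd_func)(double);

/* Hypothetical callable object with an already unpacked C function. */
typedef struct {
    dd_func func;
    int in_use;
} func_object;

/* One-slot pool, so a freed object's address is reused deterministically. */
static func_object pool_slot;

static func_object *obj_new(dd_func func) {
    assert(!pool_slot.in_use);
    pool_slot.func = func;
    pool_slot.in_use = 1;
    return &pool_slot;
}

static void obj_free(func_object *obj) {
    obj->in_use = 0;
}

/* An object whose attribute may be rebound at any time. */
typedef struct {
    func_object *attr;
} holder;

/* Cache keyed on a borrowed pointer: it does not keep the object alive. */
typedef struct {
    const func_object *seen;
    dd_func func;
} borrowed_cache;

static double call_attr_cached(borrowed_cache *cache, const holder *h, double x) {
    if (cache->seen != h->attr) {       /* identity check on the address only */
        cache->seen = h->attr;
        cache->func = h->attr->func;
    }
    return cache->func(x);              /* stale if the address was reused */
}

/* Rebinding drops the old object and binds a new one. */
static void rebind(holder *h, dd_func func) {
    if (h->attr != NULL)
        obj_free(h->attr);
    h->attr = obj_new(func);
}

/* The local variable approach: look the attribute up once, then loop. */
static double sum_calls(const holder *h, int n) {
    dd_func f = h->attr->func;          /* single lookup, paid once */
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += f((double)i);
    return total;
}

/* Example functions, purely for demonstration. */
static double twice(double x) { return 2.0 * x; }
static double negate(double x) { return -x; }
```

After rebind(&h, twice) and one cached call, rebind(&h, negate) puts a
different object at the same address, so call_attr_cached() still sees
a matching pointer and calls the old function. sum_calls() does not have
that problem because it reads the attribute when it starts.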


> I think in
> general Cython code could be easily sped up for most cases by providing
> a really fast dispatch mechanism here.

I feel inclined to doubt that by now.

Stefan

