[Cython] [Python-Dev] C-level duck typing

mark florisson markflorisson88 at gmail.com
Thu May 17 11:15:15 CEST 2012

On 17 May 2012 07:09, Stefan Behnel <stefan_ml at behnel.de> wrote:
> mark florisson, 16.05.2012 21:49:
>> On 16 May 2012 20:15, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>> "Martin v. Löwis", 16.05.2012 20:33:
>>>>> Does this use case make sense to everyone?
>>>>> The reason why we are discussing this on python-dev is that we are looking
>>>>> for a general way to expose these C level signatures within the Python
>>>>> ecosystem. And Dag's idea was to expose them as part of the type object,
>>>>> basically as an addition to the current Python level tp_call() slot.
>>>> The use case makes sense, yet there is also a long-standing solution
>>>> already to expose APIs and function pointers: the capsule objects.
>>>> If you want to avoid dictionary lookups on the server side, implement
>>>> tp_getattro, comparing addresses of interned strings.
>>> I think Martin has a point there. Why not just use a custom attribute on
>>> callables that holds a PyCapsule? Whenever we see, inside a
>>> Cython-implemented function, that an object variable retrieved from the
>>> outside (either as a function argument or as the result of a function
>>> call) is being called, we try to unpack a C function pointer from it on all
>>> assignments to the variable. If that works, we can scan for a suitable
>>> signature (either right away or lazily on first access) and cache that. On
>>> each subsequent call through that variable, the cached C function will be used.
>>> That means we'd replace Python variables that are being called by multiple
>>> local variables, one that holds the object and one for each C function with
>>> a different signature that it is being called with. We set the C function
>>> variables to NULL when the Python function variable is being assigned to.
>>> When the C function variable is NULL on call, we scan for a matching
>>> signature and assign it to the variable.  When no matching signature can be
>>> found, we set it to (void*)-1.
>>> Additionally, we allow explicit user casts of Python objects to C function
>>> types, which would then try to unpack the C function, raising a TypeError
>>> on mismatch.
>>> Assignments to callable variables can be expected to occur much less
>>> frequently than calls to them, so this will give us a good trade-off in
>>> most cases. I don't see why this kind of caching would be any slower inside
>>> of loops than what we were discussing so far.
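For readers following along, the per-variable cache quoted above could be sketched roughly as follows. This is only a pure-Python stand-in, not what Cython generates: `CachedCallable`, the `native_signatures` attribute and the signature strings are all made-up names, `UNSCANNED` plays the role of NULL, and `NO_MATCH` plays the role of `(void*)-1`.

```python
# Python stand-in for the per-variable cache; all names here are hypothetical.
UNSCANNED = None      # plays the role of NULL: no lookup attempted yet
NO_MATCH = object()   # plays the role of (void*)-1: lookup failed, do not retry


class CachedCallable:
    """Holds one object variable plus one cache slot per requested signature."""

    def __init__(self, obj):
        self.assign(obj)

    def assign(self, obj):
        # Rebinding the variable drops every cached entry (back to UNSCANNED).
        self.obj = obj
        self.slots = {}

    def call(self, signature, *args):
        fast = self.slots.get(signature, UNSCANNED)
        if fast is UNSCANNED:
            # Lazy scan on first call: look for a table of native entry points.
            table = getattr(self.obj, "native_signatures", {})
            fast = table.get(signature, NO_MATCH)
            self.slots[signature] = fast
        if fast is NO_MATCH:
            return self.obj(*args)   # generic (slow) call path
        return fast(*args)           # cached fast path
```

The scan runs at most once per signature between assignments, which is the trade-off described in the quoted paragraph: assignments are assumed to be rare and calls frequent.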
>> This works really well for local variables, but for globals, def
>> methods or callbacks as attributes, this won't work so well, as they
>> may be rebound at any time outside of the module scope.
> Only half true for globals, which can be declared "cdef object", e.g. for
> imported names. That would allow Cython to see all possible reassignments
> in a module, which would then apply the above scheme.

I suppose by default they could be properties of a module subclass.
That would also allow faster lookup of globals visible from Python in
Cython space in the same module (but probably slower from outside).
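To make the idea a bit more concrete, here is a minimal pure-Python sketch of a global exposed as a property of a module subclass. The names `HookedModule`, `demo_module` and `callback` are invented for illustration, and this says nothing about how Cython would actually lay out the storage; it only shows that rebinding from outside can be intercepted.

```python
import sys
import types


class HookedModule(types.ModuleType):
    """Module subclass whose 'callback' global is a property, not a dict entry."""

    _callback = None
    rebind_count = 0

    @property
    def callback(self):
        return self._callback

    @callback.setter
    def callback(self, value):
        # Every rebinding from outside passes through here, so a compiler
        # could reset any cached C function pointers at this point.
        self._callback = value
        self.rebind_count += 1


mod = HookedModule("demo_module")   # hypothetical module name
sys.modules["demo_module"] = mod
mod.callback = len                  # routed through the property setter
```

Inside the module the compiler could read the underlying storage directly, while outside code pays for the property call, which matches the "probably slower from outside" guess.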

> I don't think def methods are a use case for this because you'd either
> cpdef them or even cdef them if you want speed. If you want them to be
> overridable, you'll have to live with the speed penalty that that implies.

Which means you can no longer pass stuff around as a callback, but you
need to define an interface in Cython and have people pass around
objects on which you call methods. That is often less Pythonic and
furthermore restricts people to using Cython for all their code. What
you want is something that is fast when given a Cython callable, but
which still works when I write my stuff in Python. Having to inherit
from some cdef class and override its cpdef method just to pass a
callback to other code from Python is a chore and an unnecessary burden.
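As a rough illustration of what I mean, assume compiled callables advertise a typed entry point under some agreed attribute. The attribute name `native_double_double` below is purely hypothetical, as is the `integrate` function; the point is only that the consumer takes any callable and picks the fast path when one is offered.

```python
def integrate(f, a, b, n=1000):
    """Midpoint-rule integration that accepts any callable as the callback."""
    # Hypothetical protocol: a compiled callable may advertise a typed entry
    # point under this attribute; plain Python functions simply lack it.
    fast = getattr(f, "native_double_double", None)
    call = fast if fast is not None else f   # resolve once, outside the loop
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        total += call(a + (i + 0.5) * h)
    return total * h
```

A plain `lambda x: x * x` works here without inheriting from anything, and an object that provides the attribute gets the faster entry point with no change on the caller's side.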

We need to stop sacrificing our design decisions for speed. Speed
should be obtained through clever compiler or interpreter design, not
by telling users to rewrite their code in a specific way that fits
the current limitations of the compiler.

> For object attributes, you have to pay the penalty of a lookup anyway, no
> way around that.

Not in a cdef class. But even in a cdef class any subclass method can
rebind your attribute at any time. We currently have the same problem
with memoryviews, which have to check on every access whether they
are initialized.

> We can't even cache anything here (e.g. with a borrowed
> reference) because the attribute may be rebound to another object that
> happens to live at the same address as the previous one. However, if you
> want speed, you'd do it as in CPython and assign the object to a local
> variable to pay for the lookup only once. Problem solved.
>> I think in
>> general Cython code could be easily sped up for most cases by providing
>> a really fast dispatch mechanism here.
> I feel inclined to doubt that by now.
> Stefan