[Cython] [Python-Dev] C-level duck typing

mark florisson markflorisson88 at gmail.com
Wed May 16 23:06:59 CEST 2012

On 16 May 2012 21:16, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> On Wed, 16 May 2012 20:49:18 +0100, mark florisson
> <markflorisson88 at gmail.com> wrote:
>> On 16 May 2012 20:15, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>> "Martin v. Löwis", 16.05.2012 20:33:
>>>>> Does this use case make sense to everyone?
>>>>> The reason why we are discussing this on python-dev is that we are
>>>>> looking for a general way to expose these C level signatures within
>>>>> the Python ecosystem. And Dag's idea was to expose them as part of the
>>>>> type object, basically as an addition to the current Python level
>>>>> tp_call() slot.
>>>> The use case makes sense, yet there is also a long-standing solution
>>>> already to expose APIs and function pointers: the capsule objects.
>>>> If you want to avoid dictionary lookups on the server side, implement
>>>> tp_getattro, comparing addresses of interned strings.
>>> I think Martin has a point there. Why not just use a custom attribute on
>>> callables that hold a PyCapsule? Whenever we see inside of a Cython
>>> implemented function that an object variable that was retrieved from the
>>> outside, either as a function argument or as the result of a function
>>> call, is being called, we try to unpack a C function pointer from it on
>>> all assignments to the variable. If that works, we can scan for a
>>> suitable signature (either right away or lazily on first access) and
>>> cache that. On each subsequent call through that variable, the cached C
>>> function will be used.
>>>
>>> That means we'd replace Python variables that are being called by
>>> multiple local variables, one that holds the object and one for each C
>>> function with a different signature that it is being called with. We set
>>> the C function variables to NULL when the Python function variable is
>>> being assigned to. When the C function variable is NULL on call, we scan
>>> for a matching signature and assign it to the variable. When no matching
>>> signature can be found, we set it to (void*)-1.
>>>
>>> Additionally, we allow explicit user casts of Python objects to C
>>> function types, which would then try to unpack the C function, raising a
>>> TypeError on mismatch.
>>>
>>> Assignments to callable variables can be expected to occur much less
>>> frequently than calls to them, so this will give us a good trade-off in
>>> most cases. I don't see why this kind of caching would be any slower
>>> inside of loops than what we were discussing so far.
>>>
>>> Stefan
>>> _______________________________________________
>>> cython-devel mailing list
>>> cython-devel at python.org
>>> http://mail.python.org/mailman/listinfo/cython-devel
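For reference, here is a rough model of the caching scheme quoted above, written in plain Python rather than as generated C. It is only a sketch under stated assumptions: the `_c_signatures` attribute, the signature strings and the `CallSite` class are all made up for illustration, and `None` and a private sentinel object stand in for NULL and (void*)-1.

```python
# Hypothetical model: a callable may expose native entry points through a
# dict attribute `_c_signatures` that maps signature strings to functions.
# The attribute name and the signature strings are invented for this sketch.

UNSET = None          # plays the role of NULL: not looked up yet
NO_MATCH = object()   # plays the role of (void*)-1: looked up, nothing found


class CallSite:
    """One callable variable plus one cache slot per signature used on it."""

    def __init__(self, func):
        self.assign(func)

    def assign(self, func):
        # Assigning to the variable resets every cached slot.
        self.func = func
        self.slots = {}

    def call(self, signature, *args):
        fast = self.slots.get(signature, UNSET)
        if fast is UNSET:
            # Lazy scan on the first call made with this signature.
            table = getattr(self.func, "_c_signatures", {})
            fast = table.get(signature, NO_MATCH)
            self.slots[signature] = fast
        if fast is NO_MATCH:
            return self.func(*args)   # generic (slow) call
        return fast(*args)            # cached fast path
```

The point of the sentinel is that a failed scan is remembered too, so a call site that never finds a matching signature pays for the scan once per assignment and not once per call.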
>> This works really well for local variables, but for globals, def
>> methods or callbacks as attributes, this won't work so well, as they
>> may be rebound at any time outside of the module scope. I think in
> +1. The python-dev discussion is pretty focused on the world of a manually
> written C extension. But code generation is an entirely different matter.
> Python puts in place pretty efficient boundaries against full-program static
> analysis, so there's really not much we can do.
> Here's some of my actual code I have for wrapping a C++ library:
> cdef class CallbackEventReceiver(BasicEventReceiver):
>    cdef object callback
>    def __init__(self, callback):
>        self.callback = callback
>    cdef dispatch_event(self, ...):
>        self.callback(...)
> The idea is that you can subclass BasicEventReceiver in Cython for speed,
> but if you want to use a Python callable then this converter is used.
> This code is very performance critical. And, the *loop* in question sits
> deep inside a C++ library.
> Good luck pre-acquiring the function pointer of self.callback in any useful
> way. Even if it is not exported by the class, that could be overridden by a
> subclass. I stress the fact that this is real world code by yours truly
> (unfortunately not open source, it wraps a closed source library).
> Yes, you can tell users to be mindful of this and make as much as possible
> local variables, introduce final modifiers and __nomonkey__ and whatnot, but
> that's a large price to pay to avoid hacking tp_flags.
> Dag

I suppose for this case it might be faster to check whether the world is
still sane (i.e. whether the callback or function is still the object you
expect it to be) in addition to checking whether the function pointer has
already been unpacked. You don't really want to store that extra
information in objects, but for global variables it might be worthwhile
(unless you're doing import * :)). So we definitely always need a fast
dispatcher, but we may do slightly better in some cases if we care to
implement it. I bet no one will care about shaving off those last 2
nanoseconds though :)
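
A minimal sketch of that sanity check for a global, again in plain Python and not in generated C. The `_c_entry` attribute and the `GuardedGlobalCall` class are hypothetical names used only for illustration; the idea is just to compare the current object by identity against the one the cache was built for, and to unpack again only when they differ.

```python
class GuardedGlobalCall:
    """Cache the unpacked fast path for a global name, revalidated by identity."""

    def __init__(self, namespace, name):
        self.namespace = namespace   # e.g. a module's globals() dict
        self.name = name
        self.seen = None             # object the cache was built for
        self.fast = None             # cached fast entry point, or None

    def call(self, *args):
        current = self.namespace[self.name]
        if current is not self.seen:
            # The name was rebound since the last call: unpack again.
            self.seen = current
            # `_c_entry` is a made-up attribute standing in for the capsule.
            self.fast = getattr(current, "_c_entry", None)
        target = self.fast if self.fast is not None else current
        return target(*args)
```

Keeping a reference to the object the cache was built for also means its identity cannot be reused by a different object while the cache is alive. The model still does a dict lookup on every call, so it only illustrates the guard, not the possible saving.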
