[Cython] [Python-Dev] C-level duck typing

mark florisson markflorisson88 at gmail.com
Thu May 17 13:30:06 CEST 2012


On 17 May 2012 11:03, Stefan Behnel <stefan_ml at behnel.de> wrote:
> mark florisson, 17.05.2012 11:15:
>> On 17 May 2012 07:09, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>> mark florisson, 16.05.2012 21:49:
>>>> On 16 May 2012 20:15, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>>>> "Martin v. Löwis", 16.05.2012 20:33:
>>>>>>> Does this use case make sense to everyone?
>>>>>>>
>>>>>>> The reason why we are discussing this on python-dev is that we are looking
>>>>>>> for a general way to expose these C level signatures within the Python
>>>>>>> ecosystem. And Dag's idea was to expose them as part of the type object,
>>>>>>> basically as an addition to the current Python level tp_call() slot.
>>>>>>
>>>>>> The use case makes sense, yet there is also a long-standing solution
>>>>>> already to expose APIs and function pointers: the capsule objects.
>>>>>>
>>>>>> If you want to avoid dictionary lookups on the server side, implement
>>>>>> tp_getattro, comparing addresses of interned strings.
>>>>>
>>>>> I think Martin has a point there. Why not just use a custom attribute on
>>>>> callables that hold a PyCapsule? Whenever we see inside of a Cython
>>>>> implemented function that an object variable that was retrieved from the
>>>>> outside, either as a function argument or as the result of a function call,
>>>>> is being called, we try to unpack a C function pointer from it on all
>>>>> assignments to the variable. If that works, we can scan for a suitable
>>>>> signature (either right away or lazily on first access) and cache that. On
>>>>> each subsequent call through that variable, the cached C function will be used.
>>>>>
>>>>> That means we'd replace Python variables that are being called by multiple
>>>>> local variables, one that holds the object and one for each C function with
>>>>> a different signature that it is being called with. We set the C function
>>>>> variables to NULL when the Python function variable is being assigned to.
>>>>> When the C function variable is NULL on call, we scan for a matching
>>>>> signature and assign it to the variable.  When no matching signature can be
>>>>> found, we set it to (void*)-1.
>>>>>
>>>>> Additionally, we allow explicit user casts of Python objects to C function
>>>>> types, which would then try to unpack the C function, raising a TypeError
>>>>> on mismatch.
>>>>>
>>>>> Assignments to callable variables can be expected to occur much less
>>>>> frequently than calls to them, so this will give us a good trade-off in
>>>>> most cases. I don't see why this kind of caching would be any slower inside
>>>>> of loops than what we were discussing so far.
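
[Editorial sketch, not part of the original message.] The caching scheme described above can be modelled in a few lines of pure Python. In generated C the cache slots would be C function pointers, with NULL meaning "not scanned yet" and (void*)-1 meaning "no matching signature"; here two sentinels play those roles. The class name, the `c_signatures` attribute and the signature strings are hypothetical and only serve the illustration.

```python
# Pure-Python model of the per-variable cache. All names are hypothetical.

UNRESOLVED = None      # plays the role of NULL: signature not scanned yet
NO_MATCH = object()    # plays the role of (void*)-1: scanned, nothing found


class CallableSlot:
    """A called Python variable plus one cache entry per C signature."""

    def __init__(self, obj=None):
        self._obj = obj
        self._cache = {}
        self.scans = 0          # counts signature scans, for illustration

    def assign(self, obj):
        # Assigning to the variable resets all cached entry points.
        self._obj = obj
        self._cache.clear()

    def call(self, signature, *args):
        fast = self._cache.get(signature, UNRESOLVED)
        if fast is UNRESOLVED:
            # Lazy scan on the first call with this signature.
            self.scans += 1
            table = getattr(self._obj, "c_signatures", {})
            fast = table.get(signature, NO_MATCH)
            self._cache[signature] = fast
        if fast is NO_MATCH:
            return self._obj(*args)   # generic Python call
        return fast(*args)            # cached "C" entry point


class Doubler:
    """Example callable that exposes one 'C-level' signature."""
    c_signatures = {"d(d)": lambda x: x * 2.0}

    def __call__(self, x):
        return x * 2.0
```

Repeated calls through the same slot scan only once per signature, whether or not a match was found; a new scan happens only after `assign()`. That is the trade-off argued for above: assignments are rare, calls are frequent.
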
>>>>
>>>> This works really well for local variables, but for globals, def
>>>> methods or callbacks as attributes, this won't work so well, as they
>>>> may be rebound at any time outside of the module scope.
>>>
>>> Only half true for globals, which can be declared "cdef object", e.g. for
>>> imported names. That would allow Cython to see all possible reassignments
>>> in a module, which would then apply the above scheme.
>>
>> I suppose by default they could be properties of a module subclass.
>> That would also allow faster lookup of globals visible from Python in
>> Cython space in the same module (but probably slower from outside).
>
> Yes, that's another way to do it and yet another nice feature (which we've
> already been throwing into the discussions for years and years...)
>
>
>>> I don't think def methods are a use case for this because you'd either
>>> cpdef them or even cdef them if you want speed. If you want them to be
>>> overridable, you'll have to live with the speed penalty that that implies.
>>
>> Which means you can no longer pass stuff around as a callback
>
> Callbacks are not a problem because they behave like any other object
> that's being held in a variable (i.e. they are the normal case, not the
> exception). I was referring to globally defined def functions which can be
> reassigned in the module. That's a problem, but it's mostly the same as
> with any global name.
>

Oh, I see what you were referring to now. I think Vitja already
implemented something like the inline def calls, although I'm not sure
what the status of that is.

>>> For object attributes, you have to pay the penalty of a lookup anyway, no
>>> way around that.
>>
>> Not in a cdef class.
>
> Sure, also in cdef classes. Nothing keeps me from reassigning to the
> attribute of a cdef class. The difference is only that the attribute lookup
> is faster for them because it passes through a pointer indirection instead
> of a dict lookup. Apart from that, it's entirely the same thing.

I guess we're talking about different things again; I thought you meant
dict lookups. Basically, with None-checking disabled (the default), a
cdef class attribute lookup is just a struct member access.
Anyway, my point was that caching pointers is a good idea, but only
works in limited cases, and we shouldn't limit the programming model
of users to enable fast calls. But if everyone agrees we need fast
dispatching, it seems we're already on the same page :)

> No, actually, it's even worse because you can't just hook into the dict (as
> Vitja did recently) and check if it has changed. You actually need to read
> the attribute again and then look up its C functions again, because even if
> the object pointer is the same, it doesn't mean that the object is the
> same, unless you keep an owned reference to it (which you can't without
> keeping the object alive). So there is even less of a chance for efficient
> caching.
>
> Stefan
