[Cython] CEP1000: Native dispatch through callables

Stefan Behnel stefan_ml at behnel.de
Fri Apr 13 13:38:56 CEST 2012

Robert Bradshaw, 13.04.2012 12:17:
> On Fri, Apr 13, 2012 at 1:52 AM, Dag Sverre Seljebotn wrote:
>> On 04/13/2012 01:38 AM, Robert Bradshaw wrote:
>>> Have you given any thought as to what happens if __call__ is
>>> re-assigned for an object (or subclass of an object) supporting this
>>> interface? Or is this out of scope?
>> Out-of-scope, I'd say. Though you can always write an object that detects if
>> you assign to __call__...

+1 for out of scope. This is a pure C level feature.

>>> Minor nit: I don't think should_dereference is worth branching on, if
>>> one wants to save the allocation one can still use a variable-sized
>>> type and point to oneself. Yes, that's an extra dereference, but the
>>> memory is already likely close and it greatly simplifies the logic.
>>> But I could be wrong here.
>> Those minor nits are exactly what I seek; since Travis will have the first
>> implementation in numba<->SciPy, I just want to make sure that what he does
>> will work efficiently with Cython.
> +1
> I have to admit building/invoking these var-arg-sized __nativecall__
> records seems painful. Here's another suggestion:
> struct {
>     void* pointer;
>     size_t signature; // compressed binary representation, 95% coverage
>     char* long_signature; // used if signature is not representable in
>                           // a size_t, as indicated by signature = 0
> } record;
> These char* could optionally be allocated at the end of the record*
> for optimal locality. We could even dispense with the binary
> signature, but having that option allows us to avoid strcmp for stuff
> like d)d and ffi)f.

Assuming we use literals and a const char* for the signature, the C
compiler would cut down the number of signature strings automatically for
us. And a pointer comparison is the same as a size_t comparison.

That would only apply at a per-module level, though, so it would require an
indirection for the signature IDs. But it would avoid a global registry.
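
A minimal sketch of the literal-based comparison, assuming a record that carries only a function pointer and a const char* signature. The names native_record and signature_matches are hypothetical, and the strcmp fallback for strings that come from another module is an assumption of this sketch, not part of the proposal:

```c
#include <string.h>

/* Hypothetical record layout: a pointer plus a signature literal. */
typedef struct {
    void *pointer;
    const char *signature;   /* e.g. "d)d" or "ffi)f" */
} native_record;

/* Returns 1 if the two signatures are equal, 0 otherwise. */
static int signature_matches(const char *expected, const char *found)
{
    /* Fast path: identical literals merged by the compiler within one
       module share an address, so this costs the same as comparing
       two size_t values. */
    if (expected == found)
        return 1;
    /* Slow path (assumed): strings from another module live at a
       different address, so compare their contents. */
    return strcmp(expected, found) == 0;
}
```

The fast path only helps within one module, which is why signature IDs shared between modules would need the extra indirection.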

Another idea would be to set the signature ID field to 0 at the beginning
and call a C-API function to let the current runtime assign an ID > 0,
unique for the currently running application. Then every user would only
have to parse the signature once to adapt to the respective ID and could
otherwise branch based on it directly.
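
Such a C-API function could look roughly like the sketch below. The name runtime_signature_id, the fixed-size table and the linear search are all assumptions made for illustration. A real runtime would need a growable and thread-safe table.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SIGNATURES 64   /* arbitrary limit for this sketch */

static const char *known_signatures[MAX_SIGNATURES];
static size_t known_count = 0;

/* Hypothetical runtime function: returns an ID > 0 that is unique per
   signature string within the running process, or 0 if the table is
   full. The string is assumed to outlive the table (e.g. a literal). */
static size_t runtime_signature_id(const char *signature)
{
    size_t i;
    for (i = 0; i < known_count; i++) {
        if (strcmp(known_signatures[i], signature) == 0)
            return i + 1;            /* already registered */
    }
    if (known_count == MAX_SIGNATURES)
        return 0;                    /* no ID available */
    known_signatures[known_count] = signature;
    return ++known_count;            /* IDs start at 1, 0 means "unset" */
}
```

A module would call this once for each record whose ID field is still 0 and store the result in the record.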

For Cython, we could generate a static ID variable for each typed call that
we found in the sources. When encountering a C signature on a callable,
either a) the ID variable is still empty (initial case), then we parse the
signature to see if it matches the expected signature. If it does, we
assign the corresponding ID to the static ID variable and issue a direct
call. If b) the ID field is already set (normal case), we compare the
signature IDs directly and issue a C call if they match. If the IDs do not
match, we issue a normal Python call.
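
The two cases above can be sketched for a single call site that expects the signature "d)d". The record layout, the names native_record, call_site_id and try_native_call_d_d, and the example callee twice are hypothetical. The sketch also stores a generic function pointer instead of the void* from the proposal, only so that the casts stay within ISO C.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_func)(void);
typedef double (*d_d_func)(double);

/* Hypothetical record: the runtime is assumed to have assigned
   signature_id (> 0) already; 0 means "not assigned yet". */
typedef struct {
    generic_func pointer;
    size_t signature_id;
    const char *signature;
} native_record;

/* Static ID variable generated for one typed call expecting "d)d". */
static size_t call_site_id = 0;

/* Returns 1 and stores the result if a direct C call was issued,
   0 if the caller should fall back to a normal Python call. */
static int try_native_call_d_d(const native_record *rec, double arg,
                               double *result)
{
    if (call_site_id == 0) {
        /* a) initial case: parse (here: compare) the signature once */
        if (strcmp(rec->signature, "d)d") != 0)
            return 0;
        call_site_id = rec->signature_id;  /* stays 0 if still unassigned */
    } else if (rec->signature_id != call_site_id) {
        /* b) normal case, IDs differ */
        return 0;
    }
    *result = ((d_d_func)rec->pointer)(arg);
    return 1;
}

/* Example callee with the C signature double(double). */
static double twice(double x)
{
    return 2.0 * x;
}
```

After the first successful match, every later call through this site costs one integer comparison before the direct C call.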

>> Right... if we do some work to synchronize the types for Cython modules
>> generated by the same version of Cython, we're left with 3-4 types for
>> Cython, right? Then a couple for numba and one for f2py; so on the order of
>> 10?
> No, I think each closure is its own type.

And that even applies to fused functions, right? They'd have one closure
for each type combination.

>> An alternative is do something funny in the type object to get across the
>> offset-in-object information (abusing the docstring, or introduce our own
>> flag which means that the type object has an additional non-standard field
>> at the end).
> It's a hack, but the flag + non-standard field idea might just work...

Plus, it wouldn't have to stay a non-standard field. If it's accepted into
CPython 3.4, we could safely use it in all existing versions of CPython.

