[Cython] CEP1000: Native dispatch through callables

Robert Bradshaw robertwb at gmail.com
Fri Apr 13 12:17:09 CEST 2012

On Fri, Apr 13, 2012 at 1:52 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 04/13/2012 01:38 AM, Robert Bradshaw wrote:
>> On Thu, Apr 12, 2012 at 3:34 PM, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no>  wrote:
>>> On 04/13/2012 12:11 AM, Dag Sverre Seljebotn wrote:
>>>> Travis Oliphant recently raised the issue on the NumPy list of what
>>>> mechanisms to use to box native functions produced by his Numba so that
>>>> SciPy functions can call them, e.g. (I'm making the numba part up):
>>>> @numba # Compiles function using LLVM
>>>> def f(x):
>>>>     return 3 * x
>>>> print scipy.integrate.quad(f, 1, 2) # do many callbacks natively!
>>>> Obviously, we want something standard, so that Cython functions can also
>>>> be called in a fast way.
>>>> This is very similar to CEP 523
>>>> (http://wiki.cython.org/enhancements/nativecall), but rather than
>>>> Cython-to-Cython, we want something that both SciPy, NumPy, numba,
>>>> Cython, f2py, fwrap can implement.
>>>> Here's my proposal; Travis seems happy to implement something like it
>>>> for numba and parts of SciPy:
>>>> http://wiki.cython.org/enhancements/nativecall
>>> I'm sorry. HERE is the CEP:
>>> http://wiki.cython.org/enhancements/cep1000
>>> Since writing that yesterday, I've moved more in the direction of wanting
>>> a zero-terminated list of overloads instead of providing a count, having
>>> the fast protocol jump over the header (since the version is available
>>> elsewhere), and just demanding that the structure is sizeof(void*)-aligned
>>> in the first place rather than the complicated padding.
>> Great idea to coordinate with the many other projects here. Eventually
>> this could maybe even be a PEP.
>> Somewhat related, I'd like to add support for Go-style interfaces.
>> These would essentially be vtables of pre-fetched function pointers,
>> and could play very nicely with this interface.
> Yep; but you agree that this can be done in isolation without considering
> vtables first?

Yes, for sure.

>> Have you given any thought as to what happens if __call__ is
>> re-assigned for an object (or subclass of an object) supporting this
>> interface? Or is this out of scope?
> Out-of-scope, I'd say. Though you can always write an object that detects if
> you assign to __call__...
>> Minor nit: I don't think should_dereference is worth branching on, if
>> one wants to save the allocation one can still use a variable-sized
>> type and point to oneself. Yes, that's an extra dereference, but the
>> memory is already likely close and it greatly simplifies the logic.
>> But I could be wrong here.
> Those minor nits are exactly what I seek; since Travis will have the first
> implementation in numba<->SciPy, I just want to make sure that what he does
> will work efficiently with Cython.


I have to admit building/invoking these var-arg-sized __nativecall__
records seems painful. Here's another suggestion:

struct {
    void*  pointer;
    size_t signature;      // compressed binary representation, 95% coverage
    char*  long_signature; // used if the signature is not representable in
                           // a size_t, as indicated by signature == 0
} record;

These char* strings could optionally be allocated at the end of the record
itself for optimal locality. We could even dispense with the binary
signature, but having that option allows us to avoid strcmp for signatures
like "d)d" and "ffi)f".

> Can we perhaps just require that the information is embedded in the object?

I think not; this would require variably-sized objects (and would also use
up the object's variable-sized slot). Given that this is in a portion of
the program that is iterating over a Python tuple, I think the extra
dereference here is inconsequential.

> I must admit that when I wrote that I was mostly thinking of JIT-style code
> generation, where you only use should_dereference for code-generation. But
> yes, by converting the table to a C structure you can do without a JIT.
>> Also, I'm not sure the type registration will scale, especially if
>> every callable type wanted to get registered. (E.g. currently closures
>> and generators are new types...) Where to draw the line? (Perhaps
>> things could get registered lazily on the first __nativecall__ lookup,
>> as they're likely to be looked up again?)
> Right... if we do some work to synchronize the types for Cython modules
> generated by the same version of Cython, we're left with 3-4 types for
> Cython, right? Then a couple for numba and one for f2py; so on the order of
> 10?

No, I think each closure is its own type.

> An alternative is do something funny in the type object to get across the
> offset-in-object information (abusing the docstring, or introduce our own
> flag which means that the type object has an additional non-standard field
> at the end).

It's a hack, but the flag + non-standard field idea might just work...
Ah, don't you just love C :)

- Robert
