[Cython] CEP1000: Native dispatch through callables

Stefan Behnel stefan_ml at behnel.de
Fri Apr 13 21:52:33 CEST 2012


Robert Bradshaw, 13.04.2012 20:21:
> On Fri, Apr 13, 2012 at 10:26 AM, Robert Bradshaw wrote:
>> On Fri, Apr 13, 2012 at 4:59 AM, Dag Sverre Seljebotn wrote:
>>> On 04/13/2012 01:38 PM, Stefan Behnel wrote:
>>>> That would only apply at a per-module level, though, so it would
>>>> require an indirection for the signature IDs. But it would avoid a
>>>> global registry.
>>>>
>>>> Another idea would be to set the signature ID field to 0 at the beginning
>>>> and call a C-API function to let the current runtime assign an ID>  0,
>>>> unique for the currently running application. Then every user would only
>>>> have to parse the signature once to adapt to the respective ID and could
>>>> otherwise branch based on it directly.
>>>>
>>>> For Cython, we could generate a static ID variable for each typed call
>>>> that
>>>> we found in the sources. When encountering a C signature on a callable,
>>>> either a) the ID variable is still empty (initial case), then we parse the
>>>> signature to see if it matches the expected signature. If it does, we
>>>> assign the corresponding ID to the static ID variable and issue a direct
>>>> call. If b) the ID field is already set (normal case), we compare the
>>>> signature IDs directly and issue a C call it they match. If the IDs do not
>>>> match, we issue a normal Python call.
>>
>> If I understand correctly, you're proposing
>>
>> struct {
>>  char* sig;
>>  long id;
>> } sig_t;
>>
>> Where comparison would (sometimes?) compute id from sig by augmenting
>> a global counter and dict? Might be expensive to bootstrap, but
>> eventually all relevant ids would be filled in and it would be quick.

Yes. If a function is only called once, the overhead won't matter. And
starting from the second call, it would either be fast if the function
signature matches or slow anyway if it doesn't match.


>> Interesting. I wonder what the performance penalty would be over
>> assuming id is statically computed lots of the time, and using that to
>> compare against fixed values. And there's memory locality issues as
>> well.
> 
> To clarify, I'd really like to have the following as fast as possible:
> 
> if (callable.sig.id == X) {
>    // yep, that's what I thought
> } else {
>    // generic call
> }
> 
> Alternatively, one can imagine wanting to do:
> 
> switch (callable.sig.id) {
>     case X:
>         // I can do this
>     case Y:
>         // this is common and fast as well
>     ...
>     default:
>         // generic call
> }

Yes, that's the idea.


> There is some question about how promotion should work (e.g. should
> this flexibility reside in the caller or the callee (or both, though
> that could result in a quadratic number of comparisons)?)

Callees could expose multiple signatures (which would result in a direct
call for each, without further comparisons), then the caller would have to
choose between those. However, if none matches exactly, the caller might
want to promote its arguments and try more signatures. In any case, it's
the caller that does the work, never the callee.

We could generate code like this:

    /* cdef int x = ...
     * cdef long y = ...
     * cdef int z       # interesting: what if z is not typed?
     * z = func(x, y)
     */

    if (func.sig.id == id("[int,long] -> int")) {
         z = ((cast)func.cfunc) (x,y);
    } else if (sizeof(long) > sizeof(int) &&
               (func.sig.id == id("[long,long] -> int"))) {
         z = ((cast)func.cfunc) ((long)x, y);
    } etc. ... else {
         /* pack and call as Python function */
    }

Meaning, the C compiler could reduce the amount of optimistic call code at
compile time.

Stefan


More information about the cython-devel mailing list