[Cython] CEP1000: Native dispatch through callables

Stefan Behnel stefan_ml at behnel.de
Thu Apr 19 08:41:40 CEST 2012


Dag Sverre Seljebotn, 18.04.2012 23:35:
> from numpy import sqrt, sin
> 
> cdef double f(double x):
>     return sqrt(x * x) # or sin(x * x)
> 
> Of course, here one could get the pointer in the module at import time.

That optimisation would actually be very worthwhile all by itself. I mean,
we know what signatures we need for globally imported functions throughout
the module, so we can reduce the call to a single jump through a function
pointer (although likely with a preceding NULL check, which branch
prediction would be happy to give us for free). This holds at least as long
as sqrt is not reassigned, which should cover the 99% case.
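
A minimal C sketch of that import-time binding, under stated assumptions:
all names here (unary_d_fn, cached_fn, bind_at_import, invalidate_binding,
slow_path, native_half) are hypothetical and are not part of Cython or
CEP 1000. The module keeps one pointer per imported function and needed
signature, fills it once at import, and resets it if the name is reassigned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature the module needs: double -> double. */
typedef double (*unary_d_fn)(double);

/* Module-level slot, filled once at import time.
 * NULL means "no native match, use the generic call path". */
static unary_d_fn cached_fn = NULL;

/* Counts slow-path calls so the chosen path is observable. */
static int slow_path_calls = 0;

/* Hypothetical stand-in for the generic (Python-level) call. */
static double slow_path(double v)
{
    slow_path_calls++;
    return v * 0.5;
}

/* Hypothetical native implementation offered by the callee. */
static double native_half(double v)
{
    return v * 0.5;
}

/* Import-time hook: store whatever pointer the callee offered (may be NULL). */
static void bind_at_import(unary_d_fn native)
{
    cached_fn = native;
}

/* To be called when the global name is reassigned: drop the binding. */
static void invalidate_binding(void)
{
    cached_fn = NULL;
}

/* Body of f(x): one predictable NULL check, then a jump through the pointer. */
static double f(double x)
{
    if (cached_fn != NULL)
        return cached_fn(x * x);
    return slow_path(x * x);
}
```

The NULL check is the only per-call overhead on the fast path; the cost of
finding the right signature is paid once, at import.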


> However, here:
> 
> import numpy as np
> 
> cdef double f(double x):
>     return np.sqrt(x * x) # or np.sin(x * x)
> 
> the __getattr__ on np sure is larger than any effect we discuss.

Yes, that would have to stay a .pxd case, I guess.


> From the numbers above, I think I'm ready to accept the "getfuncptr"
> approach penalty (1.6 ns for a direct hit, larger when the caller accepts
> more signatures) as acceptable, given the added flexibility. When you care
> about the 1.6 ns, you're always going to want to do early binding anyway.
> 
> However, just as I'm convinced about interning, there appear to be two new
> arguments for keys:
> 
>  - For a large number of overloads with getfuncptr, it can be much faster
> than interning. A 20ns difference starts to get interesting.

I don't think any of the numbers you presented marks any of the solutions
as "expensive" or "wrong". The advantage of a callback function for this is
that it is the most flexible solution that will most easily hit all use cases.

The only problem I see with getfuncptr() is that it shifts not only the
runtime work to the callee but also the development work, debugging,
optimisation, etc. We should provide a default implementation for non-JITs
in that case, preferably one that fits into a header file rather than
requiring a library. It could still become a set of C-API functions when
(if?) CPython starts to adopt this (and exposes it also for its builtins).
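
One way such a header-only default could look, as a rough sketch only: the
callee exposes a static table of (signature, pointer) pairs and a shared
inline helper does a linear search. The names (native_entry, native_table,
default_getfuncptr) and the "d->d" signature encoding are assumptions made
for illustration; the CEP defines its own encoding, and this is not its API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer type; callers cast back to the real signature. */
typedef void (*generic_fn)(void);

typedef struct {
    const char *signature;  /* hypothetical encoding, e.g. "d->d" */
    generic_fn  funcptr;
} native_entry;

typedef struct {
    const native_entry *entries;
    size_t              count;
} native_table;

/* Default lookup for non-JIT callees: linear search by signature string.
 * Returns NULL when the callee has no overload with that signature. */
static inline generic_fn default_getfuncptr(const native_table *table,
                                            const char *signature)
{
    assert(table != NULL && signature != NULL);
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].signature, signature) == 0)
            return table->entries[i].funcptr;
    }
    return NULL;
}

/* Example callee with two overloads (hypothetical). */
static double square_d(double x) { return x * x; }
static long   square_l(long x)   { return x * x; }

static const native_entry square_entries[] = {
    { "d->d", (generic_fn)square_d },
    { "l->l", (generic_fn)square_l },
};

static const native_table square_table = {
    square_entries,
    sizeof(square_entries) / sizeof(square_entries[0]),
};
```

A caller would ask for the signature it wants, cast the result back to the
matching function pointer type, and fall back to a generic call on NULL. A
JIT could replace the table search with its own callback that compiles the
requested specialisation on demand.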


>  - PSF GSoC proposals are not public yet, but I think I can say as much as
> that there's a PEP 3121 (multiple interpreter states) proposal, and that
> Martin von Löwis is favourable about it. If that goes anywhere it doesn't
> make interning impossible but it requires a shared C component and changing
> the spec from PyBytesObject to char*. Perhaps that can be done in a
> PEP-ification revision though.

I asked him what he thinks about the status of that PEP and he seems to be
unhappy about the current massive lack of evaluation data regarding the
general applicability and completeness of the infrastructure. One of the
outcomes of the GSoC would be that we learn what problems actually exist
and what needs to be done (if anything) to make this work for more code out
there. IMHO, that would be a very valuable result, also for us.

Note that the focus for the GSoC project is on the stdlib C modules.
Without those, general support in Cython wouldn't be very helpful for any
real-world code.

We should move any PEP 3121-related discussion to a separate (mammoth?)
thread and a new CEP, though (the tracker tickets are already there). This
is a large topic that is only loosely related to your CEP. Note that module
global C variables would no longer exist with PEP 3121 either. They would
move into a module struct (basically a module closure). So we'd pay with an
indirection already, for everything.
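
To illustrate the indirection, here is a small C sketch that contrasts a
plain C global with a per-module state struct. It does not use the CPython
API; module_state and the helper functions are hypothetical. In CPython
itself the struct would be sized by PyModuleDef.m_size and fetched with
PyModule_GetState().

```c
#include <assert.h>
#include <stddef.h>

typedef double (*unary_d_fn)(double);

/* Today: a C global, addressed directly by the generated code. */
static unary_d_fn global_cached_fn = NULL;

/* With per-module state: former globals become fields of a struct,
 * and every access goes through a pointer to that struct. */
typedef struct {
    unary_d_fn cached_fn;
    long       call_count;
} module_state;

static double twice(double x)  { return 2.0 * x; }
static double negate(double x) { return -x; }

/* Direct access: a single load of the global, then the call. */
static double call_via_global(double x)
{
    assert(global_cached_fn != NULL);
    return global_cached_fn(x);
}

/* Indirect access: load the field through the state pointer, then call. */
static double call_via_state(module_state *state, double x)
{
    assert(state != NULL && state->cached_fn != NULL);
    state->call_count++;
    return state->cached_fn(x);
}
```

Two module_state instances can hold different bindings at the same time,
which is what makes multiple module instances possible, at the price of the
extra pointer dereference on every access.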

Stefan

