[Cython] New early-binding concept [was: CEP1000]

Stefan Behnel stefan_ml at behnel.de
Fri Apr 20 08:21:41 CEST 2012


Robert Bradshaw, 20.04.2012 02:52:
> On Thu, Apr 19, 2012 at 3:53 AM, mark florisson wrote:
>> On 19 April 2012 08:17, Dag Sverre Seljebotn wrote:
>>> On 04/19/2012 08:41 AM, Stefan Behnel wrote:
>>>> Dag Sverre Seljebotn, 18.04.2012 23:35:
>>>>>
>>>>> from numpy import sqrt, sin
>>>>>
>>>>> cdef double f(double x):
>>>>>     return sqrt(x * x) # or sin(x * x)
>>>>>
>>>>> Of course, here one could get the pointer in the module at import time.
>>>>
>>>> That optimisation would actually be very worthwhile all by itself. I mean,
>>>> we know what signatures we need for globally imported functions throughout
>>>> the module, so we can reduce the call to a single jump through a function
>>>> pointer (although likely with a preceding NULL check, which the branch
>>>> prediction would be happy to give us for free). At least as long as sqrt
>>>> is not being reassigned, but that should hit the 99% case.
>>>>
>>>>> However, here:
>>>>>
>>>>> from numpy import sqrt
>>> Correction: "import numpy as np"
>>>>>
>>>>> cdef double f(double x):
>>>>>     return np.sqrt(x * x) # or np.sin(x * x)
>>>>>
>>>>> the __getattr__ on np sure is larger than any effect we discuss.
>>>>
>>>> Yes, that would have to stay a .pxd case, I guess.
>>>
>>> How about this mini-CEP:
>>>
>>> Modules are allowed to specify __nomonkey__ (or __const__, or
>>> __notreassigned__), a list of strings naming module-level variables where
>>> "we don't hold you responsible if you assume no monkey-patching of these".
>>>
>>> When doing "import numpy as np", then (assuming "np" is never reassigned in
>>> the module), at import time we check all names looked up from it in
>>> __nomonkey__, and if so treat them as "from numpy import sqrt as 'np.sqrt'",
>>> i.e. the "np." is just a namespace mechanism.
>>
>> I like the idea. I think this could be generalized to a 'final'
>> keyword, that could also enable optimizations for cdef class
>> attributes. So you'd say
>>
>> cdef final object np
>> import numpy as np
>>
>> For class attributes this would tell the compiler that it will not be
>> rebound, which means you could check if attributes are initialized in
>> the initializer, or just pull such checks (as well as bounds checks),
>> at least for memoryviews, out of loops, without worrying whether it
>> will be reassigned in the meantime.
> 
> final is a nice way to describe this. If we were to introduce a new
> keyword, static might do as well.
> 
> It seems more natural to do this in the numpy.pxd file (perhaps it
> could just be declared as a final object) and that would allow us to
> not worry about re-assignment. Cython could then try to keep that
> contract for any modules it compiles. (This is, however, a bit more
> restrictive, though one can always cimport and import modules under
> different names.)

However, it's actually not the module that's "final" in this regard but the
functions it exports - *they* do not change and neither do their C
signatures. So the "final" modifier should stick to the functions (possibly
declared at the "cdef extern" line), which would then allow us to resolve
and cache the C function pointers at import time.

That mirrors how the current "final" classes and methods work, where we
resolve the method pointers at compile time. And numpy.pxd is the perfect
place to declare this, rather than the import statement.

Stefan

