[Cython] New early-binding concept [was: CEP1000]

Robert Bradshaw robertwb at gmail.com
Fri Apr 20 02:52:53 CEST 2012


On Thu, Apr 19, 2012 at 3:53 AM, mark florisson
<markflorisson88 at gmail.com> wrote:
> On 19 April 2012 08:17, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>> On 04/19/2012 08:41 AM, Stefan Behnel wrote:
>>>
>>> Dag Sverre Seljebotn, 18.04.2012 23:35:
>>>>
>>>> from numpy import sqrt, sin
>>>>
>>>> cdef double f(double x):
>>>>     return sqrt(x * x) # or sin(x * x)
>>>>
>>>> Of course, here one could get the pointer in the module at import time.
>>>
>>>
>>> That optimisation would actually be very worthwhile all by itself. I mean,
>>> we know what signatures we need for globally imported functions throughout
>>> the module, so we can reduce the call to a single jump through a function
>>> pointer (although likely with a preceding NULL check, which the branch
>>> prediction would be happy to give us for free). At least as long as sqrt
>>> is not being reassigned, but that should hit the 99% case.
>>>
>>>
>>>> However, here:
>>>>
>>>> from numpy import sqrt
>>
>>
>> Correction: "import numpy as np"
>>
>>>>
>>>> cdef double f(double x):
>>>>     return np.sqrt(x * x) # or np.sin(x * x)
>>>>
>>>> the __getattr__ on np sure is larger than any effect we discuss.
>>>
>>>
>>> Yes, that would have to stay a .pxd case, I guess.
>>
>>
>> How about this mini-CEP:
>>
>> Modules are allowed to specify __nomonkey__ (or __const__, or
>> __notreassigned__), a list of strings naming module-level variables where
>> "we don't hold you responsible if you assume no monkey-patching of these".
>>
>> When doing "import numpy as np", then (assuming "np" is never reassigned in
>> the module), at import time we check all names looked up from it in
>> __nomonkey__, and if so treat them as "from numpy import sqrt as 'np.sqrt'",
>> i.e. the "np." is just a namespace mechanism.
>
> I like the idea. I think this could be generalized to a 'final'
> keyword, that could also enable optimizations for cdef class
> attributes. So you'd say
>
> cdef final object np
> import numpy as np
>
> For class attributes this would tell the compiler that it will not be
> rebound, which means you could check if attributes are initialized in
> the initializer, or just pull such checks (as well as bounds checks),
> at least for memoryviews, out of loops, without worrying whether it
> will be reassigned in the meantime.

final is a nice way to describe this. If we were to introduce a new
keyword, static might do as well.

It seems more natural to do this in the numpy.pxd file (perhaps it
could just be declared as a final object) and that would allow us to
not worry about re-assignment. Cython could then try to keep that
contract for any modules it compiles. (This is, however, a bit more
restrictive, though one can always cimport and import modules under
different names.)
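
To make the import-time check from the mini-CEP quoted above a bit more
concrete, here is a rough pure-Python sketch. Everything in it is an
assumption for illustration only: `__nomonkey__` is just the name proposed
in this thread, `bind_early` and `fake_np` are made-up helpers, and none of
this exists in Cython or NumPy. It only shows the contract: names the module
declares as never reassigned are looked up once, everything else keeps the
normal late-bound attribute lookup.

```python
import math
import types


def bind_early(module, names):
    """Look up once every name the module declares as never reassigned.

    Names that are not in the module's (hypothetical) __nomonkey__ list map
    to None, meaning the caller has to keep doing a normal attribute lookup.
    """
    declared = set(getattr(module, "__nomonkey__", ()))
    return {
        name: getattr(module, name) if name in declared else None
        for name in names
    }


# Stand-in for numpy: a throwaway module that opts in for sqrt only.
fake_np = types.ModuleType("fake_np")
fake_np.sqrt = math.sqrt
fake_np.sin = math.sin
fake_np.__nomonkey__ = ["sqrt"]

# Done once, at "import time" of the using module.
bound = bind_early(fake_np, ["sqrt", "sin"])


def f(x):
    sqrt = bound["sqrt"]
    if sqrt is None:
        # Not declared stable: fall back to the late-bound lookup.
        sqrt = fake_np.sqrt
    return sqrt(x * x)
```

With this, rebinding `fake_np.sqrt` after the binding step no longer affects
`f`, which is exactly the behaviour the module signed up for by listing the
name. In compiled code the dictionary lookup would of course be replaced by
a stored pointer.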

- Robert

