[Cython] Bindings performance issue

mark florisson markflorisson88 at gmail.com
Thu Jun 2 23:03:38 CEST 2011

On 2 June 2011 22:16, Robert Bradshaw <robertwb at math.washington.edu> wrote:
> On Thu, Jun 2, 2011 at 1:00 PM, mark florisson
> <markflorisson88 at gmail.com> wrote:
>> On 2 June 2011 18:31, Robert Bradshaw <robertwb at math.washington.edu> wrote:
>>> On Thu, Jun 2, 2011 at 1:30 AM, Vitja Makarov <vitja.makarov at gmail.com> wrote:
>>>> 2011/6/2 Robert Bradshaw <robertwb at math.washington.edu>:
>>>>> On Wed, Jun 1, 2011 at 7:26 AM, Vitja Makarov <vitja.makarov at gmail.com> wrote:
>>>>>> 2011/6/1 mark florisson <markflorisson88 at gmail.com>:
>>>>>>> On 31 May 2011 20:25, Vitja Makarov <vitja.makarov at gmail.com> wrote:
>>>>>>>> Hi!
>>>>>>>> Is bindings performance issue valuable?
>>>>>>>> $ cat bindbench.pyx
>>>>>>>> def wo_bindings():
>>>>>>>>    pass
>>>>>>>> def outer():
>>>>>>>>    def inner():
>>>>>>>>        pass
>>>>>>>>    return inner
>>>>>>>> with_bindings = outer()
>>>>>>>> $ python
>>>>>>>>>>> import timeit
>>>>>>>>>>> timeit.repeat('with_bindings()', setup='from bindbench import wo_bindings, with_bindings', repeat=1, number=100000000)
>>>>>>>> [6.169871807098389]
>>>>>>>>>>> timeit.repeat('wo_bindings()', setup='from bindbench import wo_bindings, with_bindings', repeat=1, number=100000000)
>>>>>>>> [4.609416961669922]
>>>>>>>> PyCBindings makes it 1.3 (difference is about 15ns on my laptop) times
>>>>>>>> slower for CPython interpreter execution.
>>>>>>>> As CPython has some optimizations for CFunctions and PyCFunctions.
>>>>>>>> Does it make sense for us? Or we can easily switch to bindings?
>>>>>>>> --
>>>>>>>> vitja.
>>>>>>>> _______________________________________________
>>>>>>>> cython-devel mailing list
>>>>>>>> cython-devel at python.org
>>>>>>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>>>>> I think switching should be fine, if you'd desperately need the speed
>>>>>>> you'd be calling c(p)def functions from Cython. In fact, when the
>>>>>>> fused cfunction will be ready it will be even slightly slower, as it
>>>>>>> overrides the tp_call. But perhaps that should really be made into a
>>>>>>> subclass...
>>>>> Yes, this should be a subclass, slowing all calls down is a bad idea.
>>>>>>> Anyway, would you use these for Python classes and module-level def
>>>>>>> functions (and closures) only, or would you also use them in extension
>>>>>>> classes? Because I found at least a bit of problem there, as extension
>>>>>>> class methods get 'self' passed as the first argument to the C
>>>>>>> function (as PyCFunctionObject.m_self), whereas methods in Python
>>>>>>> classes get 'self' passed in the args tuple (and the m_self is
>>>>>>> unused).
>>>>>> Recently I've found a problem with static methods (__new__ for
>>>>>> example) of inner classes (classes created inside a function, for
>>>>>> instance)
>>>>>> I think binding version should be used for all regular def functions
>>>>>> (defs, staticmethods, classmethods and so on).
>>>>> +1 to moving that way. (Well, static and class methods bind
>>>>> differently, right?)
>>>>>> I also think that it's better to rename binding_PyCFunction_Type to
>>>>>> something like CyFunctionType and make it more like PyFunction.
>>>>> +1
>>>>> Is there any advantage to the method descriptors (other than that
>>>>> they're easier for a human to write?)
>>>> Initially bindings was written to support bound class methods (am I right?)
>>>> So when we use it for regular functions 'binding' in the name doesn't
>>>> reflect its purpose.
>>> There's three kinds of functions we create: PyCFunctions, binding
>>> PyCFunctions, and extension class methods (via descriptors). I was
>>> asking about the latter.
>> Yeah those descriptors have no advantage at all, it's simple
>> convenience. So we could just use our binding descriptor. The only
>> difference is that those descriptors return PyCFunctionObjects (which
>> may have special-cased performance-related advantages, at Vitja
>> demonstrated), and we would return a binding function (that would have
>> the same dict as the original).
> We could do this too on binding if we wanted...

That's what I mean, this all happens when we bind the function. If we
return a bound PyCFunctionObject, we lose our dict.

>> I still believe that you'd have to do something different for
>> extension classes (e.g. depending on a flag pass 'self' as m_self or
>> in the args tuple, so tp_call would have to be overridden). Besides,
>> if you're going to return a binding pycfunction from descr_get, you'll
>> want to do type checking for 'self' for unbound method calls. I got
>> all that working for normal python classes here:
>> https://github.com/markflorisson88/cython/tree/fusedtypes . Currently
>> it is intermixed with support for fused Python functions, but seeing
>> you'd override tp_call anyway the overhead should be negligible.
> I was thinking of avoiding descriptors altogether, and simply
> populating the cdef class dict.

Yes, that is my proposal. Without populating the class dict, you
cannot have custom pycfunctions.

>>> As for the first two, it reflects the difference in behavior. If I
>>> take the function and assign it to a class, it will bind as a method
>>> when I go to look it up. This is arguably the biggest difference
>>> between built-in functions and Python functions.
>>>> One the other hand it's much more easy to write manually.
>>>> About staticmethods: I think that CyFunction type should handle it as well
>>>> because staticmethods can have custom attributes and act just like
>>>> normal def one.
>>>> The difference is insde descr_get it should return bound method for
>>>> normal and self for staticmethods.
>>> Yep. Does that merit a new class, or a flag that's checked on every
>>> function invocation (false 99.9% of the time)?
>> Well, it only needs to do the check for when it gets bound (i.e., in
>> tp_descr_get). But I wonder if that check is really necessary? Why not
>> simply call the builtin staticmethod() to wrap a (binding) pycfunction
>> and put it in the class's dict? I think that that is how it works
>> currently. And when you wrap something with staticmethod() you expect
>> it to be a staticmethod object, i.e. uncallable and without a dict.
> If we don't care much about the speed of static and class methods, we
> could just use Pythons stanadard wrappers (and decorators) rather than
> implement this ourselves. I'm leaning towards that. Maybe eventually
> we'll want to support cdef static/class methods, but that would be an
> entirely different mechanism.

Right, it seems that that's the thing that already happens. For cdef
classes we are using PyDescr_NewClassMethod and for Python classes we
are using PyClassMethod_New (in __Pyx_Method_ClassMethod). So this
conveniently fixes the 'self' (or 'cls') problem for classmethods. Of
course for normal methods the 'self' problem remains as you'd want to
keep the dict after binding and only want to implement one class
(implementing two types, one for Python classes and one for cdef
classes would be a bit inconvenient, considering also that fused types
need to be supported in either class (or in either subclass)).

>>>> Here I've tried to support staticmethods inside bindings type:
>>>> https://github.com/vitek/cython/commit/c0725ab340a8173d8e6724c62be3a135df58980e
>>>>> I think that small speed regression for better compatibility is OK if
>>>>> we add a directive to not create binding functions. (It'd be nice if
>>>>> we could work around it, but that's hard as CPython has special
>>>>> hooks...) The bigger issue is that the binding behavior is backwards
>>>>> incompatible, and perhaps in a subtle way. Perhaps we need to have a
>>>>> phase where all currently non-binding functions that will become
>>>>> binding functions will raise an error (or at least a warning) to wean
>>>>> people off of the old behavior before making a switch.
>>>> Is the difference so significant?
>>> If anyone is assigning a Cython function to an object and then using
>>> it they're counting on the current non-binding behavior, and it will
>>> break. The speed is probably a lesser issue, which is what benchmarks
>>> are for.
>> If you're binding functions to classes without expecting it to ever
>> bind, you don't really have bitching rights when stuff breaks later
>> on. You should have been using staticmethod() to begin with. And we
>> never said that our functions would never bind :)
> No, you're assigning it to an object, counting on being able to call
> it later on. E.g. the following is legal (though contrived in this
> example):
> sage: class A:
> ....:     pass
> ....:
> sage: a = A()
> sage: a.foo = max
> sage: a.foo([1,2,3])
> 3
> If instead of len, it was one of our functions, then it would be bad
> to suddenly change the semantics, because it could still run but
> produce bad answers (e.g. if we had implemented max, suddenly a would
> be included in the comparison). This is why I proposed raising an
> explicit error as an intermediate step.
> If we don't promise anything, the contract is whatever the code does.
> That's the problem with not having a specification (which would be
> really nice, but is a lot of work).

Functions on objects never get bound, they only get bound if they are
on the class. So your code would still work with binding functions.

> - Robert
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

More information about the cython-devel mailing list