[Cython] local variable handling in generators
Dag Sverre Seljebotn
d.s.seljebotn at astro.uio.no
Sun May 22 21:48:57 CEST 2011
On 05/22/2011 02:33 PM, Stefan Behnel wrote:
> Hi,
>
> I've been looking at the nqueens benchmark for a while, and I think it's
> actually not that a bad benchmark for generators.
>
> http://hg.python.org/benchmarks/file/tip/performance/bm_nqueens.py
>
> A better implementation only for Py2.7/Py3 is here:
>
> https://github.com/cython/cython/blob/master/Demos/benchmarks/nqueens.py
>
> Cython currently runs the first implementation about as fast as Py3.3:
>
> https://sage.math.washington.edu:8091/hudson/job/cython-devel-pybenchmarks-py3k/lastSuccessfulBuild/artifact/chart.html
>
>
> and the second one more than 3x as fast:
>
> https://sage.math.washington.edu:8091/hudson/view/bench/job/cython-devel-cybenchmarks-py3k/lastSuccessfulBuild/artifact/chart.html
>
>
> However, I think there's still some space for improvements, and local
> variables are part of that. For generator functions that do non-trivial
> things between yields, I think that local variables will quickly become
> a bottleneck. Currently, they are always closure fields, so any access
> to them will use a pointer indirection to a foreign struct, originally
> passed in as an argument to the function. Given that generators often do
> Python object manipulation through C-API calls, any such call will
> basically require the C compiler to assume that all values in the
> closure may have changed, thus disabling any optimisations for them. The
> same applies to many other object related operations or pointer
> operations (even DECREF!), as the C compiler cannot know that the
> generator function owns the closure during its lifetime exclusively.
>
> I think it would be worth changing the current implementation to use
> local C variables for local Cython variables in the generator, and to
> copy the values back into/from the closure around yields. I'd even let
> local Python references start off as NULL when the generator is created,
> given that Vitek's branch can eliminate None initialisations now.
Keep in mind that if speed is the objective, another idea is to use real
C coroutines. This would likely be faster than anything we can make up
ourselves; a single stack jump is bound to be faster than copying things
in and out of the stack.
Of course, probably more work. And then portability as the API is
seperate for Windows (fibers) and POSIX (makecontext). But I don't think
there's a lack of compatability layer libraries which unite the
platform-specific APIs.
Dag Sverre
More information about the cython-devel
mailing list