[Cython] local variable handling in generators

Vitja Makarov vitja.makarov at gmail.com
Mon May 23 10:13:38 CEST 2011


2011/5/22 Stefan Behnel <stefan_ml at behnel.de>:
> Hi,
>
> I've been looking at the nqueens benchmark for a while, and I think it's
> actually not that bad a benchmark for generators.
>
> http://hg.python.org/benchmarks/file/tip/performance/bm_nqueens.py
>
> A better implementation only for Py2.7/Py3 is here:
>
> https://github.com/cython/cython/blob/master/Demos/benchmarks/nqueens.py
>
> Cython currently runs the first implementation about as fast as Py3.3:
>
> https://sage.math.washington.edu:8091/hudson/job/cython-devel-pybenchmarks-py3k/lastSuccessfulBuild/artifact/chart.html
>
> and the second one more than 3x as fast:
>
> https://sage.math.washington.edu:8091/hudson/view/bench/job/cython-devel-cybenchmarks-py3k/lastSuccessfulBuild/artifact/chart.html
>
> However, I think there's still some room for improvement, and local
> variables are part of that. For generator functions that do non-trivial
> things between yields, I think that local variables will quickly become a
> bottleneck. Currently, they are always closure fields, so any access to them
> will use a pointer indirection to a foreign struct, originally passed in as
> an argument to the function. Given that generators often do Python object
> manipulation through C-API calls, any such call will basically require the C
> compiler to assume that all values in the closure may have changed, thus
> disabling any optimisations for them. The same applies to many other object
> related operations or pointer operations (even DECREF!), as the C compiler
> cannot know that the generator function owns the closure during its lifetime
> exclusively.
>
> I think it would be worth changing the current implementation to use local C
> variables for local Cython variables in the generator, and to copy the
> values into the closure before each yield and back out of it on resume.
> I'd even let local Python references start off as NULL when the generator
> is created, given that Vitek's branch can eliminate None initialisations now.
>
> I started looking into this a bit, but it's not a quick change. The main
> problem I see is the current code duplication between the generator body
> node and the DefNode. It would be better if both could share more code.
> Basically, if local variables become truly local, the only differences will
> be that they come from call arguments in one case and from the closure in
> the other, and that generators have the additional jump-to-yield entry step.
> Everything else could hopefully be identical.
>
> Once again, this goes hand in hand with the still pending DefNode
> refactoring...
>
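
To make the proposal above concrete, here is a minimal sketch in plain Python
(not Cython's generated C). The Closure class and the resume() function are
hypothetical stand-ins for the closure struct and the generator body; the
point is only that the body works on true locals and touches the closure
just on entry and right before each yield.

```python
class Closure:
    """Hypothetical stand-in for the generator's closure struct."""
    def __init__(self, n):
        self.n = n
        self.i = 0
        self.total = 0


def resume(closure):
    """One resumption of: total = 0; for i in range(n): total += i; yield total"""
    # Restore: copy the closure fields into real local variables once.
    n, i, total = closure.n, closure.i, closure.total
    if i < n:
        # All work between yields touches only locals.
        total += i
        i += 1
        # Save: write the locals back just before "yielding".
        closure.i, closure.total = i, total
        return total
    raise StopIteration


def run(n):
    """Drive the hand-written state machine until it is exhausted."""
    closure = Closure(n)
    values = []
    while True:
        try:
            values.append(resume(closure))
        except StopIteration:
            return values
```

Calling run(4) collects the same values as the equivalent generator would
yield, while the loop body itself never reads a closure field.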

With live variable analysis it should be easy to save/restore only the
variables that are live at the yield point.
Btw, so far only reaching definitions analysis is implemented. I'm going
to optimize it by replacing sets with bitsets, and then try to implement
live variable analysis.
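
A rough sketch of the idea, assuming a straight-line body without branches
(the real analysis has to work on the control flow graph). Stmt and
live_at_yields are made-up names for illustration, not anything in the
Cython sources. Sets are represented as int bitsets, one bit per variable.

```python
from collections import namedtuple

# Hypothetical statement record: variables defined, variables used,
# and whether the statement is a yield point.
Stmt = namedtuple("Stmt", "text defs uses is_yield")


def live_at_yields(stmts, names):
    """Backward liveness pass; returns {stmt index: variables to save}."""
    bit = {name: 1 << i for i, name in enumerate(names)}

    def mask(variables):
        m = 0
        for v in variables:
            m |= bit[v]
        return m

    live = 0  # nothing is live after the last statement
    to_save = {}
    for index in range(len(stmts) - 1, -1, -1):
        stmt = stmts[index]
        if stmt.is_yield:
            # Only variables still needed after resuming must be saved.
            to_save[index] = [n for n in names if live & bit[n]]
        # Transfer function: live_in = (live_out - defs) | uses
        live = (live & ~mask(stmt.defs)) | mask(stmt.uses)
    return to_save


body = [
    Stmt("a = 1", ["a"], [], False),
    Stmt("b = a + 1", ["b"], ["a"], False),
    Stmt("yield b", [], ["b"], True),
    Stmt("c = a * 2", ["c"], ["a"], False),
    Stmt("yield c", [], ["c"], True),
]
saved = live_at_yields(body, ["a", "b", "c"])
```

Here only a has to be saved at the first yield, and nothing at the second.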

I'm going to delete variable references using the live variable info, but
that could introduce a small incompatibility with CPython:
a = X
print a # <- a will be decrefed here
print 'the end'
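
The difference becomes observable when the object has a finalizer. A small
sketch, relying on CPython's reference counting; the class X and the events
list are just for illustration:

```python
events = []


class X:
    def __del__(self):
        # Runs when the last reference to the instance goes away.
        events.append("X finalized")


def demo():
    a = X()
    events.append("use %s" % type(a).__name__)  # last use of a
    events.append("the end")
    # CPython keeps a alive until the function returns.


demo()
# On CPython, events is now ['use X', 'the end', 'X finalized'].
```

If a were decrefed right after its last use, "X finalized" would be recorded
before "the end" instead.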

-- 
vitja.

