local variable handling in generators
Hi,

I've been looking at the nqueens benchmark for a while, and I think it's actually not that bad a benchmark for generators.

http://hg.python.org/benchmarks/file/tip/performance/bm_nqueens.py

A better implementation, only for Py2.7/Py3, is here:

https://github.com/cython/cython/blob/master/Demos/benchmarks/nqueens.py

Cython currently runs the first implementation about as fast as Py3.3:

https://sage.math.washington.edu:8091/hudson/job/cython-devel-pybenchmarks-p...

and the second one more than 3x as fast:

https://sage.math.washington.edu:8091/hudson/view/bench/job/cython-devel-cyb...

However, I think there's still some room for improvement, and local variables are part of that. For generator functions that do non-trivial things between yields, local variables will quickly become a bottleneck. Currently, they are always closure fields, so any access to them goes through a pointer indirection into a foreign struct, originally passed in as an argument to the function. Given that generators often do Python object manipulation through C-API calls, any such call will basically require the C compiler to assume that all values in the closure may have changed, thus disabling any optimisations for them. The same applies to many other object-related operations or pointer operations (even DECREF!), as the C compiler cannot know that the generator function owns the closure exclusively during its lifetime.

I think it would be worth changing the current implementation to use local C variables for local Cython variables in the generator, and to copy the values back into/from the closure around yields. I'd even let local Python references start off as NULL when the generator is created, given that Vitek's branch can eliminate None initialisations now.

I started looking into this a bit, but it's not a quick change. The main problem I see is the current code duplication between the generator body node and the DefNode. It would be better if both could share more code.

Basically, if local variables become truly local, the only differences will be that they come from call arguments in one case and from the closure in the other, and that generators have the additional jump-to-yield entry step. Everything else could hopefully be identical.

Once again, this goes hand in hand with the still pending DefNode refactoring...

Stefan
On 05/22/2011 02:33 PM, Stefan Behnel wrote:
Hi,
I've been looking at the nqueens benchmark for a while, and I think it's actually not that bad a benchmark for generators.
http://hg.python.org/benchmarks/file/tip/performance/bm_nqueens.py
A better implementation only for Py2.7/Py3 is here:
https://github.com/cython/cython/blob/master/Demos/benchmarks/nqueens.py
Cython currently runs the first implementation about as fast as Py3.3:
https://sage.math.washington.edu:8091/hudson/job/cython-devel-pybenchmarks-p...
and the second one more than 3x as fast:
https://sage.math.washington.edu:8091/hudson/view/bench/job/cython-devel-cyb...
However, I think there's still some space for improvements, and local variables are part of that. For generator functions that do non-trivial things between yields, I think that local variables will quickly become a bottleneck. Currently, they are always closure fields, so any access to them will use a pointer indirection to a foreign struct, originally passed in as an argument to the function. Given that generators often do Python object manipulation through C-API calls, any such call will basically require the C compiler to assume that all values in the closure may have changed, thus disabling any optimisations for them. The same applies to many other object related operations or pointer operations (even DECREF!), as the C compiler cannot know that the generator function owns the closure during its lifetime exclusively.
I think it would be worth changing the current implementation to use local C variables for local Cython variables in the generator, and to copy the values back into/from the closure around yields. I'd even let local Python references start off as NULL when the generator is created, given that Vitek's branch can eliminate None initialisations now.
Keep in mind that if speed is the objective, another idea is to use real C coroutines. This would likely be faster than anything we can make up ourselves; a single stack jump is bound to be faster than copying things in and out of the stack. Of course, it's probably more work. And then there's portability, as the API is separate for Windows (fibers) and POSIX (makecontext). But I don't think there's a lack of compatibility-layer libraries which unite the platform-specific APIs. Dag Sverre
Dag Sverre Seljebotn, 22.05.2011 21:48:
Keep in mind that if speed is the objective, another idea is to use real C coroutines. This would likely be faster than anything we can make up ourselves; a single stack jump is bound to be faster than copying things in and out of the stack.
Of course, it's probably more work. And then there's portability, as the API is separate for Windows (fibers) and POSIX (makecontext). But I don't think there's a lack of compatibility-layer libraries which unite the platform-specific APIs.
What if we have to call back into CPython? Would that work if the call ends up on a different stack than previous ones? What about CPython thread switches and signal handlers?

Wikipedia has a couple of things to say about coroutines in C:

http://en.wikipedia.org/wiki/Coroutine#Implementations_for_C

including links to some libraries that are supposed to be portable:

http://xmailserver.org/libpcl.html
http://software.schmorp.de/pkg/libcoro.html
http://code.google.com/p/libconcurrency/
http://www.dekorte.com/projects/opensource/libCoroutine/
http://www.goron.de/~froese/coro/

Overall, I think this has some potential but also major uncertainties. And it's a substantially larger change than making local variables C variables. May have made a good GSoC topic...

Stefan
2011/5/22 Stefan Behnel <stefan_ml@behnel.de>:
Hi,
I've been looking at the nqueens benchmark for a while, and I think it's actually not that bad a benchmark for generators.
http://hg.python.org/benchmarks/file/tip/performance/bm_nqueens.py
A better implementation only for Py2.7/Py3 is here:
https://github.com/cython/cython/blob/master/Demos/benchmarks/nqueens.py
Cython currently runs the first implementation about as fast as Py3.3:
https://sage.math.washington.edu:8091/hudson/job/cython-devel-pybenchmarks-p...
and the second one more than 3x as fast:
https://sage.math.washington.edu:8091/hudson/view/bench/job/cython-devel-cyb...
However, I think there's still some space for improvements, and local variables are part of that. For generator functions that do non-trivial things between yields, I think that local variables will quickly become a bottleneck. Currently, they are always closure fields, so any access to them will use a pointer indirection to a foreign struct, originally passed in as an argument to the function. Given that generators often do Python object manipulation through C-API calls, any such call will basically require the C compiler to assume that all values in the closure may have changed, thus disabling any optimisations for them. The same applies to many other object related operations or pointer operations (even DECREF!), as the C compiler cannot know that the generator function owns the closure during its lifetime exclusively.
I think it would be worth changing the current implementation to use local C variables for local Cython variables in the generator, and to copy the values back into/from the closure around yields. I'd even let local Python references start off as NULL when the generator is created, given that Vitek's branch can eliminate None initialisations now.
I started looking into this a bit, but it's not a quick change. The main problem I see is the current code duplication between the generator body node and the DefNode. It would be better if both could share more code. Basically, if local variables become truly local, the only differences will be that they come from call arguments in one case and from the closure in the other, and that generators have the additional jump-to-yield entry step. Everything else could hopefully be identical.
Once again, this goes hand in hand with the still pending DefNode refactoring...
With live variable analysis it should be easy to save/restore only the active variables at the yield point. Btw, now only reaching definitions analysis is implemented. I'm going to optimize it by replacing sets with bitsets, and then try to implement live variables.

I'm going to delete variable references using active-variable info, but that could introduce a small incompatibility with CPython:

a = X
print a  # <- a will be decrefed here
print 'the end'

-- vitja.
Vitja Makarov, 23.05.2011 10:13:
With live variable analysis that should be easy to save/restore only active variables at the yield point.
"Active" in the sense of "modified", I suppose? That's what I was expecting.
Btw now only reaching definitions analysis is implemented. I'm going to optimize by replacing sets with bitsets. And then try to implement live variables.
I'm going to delete variable references using active-variable info, but that could introduce a small incompatibility with CPython:

a = X
print a  # <- a will be decrefed here
print 'the end'
That incompatibility is not small at all. It breaks this code:

x = b'abc'
cdef char* c = x

Even if 'x' is no longer used after this point, it *must not* get freed before 'c' goes away as well. That's basically impossible to decide, as users may pass 'c' into a function that stores it away for later use.

I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replacing the None assignment with a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.

Stefan
2011/5/23 Stefan Behnel <stefan_ml@behnel.de>:
Vitja Makarov, 23.05.2011 10:13:
With live variable analysis that should be easy to save/restore only active variables at the yield point.
"Active" in the sense of "modified", I suppose? That's what I was expecting.
Active means that the variable's value will still be used. In my example, after 'print a', 'a' isn't used anymore.
Btw now only reaching definitions analysis is implemented. I'm going to optimize by replacing sets with bitsets. And then try to implement live variables.
I'm going to delete variable references using active-variable info, but that could introduce a small incompatibility with CPython:

a = X
print a  # <- a will be decrefed here
print 'the end'
That incompatibility is not small at all. It breaks this code:
x = b'abc'
cdef char* c = x
Even if 'x' is no longer used after this point, it *must not* get freed before 'c' goes away as well. That's basically impossible to decide, as users may pass 'c' into a function that stores it away for later use.
Yeah. That's hard to detect. But x could be marked as "don't decref when not-active"
I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replace the None assignment by a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.
Hmm. Why should that be NULL if the user sets it to None? For instance:

for i in args:
    print i

this code will be translated into:

PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    Pyx_XDECREF(i);
    i = tmp;
    print(i);
}

using active-variable information this could be translated into:

PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    i = tmp;
    print(i);
    Pyx_DECREF(i);
}

-- vitja.
On 05/23/2011 10:50 AM, Vitja Makarov wrote:
2011/5/23 Stefan Behnel<stefan_ml@behnel.de>:
Vitja Makarov, 23.05.2011 10:13:
With live variable analysis that should be easy to save/restore only active variables at the yield point.
"Active" in the sense of "modified", I suppose? That's what I was expecting.
Active means that the variable's value will still be used. In my example, after 'print a', 'a' isn't used anymore.
Btw now only reaching definitions analysis is implemented. I'm going to optimize by replacing sets with bitsets. And then try to implement live variables.
I'm going to delete variable references using active-variable info, but that could introduce a small incompatibility with CPython:

a = X
print a  # <- a will be decrefed here
print 'the end'
That incompatibility is not small at all. It breaks this code:
x = b'abc'
cdef char* c = x
Even if 'x' is no longer used after this point, it *must not* get freed before 'c' goes away as well. That's basically impossible to decide, as users may pass 'c' into a function that stores it away for later use.
Yeah. That's hard to detect. But x could be marked as "don't decref when not-active"
def f(object o):
    cdef char* buf
    buf = get_buffer_of_obj(o)
    call_c_func(buf)

So there are a lot of variables that would have to be marked this way (but not all, I can see that).

Dag Sverre
Vitja Makarov, 23.05.2011 10:50:
2011/5/23 Stefan Behnel:
Vitja Makarov, 23.05.2011 10:13:
With live variable analysis that should be easy to save/restore only active variables at the yield point.
"Active" in the sense of "modified", I suppose? That's what I was expecting.
Active means that the variable's value will still be used. In my example, after 'print a', 'a' isn't used anymore.
That's not correct then. In a generator, a modified value must be kept alive over a yield, even if it is no longer used afterwards. We can safely reduce the write-back code to modified values, but we cannot reduce it to values that will be used later on.
Btw now only reaching definitions analysis is implemented. I'm going to optimize by replacing sets with bitsets. And then try to implement live variables.
I'm going to delete variable references using active-variable info, but that could introduce a small incompatibility with CPython:

a = X
print a  # <- a will be decrefed here
print 'the end'
That incompatibility is not small at all. It breaks this code:
x = b'abc'
cdef char* c = x
Even if 'x' is no longer used after this point, it *must not* get freed before 'c' goes away as well. That's basically impossible to decide, as users may pass 'c' into a function that stores it away for later use.
Yeah. That's hard to detect. But x could be marked as "don't decref when not-active"
How would you know that it needs to be marked like that? You won't necessarily see it in the code that a pointer was taken from the value, that might have happened within a called function.
I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replace the None assignment by a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.
Hmm. Why should that be NULL if the user sets it to None?
Because there is no user visible difference. None will always be available, even if the Cython code no longer holds a reference to it. So changing "x = None" into "Py_DECREF(x); x=NULL" is just fine, as long as we can make sure 'x' is never accessed after this point.
For instance:
for i in args:
    print i
this code will be translated into:
PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    Pyx_XDECREF(i);
    i = tmp;
    print(i);
}
using active variables information this could be translated into:
PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    i = tmp;
    print(i);
    Pyx_DECREF(i);
}
That's not correct, though. Python semantics dictate that 'i' must keep its value until the end of the function or until it's being reassigned to, whatever comes first. Remember that objects can have deallocators in Python. That must not be called at an undefined point. The only thing that can safely be special cased is None. It's common in Python code to set a variable to None when the value is worth being deallocated (e.g. a large data structure). Cython can optimise this as I indicated above. Stefan
2011/5/23 Stefan Behnel <stefan_ml@behnel.de>:
Vitja Makarov, 23.05.2011 10:50:
2011/5/23 Stefan Behnel:
Vitja Makarov, 23.05.2011 10:13:
With live variable analysis that should be easy to save/restore only active variables at the yield point.
"Active" in the sense of "modified", I suppose? That's what I was expecting.
Active means that the variable's value will still be used. In my example, after 'print a', 'a' isn't used anymore.
That's not correct then. In a generator, a modified value must be kept alive over a yield, even if it is no longer used afterwards.
We can safely reduce the write-back code to modified values, but we cannot reduce it to values that will be used later on.
I'm not sure how to get the list of modified variables at a yield point. Now I only know which assignments reach the yield point.
Btw now only reaching definitions analysis is implemented. I'm going to optimize by replacing sets with bitsets. And then try to implement live variables.
I'm going to delete variable references using active-variable info, but that could introduce a small incompatibility with CPython:

a = X
print a  # <- a will be decrefed here
print 'the end'
That incompatibility is not small at all. It breaks this code:
x = b'abc'
cdef char* c = x
Even if 'x' is no longer used after this point, it *must not* get freed before 'c' goes away as well. That's basically impossible to decide, as users may pass 'c' into a function that stores it away for later use.
Yeah. That's hard to detect. But x could be marked as "don't decref when not-active"
How would you know that it needs to be marked like that? You won't necessarily see it in the code that a pointer was taken from the value, that might have happened within a called function.
I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replace the None assignment by a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.
Hmm. Why should that be NULL if the user sets it to None?
Because there is no user visible difference. None will always be available, even if the Cython code no longer holds a reference to it. So changing "x = None" into "Py_DECREF(x); x=NULL" is just fine, as long as we can make sure 'x' is never accessed after this point.
For instance:
for i in args:
    print i
this code will be translated into:
PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    Pyx_XDECREF(i);
    i = tmp;
    print(i);
}
using active variables information this could be translated into:
PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    i = tmp;
    print(i);
    Pyx_DECREF(i);
}
That's not correct, though. Python semantics dictate that 'i' must keep its value until the end of the function or until it's being reassigned to, whatever comes first. Remember that objects can have deallocators in Python. That must not be called at an undefined point.
The only thing that can safely be special cased is None. It's common in Python code to set a variable to None when the value is worth being deallocated (e.g. a large data structure). Cython can optimise this as I indicated above.
Ohh, I see that variable references can't simply be removed.

Unused result reference removal seems safe to me, though:

a = foo()  # a will be assigned to NULL here
print

-- vitja.
Vitja Makarov, 23.05.2011 11:24:
2011/5/23 Stefan Behnel:
Vitja Makarov, 23.05.2011 10:50:
2011/5/23 Stefan Behnel:
Vitja Makarov, 23.05.2011 10:13:
With live variable analysis that should be easy to save/restore only active variables at the yield point.
"Active" in the sense of "modified", I suppose? That's what I was expecting.
Active means that the variable's value will still be used. In my example, after 'print a', 'a' isn't used anymore.
That's not correct then. In a generator, a modified value must be kept alive over a yield, even if it is no longer used afterwards.
We can safely reduce the write-back code to modified values, but we cannot reduce it to values that will be used later on.
I'm not sure how to get the list of modified variables at a yield point. Now I only know which assignments reach the yield point.
But then you already know which variables were assigned to, right? That's all you need.
For instance:
for i in args:
    print i
this code will be translated into:
PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    Pyx_XDECREF(i);
    i = tmp;
    print(i);
}
using active variables information this could be translated into:
PyObject *i = NULL;
for (;;) {
    tmp = next();
    if (!tmp) break;
    i = tmp;
    print(i);
    Pyx_DECREF(i);
}
That's not correct, though. Python semantics dictate that 'i' must keep its value until the end of the function or until it's being reassigned to, whatever comes first. Remember that objects can have deallocators in Python. That must not be called at an undefined point.
The only thing that can safely be special cased is None. It's common in Python code to set a variable to None when the value is worth being deallocated (e.g. a large data structure). Cython can optimise this as I indicated above.
Ohh, I see that variable references can't simply be removed.
Unused result reference removal seems safe to me:
a = foo()  # a will be assigned to NULL here
print
No, same thing: the reference that 'a' holds here may be important, so we cannot just drop it at an arbitrary point. There's (likely) a reason the user chose to use a variable, instead of just letting the return value fall off the shelf silently. Stefan
Stefan Behnel, 23.05.2011 11:15:
Vitja Makarov, 23.05.2011 10:50:
2011/5/23 Stefan Behnel:
I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replace the None assignment by a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.
Hmm. Why should that be NULL if the user sets it to None?
Because there is no user visible difference. None will always be available, even if the Cython code no longer holds a reference to it. So changing "x = None" into "Py_DECREF(x); x=NULL" is just fine, as long as we can make sure 'x' is never accessed after this point.
The difference is clearer when I spell out the code for the first example, too:

# x = None
Py_INCREF(None)
Py_DECREF(x)
x = None

would be optimised into

# x = None
Py_DECREF(x)
x = NULL

That's likely not a big difference, assuming that setting 'x' to None was worth it for the user, i.e. it will clean up the referenced object at that point. It may still be worth it inside of generators and on function return, which may now have one variable less to clean up or store away. A None value would still have to be DECREF-ed at least.

Stefan
Stefan Behnel, 23.05.2011 11:29:
Stefan Behnel, 23.05.2011 11:15:
Vitja Makarov, 23.05.2011 10:50:
2011/5/23 Stefan Behnel:
I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replace the None assignment by a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.
Hmm. Why should that be NULL if the user sets it to None?
Because there is no user visible difference. None will always be available, even if the Cython code no longer holds a reference to it. So changing "x = None" into "Py_DECREF(x); x=NULL" is just fine, as long as we can make sure 'x' is never accessed after this point.
The difference is clearer when I spell out the code for the first example, too:
# x = None
Py_INCREF(None)
Py_DECREF(x)
x = None
would be optimised into
# x = None
Py_DECREF(x)
x = NULL
That's likely not a big difference, assuming that setting 'x' to None was worth it for the user, i.e. it will clean up the referenced object at that point. It may still be worth it inside of generators and on function return, which may now have one variable less to clean up or store away. A None value would still have to be DECREF-ed at least.
Then again, "del x" would be a more obvious way to spell this ... Stefan
On Mon, May 23, 2011 at 2:57 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Stefan Behnel, 23.05.2011 11:29:
Stefan Behnel, 23.05.2011 11:15:
Vitja Makarov, 23.05.2011 10:50:
2011/5/23 Stefan Behnel:
I'm fine with deallocating variables that are no longer used after the user explicitly assigned None to them (i.e. replace the None assignment by a simple "DECREF + set to NULL" in that case). I don't think we should be doing more than that.
Hmm. Why should that be NULL if the user sets it to None?
Because there is no user visible difference. None will always be available, even if the Cython code no longer holds a reference to it. So changing "x = None" into "Py_DECREF(x); x=NULL" is just fine, as long as we can make sure 'x' is never accessed after this point.
The difference is clearer when I spell out the code for the first example, too:
# x = None
Py_INCREF(None)
Py_DECREF(x)
x = None
would be optimised into
# x = None
Py_DECREF(x)
x = NULL
That's likely not a big difference, assuming that setting 'x' to None was worth it for the user, i.e. it will clean up the referenced object at that point. It may still be worth it inside of generators and on function return, which may now have one variable less to clean up or store away. A None value would still have to be DECREF-ed at least.
Then again, "del x" would be a more obvious way to spell this ...
And then setting it to NULL would actually be correct. In any case, I'm -1 on deleting references once they're no longer used; we must wait till the end of the function, and I'm not sure the savings would be that great either. In terms of packing/unpacking the variables onto the local C stack, until we have live variable analysis (and even then) it may still be more expensive in some cases than just leaving them there, right? - Robert
Robert Bradshaw, 23.05.2011 20:09:
In any case, I'm -1 on deleting references once they're no longer used; we must wait till the end of the function, and I'm not sure the savings would be that great either.
Agreed.
In terms of packing/unpacking the variables onto the local C stack, until we have live variable analysis (and even then) it may still be more expensive in some cases than just leaving it there, right?
I agree that it can be substantially more expensive to write back *all* variables on each yield, although it doesn't have to be. It largely depends on the generator code. However, once we really know which values change between yield calls, i.e. which ones need to be stored away, it will actually be less expensive in most cases. We currently pay the indirection penalty for each access, even read access, whereas the C compiler can then keep important variables in registers and only write them back once per yield. Stefan
2011/5/23 Stefan Behnel <stefan_ml@behnel.de>:
However, once we really know which values change between yield calls, i.e. which ones need to be stored away, it will actually be less expensive in most cases. We currently pay the indirection penalty for each access, even read access, whereas the C compiler can then keep important variables in registers and only write them back once per yield.
I think that all non-NULL variables should be saved/restored at a yield. I cannot really track changes, only assignments.

for i in a:
    yield i  # j is NULL here
for j in b:
    yield j  # a, b, i, j should be saved/restored

-- vitja.
Vitja Makarov, 23.05.2011 21:33:
2011/5/23 Stefan Behnel:
However, once we really know which values change between yield calls, i.e. which ones need to be stored away, it will actually be less expensive in most cases. We currently pay the indirection penalty for each access, even read access, whereas the C compiler can then keep important variables in registers and only write them back once per yield.
I think that all non-NULL variables should be saved/restored at a yield. I cannot really track changes, only assignments.
I mean "assignments" when I write "changes". When a variable is being assigned to, we should assume that its value changed. Users can be expected to be smart enough to avoid unnecessary assignments in most cases, there's no need to put work into optimising them away automatically.
for i in a:
    yield i  # j is NULL here
for j in b:
    yield j  # a, b, i, j should be saved/restored
Right. However, in general, we can expect that most variables will have been initialised on a yield. We can still avoid storing away C typed variables that are not being used later on, because they are not reference counted. And if the user sets 'a' and 'i' to None between the loops, they won't need to be saved on the second yield either. But that's an optimisation, not a requirement. Stefan
2011/5/24 Stefan Behnel <stefan_ml@behnel.de>:
Vitja Makarov, 23.05.2011 21:33:
2011/5/23 Stefan Behnel:
However, once we really know which values change between yield calls, i.e. which ones need to be stored away, it will actually be less expensive in most cases. We currently pay the indirection penalty for each access, even read access, whereas the C compiler can then keep important variables in registers and only write them back once per yield.
I think that all non-NULL variables should be saved/restored at a yield. I cannot really track changes, only assignments.
I mean "assignments" when I write "changes". When a variable is being assigned to, we should assume that its value changed. Users can be expected to be smart enough to avoid unnecessary assignments in most cases, there's no need to put work into optimising them away automatically.
for i in a:
    yield i  # j is NULL here
for j in b:
    yield j  # a, b, i, j should be saved/restored
Right. However, in general, we can expect that most variables will have been initialised on a yield.
We can still avoid storing away C typed variables that are not being used later on, because they are not reference counted.
And if the user sets 'a' and 'i' to None between the loops, they won't need to be saved on the second yield either. But that's an optimisation, not a requirement.
What about copying C++ structs and objects with constructors (and/or an overloaded '=' operator)? That wouldn't work well. -- vitja.
On 05/31/2011 01:07 PM, Vitja Makarov wrote:
2011/5/24 Stefan Behnel<stefan_ml@behnel.de>:
Vitja Makarov, 23.05.2011 21:33:
2011/5/23 Stefan Behnel:
However, once we really know which values change between yield calls, i.e. which ones need to be stored away, it will actually be less expensive in most cases. We currently pay the indirection penalty for each access, even read access, whereas the C compiler can then keep important variables in registers and only write them back once per yield.
I think that all non-NULL variables should be saved/restored at a yield. I cannot really track changes, only assignments.
I mean "assignments" when I write "changes". When a variable is being assigned to, we should assume that its value changed. Users can be expected to be smart enough to avoid unnecessary assignments in most cases, there's no need to put work into optimising them away automatically.
for i in a:
    yield i  # j is NULL here
for j in b:
    yield j  # a, b, i, j should be saved/restored
Right. However, in general, we can expect that most variables will have been initialised on a yield.
We can still avoid storing away C typed variables that are not being used later on, because they are not reference counted.
And if the user sets 'a' and 'i' to None between the loops, they won't need to be saved on the second yield either. But that's an optimisation, not a requirement.
What about copying C++ structs and objects with constructors (and/or an overloaded '=' operator)? That wouldn't work well.
Currently I believe C++ objects are only supported through pointers, which of course are copyable. If or when we support C++ objects on the stack, then we should definitely avoid copying them. If you declare a C++ class as a struct Cython-side then I think we should assume that we can freely invoke the copy constructor and assignment operator in this context. Dag Sverre
2011/5/31 Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no>:
If you declare a C++ class as a struct Cython-side then I think we should assume that we can freely invoke the copy constructor and assignment operator in this context.
Maybe it's better to declare a struct with all the locals and then copy it with memcpy()? Btw, C++ still wouldn't love it :( So it's probably better to look into a stack switching library instead... -- vitja.
Vitja Makarov, 31.05.2011 13:07:
What about copying C++ structs and objects with constructor (and/or overloaded = method)? That wouldn't work well.
I'd just make that an "unsupported feature" error for now. Remember, we are talking about C++ types used inside of generators here, which can only appear in newly written code. It's best to prevent users from doing that, at least until someone comes up with an actual requirement.

Stefan
2011/5/23 Robert Bradshaw <robertwb@math.washington.edu>:
Then again, "del x" would be a more obvious way to spell this ...
And then setting it to NULL would actually be correct.
That is already done ;) But usually you wouldn't use the del statement in your code.
In any case, I'm -1 on deleting references once they're no longer used; we must wait till the end of the function. And I'm not sure the savings would be that great either.
Ok. The example with char * makes me think that this is not a good idea.
In terms of packing/unpacking the variables onto the local C stack, until we have live variable analysis (and even then) it may still be more expensive in some cases than just leaving it there, right?
We can try; that isn't hard to implement and compare. -- vitja.
participants (4):
- Dag Sverre Seljebotn
- Robert Bradshaw
- Stefan Behnel
- Vitja Makarov