Optimize out setting unused underscored local variables
It is common to use _ as a placeholder for a variable whose value is not used. For example:

    for _ in range(n)
    head, _, tail = name.partition(':')
    first, *_, last = items

I thought about optimizing out the unnecessary assignments. Actually, I wrote a patch half a year ago and tested it. It did not add much to performance and did not reduce the bytecode, so I cooled down on it and put it off. Later, PEP 622 defined a use of _ in pattern matching, so I waited to see what would come of it. And now PEP 640 has been created to solve the same problem.

My patch was too conservative. It was limited to local variables and underscored names. For global variables we can't determine whether the variable is unused. And eliminating unused non-underscored variables breaks too many tests (mainly for the debugger, tracing, etc).

I was not sure whether it should be limited to underscored names or just '_' (I have also seen '__' used as a throwaway variable in the wild).

Maybe we can extend this to global '_'. Global '_' is used as a holder for the last result in the REPL and as an alias to gettext (this is the reason for PEP 640), but neither of them is actually set by an assignment statement. You can use `globals()['_'] = ...` or `globals().update({'_': ...})` or `sys.modules[__name__]._ = ...` to set global '_'.

I do not want to create tens of alternate PEPs with minor variations, and this issue is not worth a PEP. What is your opinion about this? Is it worth including such an optimization? For what kinds of variables should it be applied? Should it include global '_'? Should it be merely an optimization (maybe controlled by the -O option) or a change in the language?
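As a rough illustration of what such an optimization would target, the sketch below shows that `_` is an ordinary local variable in CPython today: it gets its own slot and its own store instruction. The function name `split_name` is made up for this example, and the exact opcode names depend on the CPython version.

```python
import dis


def split_name(name):
    # "_" receives the separator, which is never read afterwards.
    head, _, tail = name.partition(':')
    return head, tail


# Local variable slots of the function; "_" is one of them.
print(split_name.__code__.co_varnames)

# Store instructions emitted for the locals (names vary by version).
stores = [
    ins.opname
    for ins in dis.get_instructions(split_name)
    if ins.opname.startswith("STORE_FAST")
]
print(stores)
```

The store into `_` is the work the proposed optimization would skip.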
I think it's a micro-optimization that's probably not worth it for most code and more likely to occasionally disappoint people who are using the debugger. Once we have a JIT mode or other super-optimization mode it can be done.

--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/UF22D3...
Code of Conduct: http://python.org/psf/codeofconduct/
On Tue, Oct 20, 2020 at 05:30:54PM +0300, Serhiy Storchaka wrote:
I do not want to create tens of alternate PEPs with minor variations, and this issue is not worth a PEP. What is your opinion about this? Is it worth including such an optimization? For what kinds of variables should it be applied? Should it include global '_'? Should it be merely an optimization (maybe controlled by the -O option) or a change in the language?
Assuming that this actually has a benefit, in either speed or memory, how about this?

1. Only eliminate local variables, never globals.
2. By default, only variables with a leading underscore are eliminated. This will(?) avoid breaking tests for the debugger etc. that you mentioned.
3. Under -O, allow more aggressive optimization that eliminates non-underscore variables too.
4. This is an implementation feature, not a language promise. Other interpreters do not have to follow.
5. To be clear, variables are only eliminated if they are assigned to, but never read from:

    def func():
        _a = 1  # not eliminated
        _b = 2  # eliminated
        return _a

-- Steve
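The rule in point 5 can be sketched at the source level with the `ast` module. This is only an illustration of the rule, not how the patch discussed in this thread works: the helper name `unused_underscored_locals` is hypothetical, it inspects only the first function in the source, and it ignores nested scopes and `global`/`nonlocal` declarations.

```python
import ast


def unused_underscored_locals(source):
    """Return underscored names that are assigned but never read.

    Simplified: looks at the first function only and treats every
    non-store context (load or del) as a use of the name.
    """
    tree = ast.parse(source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    stored, used = set(), set()
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            else:
                used.add(node.id)
    return sorted(name for name in stored - used if name.startswith("_"))


SOURCE = '''
def func():
    _a = 1
    _b = 2
    return _a
'''

print(unused_underscored_locals(SOURCE))  # ['_b']
```

Only `_b` qualifies, because `_a` is read by the `return` statement.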
So this would break many uses of locals(). I'm not sure I like that.

--Guido van Rossum (python.org/~guido)
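A small example of the locals() concern: today, a variable that is assigned but never read is still visible through locals(), so eliminating the assignment would change what this function returns. The function name `func` simply mirrors the example used earlier in the thread.

```python
def func():
    _a = 1
    _b = 2  # never read again, but still a real local today
    return locals()


print(func())  # {'_a': 1, '_b': 2}
```

With `_b` optimized out, the returned mapping would presumably contain only `_a`.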
On Tue, Oct 20, 2020 at 04:26:16PM -0700, Guido van Rossum wrote:
So this would break many uses of locals(). I'm not sure I like that.
That's a good point I didn't think of.

We could rescue the concept by saying that any reference to locals() inside the function disables the elimination, but of course locals can be shadowed or aliased, and we can't expect the interpreter to do a full analysis of the entire program.

    # Module A
    wibble = locals

    # Module B
    from A import wibble  # this is actually locals

    def func():
        _a = 1
        return wibble()

I think that if there was a big performance gain from elimination, we could chalk this up to consenting adults and just say "Don't do anything weird!" but if the performance gain is marginal, it probably isn't worth the hassle.

On the third hand, and far more speculative, if we had a way of giving compiler directives, we could make these sorts of optimizations opt-in or opt-out on a fine-grained basis.

-- Steve
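The aliasing problem can be reproduced in a single file. In this sketch the alias `wibble` is defined in the same module rather than imported from a separate one, which is a simplification of the two-module scenario; the effect on `func` is the same.

```python
# Stand-in for the alias that would live in another module.
wibble = locals


def func():
    _a = 1
    # The name "locals" never appears in this function, yet the call
    # still returns this function's local variables.
    return wibble()


print(func())  # {'_a': 1}
```

A compiler that looked only for the name `locals` inside `func` would not notice that `_a` is observable here.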
Steven D'Aprano wrote:
We could rescue the concept by saying that any reference to locals() inside the function disables the elimination, but of course locals can be shadowed or aliased, and we can't expect the interpreter to do a full analysis of the entire program.
We already do something similar for zero-argument super():
    >>> duper = super
    >>> class C:
    ...     def __init__(self):
    ...         duper()
    ...
    >>> C()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in __init__
    RuntimeError: super(): __class__ cell not found
Besides, locals() is already a bit weird at best:
    >>> [locals() for _ in [None]]
    [{'.0': <tuple_iterator object at 0x7ffa6c609280>, '_': None}]
Of course there is also the problem of detecting any functions assigned to _ or _T that may be called - otherwise there is a risk of breaking the typical gettext usage in I18n!

Steve Barnes
21.10.20 05:23, Steve Barnes wrote:
Of course there is also the problem of detecting any functions assigned to _ or _T that may be called - otherwise there is a risk of breaking the typical gettext usage in I18n!
gettext is usually mapped to global _, so it is not affected.

Also, even if we eliminated globals, gettext is mapped using gettext.install(), which does not use assignment to _. It does:

    import builtins
    builtins.__dict__['_'] = self.gettext
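A short sketch of that mechanism, using the standard library's do-nothing catalog `gettext.NullTranslations` so that no message files are needed. Its `install()` method binds `_` in the builtins namespace, so no assignment statement targeting `_` appears anywhere in the module. The function name `greet` is made up for the example.

```python
import builtins
import gettext

# install() stores the catalog's gettext method under the name "_"
# in the builtins namespace, which affects the whole process.
gettext.NullTranslations().install()


def greet():
    # "_" is neither a local nor a module global here; the lookup
    # falls through to builtins.
    return _("hello")


print(greet())
```

Because the name is never the target of an assignment statement, an optimization that only removes assignments to `_` would not touch this usage.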
On Wed, Oct 21, 2020 at 3:02 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
gettext is usually mapped to global _, so it is not affected.
Also, even if we eliminated globals, gettext is mapped using gettext.install(), which does not use assignment to _. It does:

    import builtins
    builtins.__dict__['_'] = self.gettext
In a program I work on, I found that this strategy did not work: I (apparently) do need to assign _ locally within each function where needed. This program includes, as an option, a custom Python REPL where _ has its usual meaning, while having the possibility to change language during a session. Attempting to use the standard global gettext _ breaks that.

Furthermore, I want the user to be able to specify either a two-letter code (say 'fr') or a four-letter code (say 'fr_CA') such that if the latter were not found, it would look for the corresponding generic two-letter code ('fr') before falling back to the default ('en') as a last resort. As far as I know, a straight gettext.install does not allow that.

I admit that my usage might not be typical.

André Roberge
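A minimal sketch of the kind of per-function binding described above. This is not the program's actual code: the helper names `candidate_languages` and `make_translator`, the domain `myapp` and the catalog directory are all hypothetical. It relies only on `gettext.translation()`, which accepts a `languages` list and, with `fallback=True`, returns a `NullTranslations` object when no catalog is found.

```python
import gettext


def candidate_languages(code, default="en"):
    """Return language codes to try, most specific first."""
    candidates = [code]
    if "_" in code:
        # 'fr_CA' falls back to the generic 'fr'.
        candidates.append(code.split("_", 1)[0])
    if default not in candidates:
        candidates.append(default)
    return candidates


def make_translator(code, domain="myapp", localedir="no-such-locale-dir"):
    # Hypothetical domain and directory; with fallback=True a missing
    # catalog yields NullTranslations instead of raising an error.
    translation = gettext.translation(
        domain,
        localedir=localedir,
        languages=candidate_languages(code),
        fallback=True,
    )
    return translation.gettext


def greet(lang):
    _ = make_translator(lang)  # local assignment to _
    return _("Hello")          # and _ is read (called) afterwards


print(candidate_languages("fr_CA"))
print(greet("fr_CA"))
```

Here the local `_` is read after being assigned, so under the "assigned but never read" rule it would not be a candidate for elimination.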
20.10.20 20:13, Steven D'Aprano wrote:
Assuming that this actually has a benefit, in either speed or memory, how about this?
And this is a large IF.
1. Only eliminate local variables, never globals.
2. By default, only variables with a leading underscore are eliminated. This will(?) avoid breaking tests for the debugger etc. that you mentioned.
It is actually what my patch does.
3. Under -O, allow more aggressive optimization that eliminates non-underscore variables too.
It is easy to add, but as Guido noted it will break code that uses locals(). I thought about eliminating underscore variables only with -O. It would make the feature safer, but even less useful.
4. This is an implementation feature, not a language promise. Other interpreters do not have to follow.
That is a question I am not sure about.
5. To be clear, variables are only eliminated if they are assigned to, but never read from:
Yes. And to simplify the implementation, "del" counts as a use of the variable. So _ is not eliminated in the following code:

    a, *_ = foo()
    del _
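One way to see why treating "del" as a use is the safe choice: deleting a local that was never bound raises UnboundLocalError, so dropping the store while keeping the `del` would turn working code into an error. The sketch below simulates that with a function that has a `del` but no assignment. The names `foo`, `with_store` and `without_store` are made up for the example.

```python
def foo():
    return [1, 2, 3]


def with_store():
    a, *_ = foo()
    del _  # fine: _ was bound on the previous line
    return a


def without_store():
    # Simulates the result of removing the assignment to _ while
    # leaving the del statement in place.
    del _


print(with_store())

try:
    without_store()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```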
participants (6)
- André Roberge
- Brandt Bucher
- Guido van Rossum
- Serhiy Storchaka
- Steve Barnes
- Steven D'Aprano