Optimize out setting unused underscored local variables

It is common to use _ as a placeholder for a variable whose value is not used. For example:

    for _ in range(n)
    head, _, tail = name.partition(':')
    first, *_, last = items

I thought about optimizing out the unnecessary assignments. I actually wrote a patch half a year ago and tested it. It did not improve performance much and did not reduce the bytecode, so I cooled on the idea and put it off. Later PEP 622 defined a use for _ in pattern matching, so I waited to see what would come of it. And now PEP 640 has been created to solve the same problem.

My patch was too conservative. It was limited to local variables and underscored names. For global variables we cannot determine whether the variable is unused. And eliminating unused non-underscored variables breaks too many tests (mainly for the debugger, tracing, etc). I was not sure whether it should be limited to underscored names or just '_' (I have also seen '__' used as a throwaway variable in the wild).

Maybe we can extend this to global '_'. Global '_' is used as a holder for the last result in the REPL and as an alias for gettext (this is the reason for PEP 640), but neither of them is actually set by an assignment statement. You can use `globals()['_'] = ...` or `globals().update({'_': ...})` or `sys.modules[__name__]._ = ...` to set global '_'.

I do not want to create tens of alternate PEPs with minor variations, and this issue is not worth a PEP. What is your opinion about this? Is it worth including such an optimization? For what kinds of variables should it be applied? Should it include global '_'? Should it be merely an optimization (maybe controlled by the -O option) or a change in the language?
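To make the cost of such an assignment concrete, here is a minimal sketch, assuming current CPython behaviour without any such optimization. The function name `split_name` is made up for illustration and does not come from the patch.

```python
def split_name(name):
    # '_' receives the separator, which is never read afterwards.
    head, _, tail = name.partition(':')
    return head, tail


# Without the proposed optimization, '_' is an ordinary local variable:
# it gets its own slot in the code object and is stored on every call.
print(split_name.__code__.co_varnames)
print(split_name('key:value'))
```

The proposed optimization would amount to dropping that store (and possibly the slot) when nothing in the function reads or deletes the name.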

I think it's a micro-optimization that's probably not worth it for most code and more likely to occasionally disappoint people who are using the debugger. Once we have a JIT mode or other super-optimization mode it can be done.

On Tue, Oct 20, 2020 at 7:32 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Tue, Oct 20, 2020 at 05:30:54PM +0300, Serhiy Storchaka wrote:
Assuming that this actually has a benefit, in either speed or memory, how about this?

1. Only eliminate local variables, never globals.

2. By default, only variables with a leading underscore are eliminated. This will(?) avoid breaking tests for the debugger etc. that you mentioned.

3. Under -O, allow more aggressive optimization that eliminates non-underscore variables too.

4. This is an implementation feature, not a language promise. Other interpreters do not have to follow.

5. To be clear, variables are only eliminated if they are assigned to, but never read:

    def func():
        _a = 1  # not eliminated
        _b = 2  # eliminated
        return _a

--
Steve

So this would break many uses of locals(). I'm not sure I like that.

On Tue, Oct 20, 2020 at 10:16 AM Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)

On Tue, Oct 20, 2020 at 04:26:16PM -0700, Guido van Rossum wrote:
So this would break many uses of locals(). I'm not sure I like that.
That's a good point I didn't think of.

We could rescue the concept by saying that any reference to locals() inside the function disables the elimination, but of course locals can be shadowed or aliased, and we can't expect the interpreter to do a full analysis of the entire program.

    # Module A
    wibble = locals

    # Module B
    from A import wibble  # this is actually locals

    def func():
        _a = 1
        return wibble()

I think that if there was a big performance gain from elimination, we could chalk this up to consenting adults and just say "Don't do anything weird!" but if the performance gain is marginal, it probably isn't worth the hassle.

On the third hand, and far more speculative, if we had a way of giving compiler directives, we could make these sorts of optimizations opt-in or opt-out on a fine-grained basis.

--
Steve
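The aliasing problem can be reproduced in a single file. This is a minimal sketch, assuming current CPython, that collapses the two modules of the example into one namespace; `wibble` and `func` are the names from the example.

```python
# An alias for the builtin: nothing inside func() mentions the name "locals".
wibble = locals


def func():
    _a = 1
    # Called through the alias, locals() still reports the caller's frame,
    # so '_a' is observable even though func() never reads it directly.
    return wibble()


print(func())
```

A compiler that only looks for the literal name `locals` inside the function body would miss this call, and eliminating `_a` would change what `func()` returns.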

Steven D'Aprano wrote:
We could rescue the concept by saying that any reference to locals() inside the function disables the elimination, but of course locals can be shadowed or aliased, and we can't expect the interpreter to do a full analysis of the entire program.
We already do something similar for zero-argument super():
Besides, locals() is already a bit weird at best:
    >>> [locals() for _ in [None]]
    [{'.0': <tuple_iterator object at 0x7ffa6c609280>, '_': None}]
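The zero-argument super() precedent can be illustrated as follows. This is a sketch assuming CPython, where the compiler adds the implicit `__class__` cell only when the method body refers to the name `super` (or `__class__`); the class names and `alias` are made up for the example.

```python
class Base:
    def greet(self):
        return "base"


class Direct(Base):
    def greet(self):
        # The compiler sees the name "super" here and adds a __class__ cell.
        return super().greet()


alias = super


class Aliased(Base):
    def greet(self):
        # No literal "super" in the body, so no __class__ cell is created.
        return alias().greet()


print(Direct.greet.__code__.co_freevars)
print(Aliased.greet.__code__.co_freevars)

try:
    Aliased().greet()
except RuntimeError as exc:
    print("RuntimeError:", exc)
```

So the language already accepts a name-based check that aliasing defeats; the same trade-off would apply to detecting locals() by name.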

Of course there is also the problem of detecting any functions assigned to _ or _T that may be called - otherwise there is a risk of breaking the typical gettext usage in I18n!

Steve Barnes

On Wed, Oct 21, 2020 at 3:02 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
In a program I work on, I found that this strategy did not work: I (apparently) do need to assign _ locally within each function where needed. This program includes, as an option, a custom Python REPL where _ has its usual meaning, while having the possibility to change language during a session. Attempting to use the standard global gettext _ breaks that.

Furthermore, I want the user to be able to specify either a two-letter code (say 'fr') or a four-letter code (say 'fr_CA') such that if the latter were not found, it would look for the corresponding generic two-letter code ('fr') before falling back to the default ('en') as a last resort. As far as I know, a straight gettext.install does not allow that.

I admit that my usage might not be typical.

André Roberge
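A per-function _ with that fallback order might look like the sketch below. It is not André's code: the helper names, the `myapp_demo` domain and the absence of any installed catalog are assumptions for illustration. Only `gettext.translation` with its `languages` and `fallback` parameters is the real standard-library API.

```python
import gettext


def language_candidates(code, default="en"):
    """Return the lookup order: full code, generic code, then the default."""
    candidates = [code]
    if "_" in code:
        candidates.append(code.split("_")[0])
    if default not in candidates:
        candidates.append(default)
    return candidates


def make_translator(code, domain="myapp_demo", localedir=None):
    # fallback=True returns a NullTranslations object (messages are passed
    # through unchanged) when no catalog is found for any candidate.
    translation = gettext.translation(
        domain,
        localedir=localedir,
        languages=language_candidates(code),
        fallback=True,
    )
    return translation.gettext


def greet(code):
    # '_' is a local here, and it is read (called) below, so an optimization
    # that only drops never-read locals would have to leave it alone.
    _ = make_translator(code)
    return _("Hello")


print(language_candidates("fr_CA"))
print(greet("fr_CA"))
```

Because _ is called after being assigned, this usage would only be at risk if the analysis failed to count the call as a read.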

20.10.20 20:13, Steven D'Aprano wrote:
And this is a large IF.
It is actually what my patch does.
3. Under -O, allow more aggressive optimization that eliminates non-underscore variables too.
It is easy to add, but as Guido noted it will break code that uses locals(). I thought about eliminating underscore variables only with -O. It would make the feature safer, but even less useful.
4. This is an implementation feature, not a language promise. Other interpreters do not have to follow.
It is a question about which I am not sure.
5. To be clear, variables are only eliminated if they are assigned to, but never read:
Yes. And to simplify the implementation, "del" counts as a use of the variable. So _ is not eliminated in the following code:

    a, *_ = foo()
    del _
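The rules agreed on above (locals only, underscored names only, assigned but never read, with del counting as a use) can be approximated at the source level. This is a rough sketch using the ast module, not the actual patch: the function name `unused_underscore_locals` is hypothetical, and the sketch ignores nested scopes, global/nonlocal declarations and locals().

```python
import ast


def unused_underscore_locals(source):
    """Map each function name to the underscored names it stores but never uses."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        stored, used = set(), set()
        for sub in ast.walk(node):
            if isinstance(sub, ast.Name):
                if isinstance(sub.ctx, ast.Store):
                    stored.add(sub.id)
                else:
                    # Load and Del contexts both count as a use.
                    used.add(sub.id)
        result[node.name] = {
            name for name in stored - used if name.startswith("_")
        }
    return result


example = """
def func():
    _a = 1
    _b = 2
    return _a

def with_del():
    a, *_ = foo()
    del _
"""

print(unused_underscore_locals(example))
```

Only `_b` in `func` is reported as a candidate for elimination; `_` in `with_del` is kept because of the del statement.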

participants (6)
- André Roberge
- Brandt Bucher
- Guido van Rossum
- Serhiy Storchaka
- Steve Barnes
- Steven D'Aprano