elimination of scope bleeding of iteration variables

apologies if this has been brought up on python-dev already.

a suggestion i have, perhaps for python 3.0 since it may break some code (but imo it could go into 2.6 or 2.7 because the likely breakage would be very small, see below), is the elimination of the misfeature whereby the iteration variable used in for-loops, list comprehensions, etc. bleeds out into the surrounding scope.

[i'm aware that there is a similar proposal for python 3.0 for list comprehensions specifically, but that's not enough.]

i've been bitten by this problem more than once. most recently, i had a simple function like this:

    # Replace property named PROP with NEW in PROPLIST, a list of tuples.
    def property_name_replace(prop, new, proplist):
        for i in xrange(len(proplist)):
            if x[i][0] == prop:
                x[i] = (new, x[i][1])

the use of `x' in here is an error, as it should be `proplist'; i previously had a loop `for x in proplist', but ran into the problem that tuples can't be modified, and lists can't be modified except by index.

despite this obviously bad code, however, it ran without any complaint from python -- which amazed me, when i finally tracked this problem down. turns out i had two offenders, both way down in the "main" code at the bottom of the file -- a for-loop with loop variable x, and a list comprehension [x for x in output_file_map] -- and suddenly `x' is a global variable. yuck.

i suggest that the scope of list comprehension iteration variables be only the list comprehension itself (see python 3000), and the scope of for-loop iteration variables be only the for-loop (including any following `else' statement -- this way you can access it and store it in some other variable if you really want it).

in practice, i seriously doubt this will break much code and probably could be introduced like the previous scope change: using `from __future__' in python 2.5 or 2.6, and by default in the next version. it should be possible, in most circumstances, to issue a warning about code that relies on the old behavior.

ben
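
For illustration, a minimal sketch of the leak described above, written in Python 3 syntax rather than the Python 2 of the thread. The function names are hypothetical and not from the original message. It shows that a for-loop variable remains bound after the loop, while a Python 3 list comprehension keeps its variable to itself (Python 2 list comprehensions leaked theirs, which is the behaviour the message complains about).

```python
def last_loop_value():
    """A for-loop variable stays bound after the loop finishes."""
    for x in ["a", "b", "c"]:
        pass
    return x  # still refers to the last item iterated over


def comprehension_variable_leaks():
    """Report whether a list comprehension's variable is visible afterwards."""
    squares = [y * y for y in range(3)]
    try:
        y  # looked up as a global, since this function never binds y itself
    except NameError:
        return False  # the comprehension kept y to itself
    return True
```

Calling `last_loop_value()` hands back the final item of the list, and under Python 3 `comprehension_variable_leaks()` reports that `y` did not escape. At module level the same for-loop binding lands in the module's globals, which is how a stray `x` can become visible to every function in the file.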

Ben Wing wrote:
List comprehensions will be fixed in Py3k. However, the scoping of for loop variables won't change, as the current behaviour is essential for search loops that use a break statement to terminate the loop when the item is found. Accordingly, there is plenty of code in the wild that *would* break if the for loop variables were constrained to the for loop, even if your own code wouldn't have such a problem.

Outside pure scripts, significant control flow logic (like for loops) should be avoided at module level. You are typically much better off moving the logic inside a _main() function and invoking it at the end of the module. This avoids the 'accidental global' problem for all of the script-only variables, not only the ones that happen to be used as for loop variables.

This wouldn't have helped with your name-change problem, but you've got a lot of unnecessary indexing going on there:

    def property_name_replace(prop, new, proplist):
        for i, (name, value) in enumerate(proplist):
            if name == prop:
                proplist[i] = (new, value)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

Nick Coghlan wrote:
i did in fact end up doing that. however, in the process i ran into another python annoyance i've tripped over repeatedly: you can't assign to a global variable without explicitly declaring it as `global'. instead, you "magically" get a shadowing local variable. this behavior is extremely hostile to newcomers: e.g.

    foo = 1

    def set_foo():
        foo = 2

    set_foo()
    print foo

    --> 1

the worst part is, not a single warning from Python about this. in a large program, such a bug can be very tricky to track down.

now i can see how an argument against changing this behavior might hinge upon global names like `hash' and `list'; you certainly wouldn't want an intended local variable called `hash' or `list' to trounce upon these. but this argument confuses lexical and dynamic scope: global variables declared inside a module are (or can be viewed as) globally lexically scoped in the module, whereas `hash' and `list' are dynamically scoped.

so i'd suggest:

[1] ideally, change this behavior, either for 2.6 or 3.0. maybe have a `local' keyword if you really want a new scope.
[2] until this change, python should always print a warning in this situation.
[3] the current `UnboundLocalError' exception should probably be more helpful, e.g. suggesting that you might need to use a `global foo' declaration.

ben
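
A small sketch of the behaviour under discussion, in Python 3 syntax and meant to be run as a plain script. The function and variable names other than `foo` are hypothetical. It contrasts an assignment without `global` (which binds a new local), an assignment with `global`, and the `UnboundLocalError` raised when a name is read before a later assignment in the same function makes it local.

```python
foo = 1


def set_foo_shadowing():
    foo = 2  # no `global` declaration: this binds a new local name
    return foo


def set_foo_global():
    global foo
    foo = 2  # rebinds the module-level name


def read_before_assign():
    try:
        value = foo  # foo is local here because of the assignment below
        foo = 3
    except UnboundLocalError as exc:
        return type(exc).__name__
    return value


set_foo_shadowing()
after_shadowing = foo  # module-level foo is untouched
set_foo_global()
after_global = foo  # module-level foo has been rebound
error_name = read_before_assign()
```

Whether a name is local is decided when the function is compiled, from the presence of an assignment anywhere in its body. That is why `read_before_assign()` fails on its first line instead of falling back to the global.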

On Sun, Apr 30, 2006 at 10:47:07PM -0500, Ben Wing wrote:
PyLint gives a warning here "local foo shadows global variable".

Oleg.
--
Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.

On 4/30/06, Ben Wing <ben@666.com> wrote:
You're joking right?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

Nick Coghlan wrote:
It occurs to me that there's a middle ground here: leave the loop variable scope alone, but make it an error to use the same variable in two different loops at the same time. e.g.

    for x in stuff:
        if its_what_were_looking_for(x):
            break
    snarfle(x)
    for x in otherstuff:
        dosomethingelse(x)

would be fine, but

    for x in stuff:
        for x in otherstuff:
            dosomethingelse(x)

would be a SyntaxError because the inner loop is trying to use x while it's still in use by the outer loop.

--
Greg
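
The check proposed here is not something Python performs, but it can be approximated as a lint pass. The following is only a sketch under stated assumptions: `find_reused_loop_targets` is a hypothetical helper built on the standard `ast` module, a loop variable counts as "in use" only inside the loop body (not its `else` clause), function and class bodies start with a clean slate, and comprehensions are ignored.

```python
import ast


def find_reused_loop_targets(source):
    """Return (name, lineno) pairs where an inner for loop rebinds a name
    that an enclosing for loop is still iterating with."""
    hits = []

    def visit(node, active):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.Lambda, ast.ClassDef)):
            active = frozenset()  # assumption: a new scope starts fresh
        if isinstance(node, (ast.For, ast.AsyncFor)):
            # Names bound by the loop target, e.g. x or (i, (name, value)).
            names = {n.id for n in ast.walk(node.target)
                     if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
            for name in sorted(names & active):
                hits.append((name, node.lineno))
            visit(node.iter, active)
            for child in node.body:
                visit(child, active | names)
            for child in node.orelse:
                visit(child, active)  # assumption: loop is over in `else`
            return
        for child in ast.iter_child_nodes(node):
            visit(child, active)

    visit(ast.parse(source), frozenset())
    return hits


SEQUENTIAL = """
for x in stuff:
    if its_what_were_looking_for(x):
        break
snarfle(x)
for x in otherstuff:
    dosomethingelse(x)
"""

NESTED = """
for x in stuff:
    for x in otherstuff:
        dosomethingelse(x)
"""

sequential_hits = find_reused_loop_targets(SEQUENTIAL)
nested_hits = find_reused_loop_targets(NESTED)
```

The two sample strings are the examples from the message. The sequential version produces no reports, while the nested version is flagged at the line of the inner loop. The sources are only parsed, never executed, so the undefined names in them do not matter.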

participants (5)
- Ben Wing
- Greg Ewing
- Guido van Rossum
- Nick Coghlan
- Oleg Broytmann