New scope for exception handlers

I would like to propose a change to exception handlers to make it harder to accidently leak names defined only in the exception handler blocks. This change follows from the decision to delete the name of an exception at the end of a handler. The goal of this change is to prevent people from relying on names that are defined only in a handler. As an example, let's looks at a function with a try except: def f(): try: ... except: a = 1 return a This function will only work if the body raises some exception, otherwise we will get an UnBoundLocalError. I propose that we should make `a` fall out of scope when we leave the except handler regardless to prevent people from depending on this behavior. This will make it easier to catch bugs like this in testing. There is one case where I think the name should not fall out of scope, and that is when the name is already defined outside of the handler. For example: def g(): a = 1 try: ... except: a = 2 return a I think this code is well behaved and should continue to work as it already does. There are a couple of ways to implment this new behavior but I think the simplest way to do this would be to treat the handler as a closure where all the free variables defined as nonlocal. This would need to be a small change to the compiler but may have some performance implications for code that does not hit the except handler. If the handler is longer than the bytecode needed to create the inner closure then it may be faster to run the function when the except handler is not hit. This changes our definition of f from: 2 0 SETUP_EXCEPT 8 (to 11) 3 3 LOAD_CONST 1 (Ellipsis) 6 POP_TOP 7 POP_BLOCK 8 JUMP_FORWARD 14 (to 25) 4 >> 11 POP_TOP 12 POP_TOP 13 POP_TOP 5 14 LOAD_CONST 2 (1) 17 STORE_FAST 0 (a) 20 POP_EXCEPT 21 JUMP_FORWARD 1 (to 25) 24 END_FINALLY 6 >> 25 LOAD_FAST 0 (a) 28 RETURN_VALUE to something more like: f - 3 0 SETUP_EXCEPT 8 (to 11) 4 3 LOAD_CONST 0 (Ellipsis) 6 POP_TOP 7 POP_BLOCK 8 JUMP_FORWARD 20 (to 31) 5 >> 11 POP_TOP 12 POP_TOP 13 POP_TOP 14 LOAD_CONST 1 (<code object <excepthandler> at 0x7febcd6e2300, file "<code>", line 1>) 17 LOAD_CONST 2 ('<excepthandler>') 20 MAKE_FUNCTION 0 23 CALL_FUNCTION 0 (0 positional, 0 keyword pair) 26 POP_EXCEPT 27 JUMP_FORWARD 1 (to 31) 30 END_FINALLY 7 >> 31 LOAD_FAST 0 (a) 34 RETURN_VALUE f.<excepthandler> ----------------- 1 0 LOAD_CONST 0 (1) 3 STORE_FAST 0 (a) 6 LOAD_CONST 1 (None) 9 RETURN_VALUE This new code properly raises the unbound locals exception when executed. For g we could use a MAKE_CLOSURE instead of MAKE_FUNCTION.

On Fri, 8 Apr 2016 at 14:03 Joseph Jevnik <joejev@gmail.com> wrote:
So that change was to prevent memory leaks for the caught exception due to tracebacks being attached to exceptions, not with anything to do with scoping.
The goal of this change is to prevent people from relying on names that are defined only in a handler.
So that will break code and would make an `except` clause even more special in terms of how it works compared to other blocks. Right now the Python scoping rules are pretty straight-forward (local, non-local/free, global, built-in). Adding this would shove in a "unless you're in an `except` clause" rule that makes things more complicated for little benefit in the face of backwards-compatibility. -Brett

On Sat, Apr 9, 2016 at 7:03 AM, Joseph Jevnik <joejev@gmail.com> wrote:
I'm not sure what the point there is; when do you need this kind of thing, and why only in the 'except' clause? Also, what if the name is assigned to in the 'try' and the 'except' but nowhere else? If you're curious, I actually put together a "just for the fun of it" patch that adds a form of sub-function scoping to Python. It'd be a great way to figure out just how far this flies in the face of Python's design. ChrisA

On Fri, Apr 08, 2016 at 05:03:05PM -0400, Joseph Jevnik wrote:
An interesting proposal, but you're missing one critical point: why is it harmful to create names inside an except block? There is a concrete reason why Python 3, and not Python 2, deletes the "except Exception as err" name when the except block leaves: because exceptions now hold on to a lot more call info, which can prevent objects from being garbage-collected. But the same doesn't apply to arbitrary names. At the moment, only a few block statements create a new scope: def and class mostly. In particular, no flow control statement does: if, elif, else, for, while, try, except all use the existing scope. This is a nice clean design, and in my opinion must better than the rule that any indented block is a new scope. I would certainly object to making "except" the only exception (pun intended) and I would object even more to making *all* the block statements create a new scope. Here is an example of how your proposal would bite people. Nearly all by code is hybrid 2+3 code, so I often have a construct like this at the start of modules: try: import builtins # Python 3.x except ImportError: # Python 2.x import __builtin__ as builtins Nice and clean. But what if try and except introduced a new scope? I would have to write: builtins = None try: global builtins import builtins except ImportError: global builtins import __builtin__ as builtins assert builtins is not None Since try and except are different scopes, I need a separate global declaration in each. If you think this second version is an improvement over the first, then our ideas of what makes good looking code are so far apart that I don't think its worth discussing this further :-) If only except is a different scope, then I have this shorter version: try: # global scope import builtins except ImportError: # local scope global builtins import __builtin__ as builtins
Not necessary. It depends on what is hidden by the ... dots. For example: def f(): try: a = sequence.pop() except AttributeError: a = -1 return a It might not be the most Pythonic code around, but it works, and your proposal will break it. Bottom line is, there's nothing fundamentally wrong with except blocks *not* starting a new scope. I'm not sure if there's any real benefit to the proposal, but even if there is, I doubt it's worth the cost of breaking existing working code. So if you still want to champion your proposal, it's not enough to demonstrate that it could be done. You're going to have to demonstrate not only a benefit from the change, but that the benefit is worth breaking other people's code. -- Steve

Thank you for the responses. I did not realize that the delete fast was added because the traceback is on the exception, that makes a lot of sense. Regarding the case of: try: import a except ImportError: import b as a my proposal would still have this work as intended because the first import would appear as an assignment to the name `a` outside the scope of the handler which would case the inner scope to emit a store deref after the import instead of a store fast. The only thing this change would block is introducing a new variable which is only defined inside the except handler.
To be honest, this was mainly proposed because I thought about _how_ to implement it and less about _should_ we implement this. I implemented a small version of this to generate the dis outputs that I put in the first email. After reading the responses I agree that this should probably not be added; I do not think that we need to discuss this further unless someone else has strong feelings about this. On Sat, Apr 9, 2016 at 3:44 AM, Steven D'Aprano <steve@pearwood.info> wrote:

On 9 April 2016 at 17:44, Steven D'Aprano <steve@pearwood.info> wrote:
Not just any code, but "try it and see if it works" name binding idioms recommended in the reference documentation: https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead... It's also worth noting that when it comes to detecting this kind of structural error, tools like pylint already do a good job of tracing possible control flow problems: $ cat > conditional_name_binding.py try: pass except: a = 1 print(a) $ pylint -E --enable=invalid-name conditional_name_binding.py No config file found, using default configuration ************* Module conditional_name_binding C: 4, 4: Invalid constant name "a" (invalid-name) More easily finding this kind of problem is one of the major advantages of using static analysis tools in addition to dynamic testing. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, 8 Apr 2016 at 14:03 Joseph Jevnik <joejev@gmail.com> wrote:
So that change was to prevent memory leaks for the caught exception due to tracebacks being attached to exceptions, not with anything to do with scoping.
The goal of this change is to prevent people from relying on names that are defined only in a handler.
So that will break code and would make an `except` clause even more special in terms of how it works compared to other blocks. Right now the Python scoping rules are pretty straight-forward (local, non-local/free, global, built-in). Adding this would shove in a "unless you're in an `except` clause" rule that makes things more complicated for little benefit in the face of backwards-compatibility. -Brett

On Sat, Apr 9, 2016 at 7:03 AM, Joseph Jevnik <joejev@gmail.com> wrote:
I'm not sure what the point there is; when do you need this kind of thing, and why only in the 'except' clause? Also, what if the name is assigned to in the 'try' and the 'except' but nowhere else? If you're curious, I actually put together a "just for the fun of it" patch that adds a form of sub-function scoping to Python. It'd be a great way to figure out just how far this flies in the face of Python's design. ChrisA

On Fri, Apr 08, 2016 at 05:03:05PM -0400, Joseph Jevnik wrote:
An interesting proposal, but you're missing one critical point: why is it harmful to create names inside an except block? There is a concrete reason why Python 3, and not Python 2, deletes the "except Exception as err" name when the except block leaves: because exceptions now hold on to a lot more call info, which can prevent objects from being garbage-collected. But the same doesn't apply to arbitrary names. At the moment, only a few block statements create a new scope: def and class mostly. In particular, no flow control statement does: if, elif, else, for, while, try, except all use the existing scope. This is a nice clean design, and in my opinion must better than the rule that any indented block is a new scope. I would certainly object to making "except" the only exception (pun intended) and I would object even more to making *all* the block statements create a new scope. Here is an example of how your proposal would bite people. Nearly all by code is hybrid 2+3 code, so I often have a construct like this at the start of modules: try: import builtins # Python 3.x except ImportError: # Python 2.x import __builtin__ as builtins Nice and clean. But what if try and except introduced a new scope? I would have to write: builtins = None try: global builtins import builtins except ImportError: global builtins import __builtin__ as builtins assert builtins is not None Since try and except are different scopes, I need a separate global declaration in each. If you think this second version is an improvement over the first, then our ideas of what makes good looking code are so far apart that I don't think its worth discussing this further :-) If only except is a different scope, then I have this shorter version: try: # global scope import builtins except ImportError: # local scope global builtins import __builtin__ as builtins
Not necessary. It depends on what is hidden by the ... dots. For example: def f(): try: a = sequence.pop() except AttributeError: a = -1 return a It might not be the most Pythonic code around, but it works, and your proposal will break it. Bottom line is, there's nothing fundamentally wrong with except blocks *not* starting a new scope. I'm not sure if there's any real benefit to the proposal, but even if there is, I doubt it's worth the cost of breaking existing working code. So if you still want to champion your proposal, it's not enough to demonstrate that it could be done. You're going to have to demonstrate not only a benefit from the change, but that the benefit is worth breaking other people's code. -- Steve

Thank you for the responses. I did not realize that the delete fast was added because the traceback is on the exception, that makes a lot of sense. Regarding the case of: try: import a except ImportError: import b as a my proposal would still have this work as intended because the first import would appear as an assignment to the name `a` outside the scope of the handler which would case the inner scope to emit a store deref after the import instead of a store fast. The only thing this change would block is introducing a new variable which is only defined inside the except handler.
To be honest, this was mainly proposed because I thought about _how_ to implement it and less about _should_ we implement this. I implemented a small version of this to generate the dis outputs that I put in the first email. After reading the responses I agree that this should probably not be added; I do not think that we need to discuss this further unless someone else has strong feelings about this. On Sat, Apr 9, 2016 at 3:44 AM, Steven D'Aprano <steve@pearwood.info> wrote:

On 9 April 2016 at 17:44, Steven D'Aprano <steve@pearwood.info> wrote:
Not just any code, but "try it and see if it works" name binding idioms recommended in the reference documentation: https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead... It's also worth noting that when it comes to detecting this kind of structural error, tools like pylint already do a good job of tracing possible control flow problems: $ cat > conditional_name_binding.py try: pass except: a = 1 print(a) $ pylint -E --enable=invalid-name conditional_name_binding.py No config file found, using default configuration ************* Module conditional_name_binding C: 4, 4: Invalid constant name "a" (invalid-name) More easily finding this kind of problem is one of the major advantages of using static analysis tools in addition to dynamic testing. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (5)
-
Brett Cannon
-
Chris Angelico
-
Joseph Jevnik
-
Nick Coghlan
-
Steven D'Aprano