[Python-ideas] PEP 572: Statement-Local Name Bindings

Chris Angelico rosuav at gmail.com
Wed Feb 28 14:41:10 EST 2018


On Thu, Mar 1, 2018 at 3:30 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 28 February 2018 at 13:45, Chris Angelico <rosuav at gmail.com> wrote:
>> On Wed, Feb 28, 2018 at 10:49 PM, Paul Moore <p.f.moore at gmail.com> wrote:
>
>>> While there's basically no justification for doing so, it should be
>>> noted that under this proposal, ((((((((1 as x) as y) as z) as w) as
>>> v) as u) as t) as s) is valid. Of course, "you can write confusing
>>> code using this" isn't an argument against a useful enhancement, but
>>> potential for abuse is something to be aware of. There's also
>>> (slightly more realistically) something like [(sqrt((b*b as bsq) +
>>> (4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I
>>> can see someone thinking is a good idea!
>>
>> Sure! Though I'm not sure what you're representing there; it looks
>> almost, but not quite, like the quadratic formula. If that was the
>> intention, I'd be curious to see the discriminant broken out, with
>> some kind of trap for the case where it's negative.
>
> lol, it was meant to be the quadratic roots. If I got it wrong, that
> probably says something about how hard it is to maintain or write
> code that over-uses the proposed feature ;-) If I didn't get it wrong,
> that still makes the same point, I guess!

Or, more likely, it says something about what happens when a
programmer bashes out some code to try to represent a famous formula,
but doesn't actually debug it. As is often said, code that isn't
tested is buggy :)

Here's another equally untested piece of code:

[(-b + (sqrt(b*b - 4*a*c) as disc)) / (2*a), (-b - disc) / (2*a)]
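
For comparison, here's a rough sketch of the same calculation in plain
Python, without the proposed syntax, with the discriminant broken out
and a trap for the negative case. The function name quadratic_roots and
the choice of ValueError are assumptions made purely for illustration.

```python
from math import sqrt

def quadratic_roots(a, b, c):
    """Return the two real roots of a*x**2 + b*x + c (assumes a != 0)."""
    disc = b*b - 4*a*c  # the discriminant, as an ordinary local name
    if disc < 0:
        # Trap the case where there are no real roots.
        raise ValueError("negative discriminant: no real roots")
    root = sqrt(disc)
    return [(-b + root) / (2*a), (-b - root) / (2*a)]
```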

>>>> Open questions
>>>> ==============
>>>>
>>>> 1. What happens if the name has already been used? `(x, (1 as x), x)`
>>>>    Currently, prior usage functions as if the named expression did not
>>>>    exist (following the usual lookup rules); the new name binding will
>>>>    shadow the other name from the point where it is evaluated until the
>>>>    end of the statement.  Is this acceptable?  Should it raise a syntax
>>>>    error or warning?
>>>
>>> IMO, relying on evaluation order is the only viable option, but it's
>>> confusing. I would immediately reject something like `(x, (1 as x),
>>> x)` as bad style, simply because the meaning of x at the two places it
>>> is used is non-obvious.
>>>
>>> I'm -1 on a warning. I'd prefer an error, but I can't see how you'd
>>> implement (or document) it.
>>
>> Sure. For now, I'm just going to leave it as a perfectly acceptable
>> use of the feature; it can be rejected as poor style, but permitted by
>> the language.
>
> "Perfectly acceptable" I disagree with. "Unacceptable but impossible
> to catch in the compiler" is closer to my view.

Sorry, what I meant by "acceptable" was "legal". The compiler accepts
it, and the bytecode executes without complaint, but a human may very
well decide that it's unacceptable.

>>>> 2. The current implementation [1] implements statement-local names using
>>>>    a special (and mostly-invisible) name mangling.  This works perfectly
>>>>    inside functions (including list comprehensions), but not at top
>>>>    level.  Is this a serious limitation?  Is it confusing?
>>>
>>> I'm strongly -1 on "works like the current implementation" as a
>>> definition of the behaviour. While having a proof of concept to
>>> clarify behaviour is great, my first question would be "how is the
>>> behaviour documented to work?" So what is the PEP proposing would
>>> happen if I did
>>>
>>> if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6':
>>>     # Python 3.6 specific code here
>>> elif sys.version_info[0] < 3:
>>>     print(f"Version {ver} is not supported")
>>>
>>> at the top level of a Python file? To me, that's a perfectly
>>> reasonable way of using the new feature to avoid exposing a binding
>>> for "ver" in my module...
>>
>> I agree, sounds good. I'll reword this to be a limitation of implementation.
>
> To put it another way, "Intended to work, but we haven't determined
> how to implement it yet"? Fair enough, although it needs to be
> possible to implement it. These names are a weird not-quite-scope
> construct, and interactions with real scopes are going to be tricky to
> get right (not just implement, but define).

Yeah. My ideal is something like this:

* The subscope names are actual real variables, with unspellable names.
* These name bindings get created and destroyed just like other names
do, with the exception that they are automatically destroyed when they
"fall out of scope".
* While a subscope name is visible, locals() will use that value for
that name (shadowing any other).
* Once that name is removed, locals() will return to the normal form
of the name.

And yes, ideally this will still work when locals() is globals().
There are a couple of issues, but that's my planned design.
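
As a toy model of that shadowing behaviour (not the implementation,
just the intended lookup result), a ChainMap with a temporary layer in
front of the normal namespace behaves the same way. The names
module_names, subscope and visible are made up for this sketch.

```python
from collections import ChainMap

module_names = {"x": 12}   # the ordinary namespace
subscope = {}              # hypothetical layer for statement-local names
visible = ChainMap(subscope, module_names)

before = visible["x"]      # nothing bound yet, so the normal x is seen

subscope["x"] = 1          # models the point where (1 as x) is evaluated
during = visible["x"]      # the subscope name shadows the normal one

del subscope["x"]          # models the end of the statement
after = visible["x"]       # lookup returns to the normal form of the name
```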

An alternative design that is also viable: These subscope names
(easily detected internally) are simply hidden from locals() and
globals().

I haven't dug into the implementation consequences of any of this at
global scope. I know what parts of the code I need to look at, but my
days have this annoying habit of having only 24 hours in them. Anyone
got a bugfix for that? :|

> Consider
>
> x = 12
> if (1 as x) == 1:
>     def foo():
>         print(x)
>         # Actually, you'd probably get a "Name used before definition" error here.
>         # Would "global x" refer to x=12 or to the statement-local x (1)?
>         # Would "nonlocal x" refer to the statement-local x?
>         x = 13
>         return x
>     print(foo())

Yeah, that's going to give UnboundLocalError, because inside foo(), x
has been flagged as local. That's independent of the global scope
changes.
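
That part can be checked in current Python with no new syntax at all.
A minimal sketch, where the try/except is only there so the example
runs to completion:

```python
x = 12

def foo():
    try:
        seen = x  # x is local to foo() because of the assignment below
    except UnboundLocalError:
        seen = "unbound"
    x = 13
    return seen, x

result = foo()
```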

I'd like to say that "global x" would catch the 12, but until I
actually get around to implementing it, I'm not sure.

> print(x)
> print(foo())
> print(x)

Anything that executes after the 'if' exits should see x as 12. The
temporary variable is completely gone at that point.

> What should that return? Not "what does the current implementation
> return", but what is the intended result according to the proposal,
> and how would you explain that result in the docs?
>
> I think I'd expect
>
> 1
> 13 # But see note about global/nonlocal
> 12
> 1 xxxxxxx Not sure? Maybe 1? Can you create a closure over a
> statement-local variable?
> 13 # But see note about global/nonlocal
> 12
>
> The most charitable thing I can say here is that the semantics are
> currently under-specified in the PEP :-)

Hah. This is why I started out by saying that this ONLY applies inside
a function. Extending this to global scope (and class scope; my guess
is that it'll behave the same as global) is something that I'm only
just now looking into.

>>>> 4. Syntactic confusion in `except` statements.  While technically
>>>>    unambiguous, it is potentially confusing to humans.  In Python 3.7,
>>>>    parenthesizing `except (Exception as e):` is illegal, and there is no
>>>>    reason to capture the exception type (as opposed to the exception
>>>>    instance, as is done by the regular syntax).  Should this be made
>>>>    outright illegal, to prevent confusion?  Can it be left to linters?
>>>
>>> Wait - except (Exception as e): would set e to the type Exception, and
>>> not capture the actual exception object?
>>
>> Correct. The expression "Exception" evaluates to the type Exception,
>> and you can capture that. It's a WutFace moment but it's a logical
>> consequence of the nature of Python.
>
> "Logical consequence of the rules" isn't enough for a Python language
> feature, IMO. Intuitive and easy to infer are key. Even if this is a
> corner case, it counts as a mildly ugly wart to me.

Does it need to be special-cased as an error? I do *not* want to
special-case it to capture the exception instance, as that would
almost certainly misbehave in more complicated scenarios.
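
To make the difference concrete in current Python: binding the value
of the expression "Exception" gives you the type, while the regular
except syntax gives you the instance. The variable names here are
invented for the sketch, and the plain assignment only stands in for
what the parenthesized form would bind.

```python
# Stand-in for what (Exception as e) would bind: the value of the expression.
bound_by_expression = Exception

try:
    raise ValueError("example")
except Exception as e:
    # The regular syntax binds the exception instance. Save it under
    # another name, since e is unbound when the except clause ends.
    bound_by_except = e
```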

> I value Python for making it easy to write correct code, not easy to
> spot your errors. Too many things like this would start me thinking I
> should ban statement-local names from codebases I maintain, which is
> not a good start for a feature...

Banning them from 'except' clauses isn't a bad thing, though. There's
nothing that you need to capture; you're normally going to use a
static lookup of a simple name (at best, a qualified name).

>>> This seems to imply that the name in (expr as name) when used as a top
>>> level expression will persist after the closing parenthesis. Is that
>>> intended? It's not mentioned anywhere in the PEP (that I could see).
>>> On re-reading, I see that you say "for the remainder of the current
>>> *statement*" and not (as I had misread it) the remainder of the
>>> current *expression*.
>>
>> Yep. If you have an expression on its own, it's an "expression
>> statement", and the subscope will end at the newline (or the
>> semicolon, if you have one). Inside something larger, it'll persist.
>
> Technically you can have more than one expression in a statement.
> Consider (from the grammar):
>
>     for_stmt ::=  "for" target_list "in" expression_list ":" suite
>                   ["else" ":" suite]
>
>     expression_list    ::=  expression ( "," expression )* [","]
>
> Would a name binding in the first expression in an expression_list be
> visible in the second expression? Should it be? It will be, because
> it's visible to the end of the statement, not to the end of the
> expression, but there might be subtle technical implications here (I
> haven't checked the grammar to see what other statements allow
> multiple expressions - that's your job ;-)) To clarify this sort of
> question, you should probably document in the PEP precisely how the
> grammar will be modified.

Yes, it will. It's exactly the same in any form of statement: the name
binding begins to exist at the point where it's evaluated, and it
ceases to exist once that statement finishes executing. If it's an
expression statement (by which I specifically mean the syntactic
construct of putting a bare expression on a line on its own, called
"expr_stmt" in the grammar), that point happens to coincide with the
end of the expression, but that's a coincidence.

So if, in the "in" expression list, you capture something, that thing
will be visible all through the suite. Here's an example:

for item in (get_items() as items):
    print(item)
    print(items)
print(items)

What actually happens is kinda this:

for item in (get_items() as items_0x142857):
    print(item)
    print(items_0x142857)
del items_0x142857
print(items)

except that, internally, the name "items_0x142857" actually has a dot
in it, making it impossible to reference using regular syntax. Once
the 'for' loop is completely finished, the unbinding is compiled in,
and then the name mangling ceases to happen. So it doesn't actually
matter how many expressions are in a statement; it's just "this
statement".
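
A runnable approximation of that expansion in current Python, inside a
function, might look like the following. The get_items() body, the
demo() wrapper and the pre-existing "items" value are assumptions for
the sake of the example, and the spellable name items_0x142857 stands
in for the real (unspellable) mangled name.

```python
def get_items():
    return ["a", "b"]  # hypothetical data source

def demo():
    items = "outer value"  # an ordinary name that the subscope would shadow
    seen = []
    items_0x142857 = get_items()  # stand-in for the mangled subscope name
    for item in items_0x142857:
        seen.append((item, items_0x142857))
    del items_0x142857  # the unbinding compiled in after the loop
    return seen, items  # after the loop, "items" is the ordinary name again
```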

> lol, see? Closures rear their ugly heads, as I mentioned above.
>
> "What the current proof of concept implementation does" isn't useful
> anyway, but even ignoring that I'd prefer to see what it *does* rather
> than what it *compiles to*. But what needs to be documented is what
> the PEP *proposes* it does.

The current implementation matches my proposed semantics, as long as
the code in question is all inside a function.

>> I'll add some more examples. I think the if/while usage is potentially of value.
>
> I think it's an unexpected consequence of an overly-broad solution to
> the original problem, that accidentally solves another long-running
> debate. But it means you've opened a much bigger can of worms than it
> originally appeared, and I'm not sure you don't risk losing the
> simplicity that *expression* local names might have had.
>
> But I can even break expression local names:
>
>     x = ((lambda: boom()) as boom)
>     x()
>
> It's possible that the "boom" is just my head exploding, not the
> interpreter. But I think I just demonstrated a closure over an
> expression-local name. For added fun, replace "x" with "boom"...

And this is why I am not trying for expression-local names. If someone
wants to run with a competing proposal for list-comprehension-local
names, sure, but I'm not in favour of that either. Expression-local is
too narrow to be useful AND it still has the problems that
statement-local has.

>> Thanks for the feedback! Keep it coming! :)
>
> Ask and you shall receive :-)

If I take a razor and cut myself with it, it's called "self-harm" and
can get me dinged for a psych evaluation. If I take a mailing list and
induce it to send me hundreds of emails and force myself to read them
all... there's probably a padded cell with my name on it somewhere.

You know, I'd be okay with that actually. Just as long as the cell has wifi.

ChrisA
