[Python-ideas] PEP 572 version 2: Statement-Local Name Bindings

David Foster davidfstr at gmail.com
Sat Mar 17 02:49:14 EDT 2018


(1) I am concerned about the proposal's ability to introduce variables with
a new, broader kind of multi-line scope not seen anywhere else in Python.
Such a scope is difficult to reason about, particularly in constructs like
lambdas and inline def functions.

Limiting scope to the very same line is great:

> stuff = [[(f(x) as y), x/y] for x in range(5)]

However, I advocate that in statements that enclose other statements, named
expressions should be usable ONLY in the statement header, not in the
enclosed body. Thus I would expect the following to be errors:

> if (re.match(...) as m):
>     print(m.groups(0))  # NameError: name 'm' is not defined

> while (sock.read() as data):
>     print("Received data:", data)  # NameError: name 'data' is not defined

Using the named expression in other parts of the statement header is still
fine:

> if (re.match(...) as m) is not None and m.groups(1) == 'begin':
>     ...

In summary, -1 from me for introducing a new kind of sublocal scope. +0
from me for keeping the scope limited to the header of the statement.
Predictable conservative semantics.
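
The nearest existing precedent for a name confined to a single expression
is the loop variable of a Python 3 comprehension. A minimal sketch in
current Python follows; the function name and the `_probe` variable are
made up for illustration and are not part of the PEP:

```python
def comprehension_name_leaks():
    # '_probe' is bound only inside the comprehension's own scope.
    squares = [_probe * _probe for _probe in range(3)]
    try:
        _probe  # not defined out here
    except NameError:
        return squares, False
    return squares, True

squares, leaked = comprehension_name_leaks()
```

Here `leaked` comes back False: the name is gone as soon as the expression
that introduced it has finished.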


(2) If there WAS a desire to allow a named expression inside the
substatements of a compound statement, I would advocate that the named
expression just be treated as a regular (usually-)local variable
assignment, with scope that extends to the entire function. This would
imply that the variable would appear in locals().

Under this interpretation, the following would be okay:

> while (sock.read() as data):
>     print("Received data:", data)
> print("Last received data:", data)  # ok; data is a regular local

I would be -0 for using a regular local scope for a named expression.
Predictable semantics but leaks the expression to the wider scope of the
function.
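
For comparison, the loop-and-a-half spelling available today already has
these regular-local semantics. A rough sketch, assuming io.BytesIO as a
stand-in for the socket; the `drain` helper and the chunk size are
illustrative only:

```python
import io

def drain(sock, chunk_size=4):
    """Read until EOF; 'data' is an ordinary function-local name."""
    received = []
    while True:
        data = sock.read(chunk_size)
        if not data:
            break
        received.append(data)
    # 'data' is still bound here (to the final, empty read) and is
    # visible through locals(), unlike a statement-local name.
    return received, data, "data" in locals()

chunks, last, visible = drain(io.BytesIO(b"abcdefgh"))
```

After the loop, `last` holds the final (falsy) read, which is what
"Last received data:" would print under this interpretation.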


(3a) With a header-limited scope (proposal #1 above), I advocate that a
named expression should NOT be able to shadow other variables; attempting
to do so should give a SyntaxError. I can't think of a good reason why such
shadowing should be allowed, and preventing it may eliminate unintentional
errors.

(3b) With a local scope (proposal #2 above), a named expression simply
replaces any preexisting local with the same name. (Or it would replace the
global/nonlocal if a global/nonlocal statement was in use.) There is no
shadowing per se.


Cheers,
--
David Foster | Seattle, WA, USA



On Fri, Mar 2, 2018 at 3:43 AM, Chris Angelico <rosuav at gmail.com> wrote:

> After dozens of posts and a wide variety of useful opinions and
> concerns being raised, here is the newest version of PEP 572 for your
> debating pleasure.
>
> Formatted version:
>
> https://www.python.org/dev/peps/pep-0572/
>
> There are now several more examples, greater clarity in edge cases,
> and improved wording of the actual proposal and specifications. Also,
> the reference implementation has been significantly enhanced, for
> those who wish to try this themselves.
>
> ChrisA
>
> PEP: 572
> Title: Syntax for Statement-Local Name Bindings
> Author: Chris Angelico <rosuav at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 28-Feb-2018
> Python-Version: 3.8
> Post-History: 28-Feb-2018, 02-Mar-2018
>
>
> Abstract
> ========
>
> Programming is all about reusing code rather than duplicating it.  When
> an expression needs to be used twice in quick succession but never again,
> it is convenient to assign it to a temporary name with small scope.
> By permitting name bindings to exist within a single statement only, we
> make this both convenient and safe against name collisions.
>
>
> Rationale
> =========
>
> When a subexpression is used multiple times in a list comprehension, there
> are currently several ways to spell this, none of which is universally
> accepted as ideal. A statement-local name allows any subexpression to be
> temporarily captured and then used multiple times.
>
> Additionally, this syntax can in places be used to remove the need to write
> an infinite loop with a ``break`` in it.  Capturing part of a ``while``
> loop's condition can improve the clarity of the loop header while still
> making the actual value available within the loop body.
>
>
> Syntax and semantics
> ====================
>
> In any context where arbitrary Python expressions can be used, a named
> expression can appear. This must be parenthesized for clarity, and is of
> the form ``(expr as NAME)`` where ``expr`` is any valid Python expression,
> and ``NAME`` is a simple name.
>
> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that NAME is bound to that
> value in all retrievals for the remainder of the current statement.
>
> Just as function-local names shadow global names for the scope of the
> function, statement-local names shadow other names for that statement.
> They can also shadow each other, though actually doing this should be
> strongly discouraged in style guides.
>
> Assignment to statement-local names is ONLY through this syntax. Regular
> assignment to the same name will remove the statement-local name and
> affect the name in the surrounding scope (function, class, or module).
>
> Statement-local names never appear in locals() or globals(), and cannot be
> closed over by nested functions.
>
>
> Execution order and its consequences
> ------------------------------------
>
> Since the statement-local name binding lasts from its point of execution
> to the end of the current statement, this can potentially cause confusion
> when the actual order of execution does not match the programmer's
> expectations. Some examples::
>
>     # A simple statement ends at the newline or semicolon.
>     a = (1 as y)
>     print(y) # NameError
>
>     # The assignment ignores the SLNB - this adds one to 'a'
>     a = (a + 1 as a)
>
>     # Compound statements usually enclose everything...
>     if (re.match(...) as m):
>         print(m.groups(0))
>     print(m) # NameError
>
>     # ... except when function bodies are involved...
>     if (input("> ") as cmd):
>         def run_cmd():
>             print("Running command", cmd) # NameError
>
>     # ... but function *headers* are executed immediately
>     if (input("> ") as cmd):
>         def run_cmd(cmd=cmd): # Capture the value in the default arg
>             print("Running command", cmd) # Works
>
> Some of these examples should be considered *bad code* and rejected by code
> review and/or linters; they are not, however, illegal.
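
The last two examples rely on behaviour Python already has: default values
in a function header are evaluated when the def statement executes, while
names in the body are looked up when the function is called. The sketch
below shows that with ordinary variables in current Python, since the
``(expr as NAME)`` syntax only exists in the reference implementation;
`build_runners` and its inner functions are illustrative names:

```python
def build_runners(commands):
    late, early = [], []
    for cmd in commands:
        def run_late():
            return "Running command " + cmd   # body: looked up at call time
        def run_early(cmd=cmd):
            return "Running command " + cmd   # header: evaluated at def time
        late.append(run_late)
        early.append(run_early)
    return late, early

late, early = build_runners(["start", "stop"])
late_results = [run() for run in late]
early_results = [run() for run in early]
```

Both late runners report "stop", the value `cmd` has once the loop is
finished, whereas each early runner keeps the value it captured in its
header.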
>
>
> Example usage
> =============
>
> These list comprehensions are all approximately equivalent::
>
>     # Calling the function twice
>     stuff = [[f(x), x/f(x)] for x in range(5)]
>
>     # External helper function
>     def pair(x, value): return [value, x/value]
>     stuff = [pair(x, f(x)) for x in range(5)]
>
>     # Inline helper function
>     stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
>
>     # Extra 'for' loop - see also Serhiy's optimization
>     stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
>
>     # Iterating over a genexp
>     stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
>
>     # Expanding the comprehension into a loop
>     stuff = []
>     for x in range(5):
>         y = f(x)
>         stuff.append([y, x/y])
>
>     # Wrapping the loop in a generator function
>     def g():
>         for x in range(5):
>             y = f(x)
>             yield [y, x/y]
>     stuff = list(g())
>
>     # Using a statement-local name
>     stuff = [[(f(x) as y), x/y] for x in range(5)]
>
> If calling ``f(x)`` is expensive or has side effects, the clean operation of
> the list comprehension gets muddled. Using a short-duration name binding
> retains the simplicity; while the extra ``for`` loop does achieve this, it
> does so at the cost of dividing the expression visually, putting the named
> part at the end of the comprehension instead of the beginning.
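
The cost of the double call can be checked in current Python with a
counting stand-in for ``f``. This is only a sketch: the body of `f` and
the `call_count` counter are assumptions made for illustration, and three
of the spellings listed above are compared:

```python
call_count = [0]  # mutable counter so f can record each call

def f(x):
    # Hypothetical stand-in for an expensive function; never returns 0 here.
    call_count[0] += 1
    return x + 1

def g():
    for x in range(5):
        y = f(x)
        yield [y, x / y]

# Calling the function twice per element
twice = [[f(x), x / f(x)] for x in range(5)]
calls_twice = call_count[0]

# Extra 'for' loop binding y once per element
call_count[0] = 0
once = [[y, x / y] for x in range(5) for y in [f(x)]]
calls_once = call_count[0]

# Generator function; g must be called to obtain the generator
call_count[0] = 0
looped = list(g())
calls_looped = call_count[0]
```

All three lists compare equal, but the first spelling makes ten calls to
``f`` where the other two make five.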
>
> Statement-local name bindings can be used in any context, but should be
> avoided where regular assignment can be used, just as ``lambda`` should be
> avoided when ``def`` is an option.  As the name's scope extends to the full
> current statement, even a block statement, this can be used to good effect
> in the header of an ``if`` or ``while`` statement::
>
>     # Current Python, not caring about function return value
>     while input("> ") != "quit":
>         print("You entered a command.")
>
>     # Current Python, capturing return value - four-line loop header
>     while True:
>         command = input("> ")
>         if command == "quit":
>             break
>         print("You entered:", command)
>
>     # Proposed alternative to the above
>     while (input("> ") as command) != "quit":
>         print("You entered:", command)
>
>     # See, for instance, Lib/pydoc.py
>     if (re.search(pat, text) as match):
>         print("Found:", match.group(0))
>
>     while (sock.read() as data):
>         print("Received data:", data)
>
> Particularly with the ``while`` loop, this can remove the need to have an
> infinite loop, an assignment, and a condition. It also creates a smooth
> parallel between a loop which simply uses a function call as its condition,
> and one which uses that as its condition but also uses the actual value.
>
>
> Performance costs
> =================
>
> The cost of SLNBs must be kept to a minimum, particularly when they are not
> used; the normal case MUST NOT be measurably penalized.  SLNBs are expected
> to be uncommon, and using many of them in a single function should
> definitely be discouraged.  Thus the current implementation uses a linked
> list of SLNB cells, with the absence of such a list being the normal case.
> This list is used for code compilation only; once a function's bytecode has
> been baked in, execution of that bytecode has no performance cost compared
> to regular assignment.
>
> Other Python implementations may choose to do things differently, but a zero
> run-time cost is strongly recommended, as is a minimal compile-time cost in
> the case where no SLNBs are used.
>
>
> Open questions
> ==============
>
> 1. What happens if the name has already been used? ``(x, (1 as x), x)``
>    Currently, prior usage functions as if the named expression did not
>    exist (following the usual lookup rules); the new name binding will
>    shadow the other name from the point where it is evaluated until the
>    end of the statement.  Is this acceptable?  Should it raise a syntax
>    error or warning?
>
> 2. Syntactic confusion in ``except`` statements.  While technically
>    unambiguous, it is potentially confusing to humans.  In Python 3.7,
>    parenthesizing ``except (Exception as e):`` is illegal, and there is no
>    reason to capture the exception type (as opposed to the exception
>    instance, as is done by the regular syntax).  Should this be made
>    outright illegal, to prevent confusion?  Can it be left to linters?
>    It may also (and independently) be of value to use a subscope for the
>    normal except clause binding, such that ``except Exception as e:`` will
>    no longer unbind a previous use of the name ``e``.
>
> 3. Similar confusion in ``with`` statements, with the difference that there
>    is good reason to capture the result of an expression, and it is also
>    very common for ``__enter__`` methods to return ``self``.  In many cases,
>    ``with expr as name:`` will do the same thing as
>    ``with (expr as name):``, adding to the confusion.
>
> 4. Should closures be able to refer to statement-local names? Either way,
>    there will be edge cases that make no sense.  Assigning to a name will
>    "push through" the SLNB and bind to the regular name; this means that a
>    statement ``x = x`` will promote the SLNB to full name, and thus has an
>    impact.  Closing over statement-local names, however, introduces scope
>    and lifetime confusions, as it then becomes possible to have two
>    functions in almost the same context, closing over the same name,
>    referring to two different cells.
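
Questions 2 and 3 both turn on what the existing ``as`` clauses bind. A
small sketch in current Python, using a made-up `Resource` context manager
whose ``__enter__`` deliberately does not return ``self``:

```python
class Resource:
    """Hypothetical context manager; __enter__ returns a separate object."""
    def __enter__(self):
        return "handle"
    def __exit__(self, exc_type, exc, tb):
        return False

def except_unbinds():
    e = "previous value"
    try:
        raise ValueError("boom")
    except ValueError as e:
        message = str(e)
    # 'e' was deleted when the except clause finished.
    try:
        e
    except UnboundLocalError:  # a subclass of NameError
        return message, False
    return message, True

def with_binds_enter_result():
    res = Resource()
    with res as name:
        # 'name' is the result of __enter__, not the expression 'res'.
        return name is res, name

message, still_bound = except_unbinds()
same_object, bound_value = with_binds_enter_result()
```

The first function shows the regular except clause unbinding a previous
use of ``e``; the second shows a case where ``with expr as name:`` and a
binding of ``expr`` itself would give different results.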
>
>
> Alternative proposals
> =====================
>
> Proposals of this nature have come up frequently on python-ideas. Below are
> a number of alternative syntaxes, some of them specific to comprehensions,
> which have been rejected in favour of the one given above.
>
> 1. ``where``, ``let``, ``given``::
>
>        stuff = [(y, x/y) where y = f(x) for x in range(5)]
>        stuff = [(y, x/y) let y = f(x) for x in range(5)]
>        stuff = [(y, x/y) given y = f(x) for x in range(5)]
>
>    This brings the subexpression to a location in between the 'for' loop and
>    the expression. It introduces an additional language keyword, which
>    creates conflicts. Of the three, ``where`` reads the most cleanly, but
>    also has the greatest potential for conflict (eg SQLAlchemy and numpy have
>    ``where`` methods, as does ``tkinter.dnd.Icon`` in the standard library).
>
> 2. ``with``::
>
>        stuff = [(y, x/y) with y = f(x) for x in range(5)]
>
>    As above, but reusing the ``with`` keyword. Doesn't read too badly, and
>    needs no additional language keyword. Is restricted to comprehensions,
>    though, and cannot as easily be transformed into "longhand" for-loop
>    syntax. Has the C problem that an equals sign in an expression can now
>    create a name binding, rather than performing a comparison.
>
> 3. ``with... as``::
>
>        stuff = [(y, x/y) with f(x) as y for x in range(5)]
>
>    As per option 2, but using ``as`` in place of the equals sign. Aligns
>    syntactically with other uses of ``as`` for name binding, but a simple
>    transformation to for-loop longhand would create drastically different
>    semantics; the meaning of ``with`` inside a comprehension would be
>    completely different from the meaning as a stand-alone statement.
>
> 4. ``EXPR as NAME`` without parentheses::
>
>        stuff = [[f(x) as y, x/y] for x in range(5)]
>
>    Omitting the parentheses from this PEP's proposed syntax introduces many
>    syntactic ambiguities.
>
> 5. Adorning statement-local names with a leading dot::
>
>        stuff = [[(f(x) as .y), x/.y] for x in range(5)]
>
>    This has the advantage that leaked usage can be readily detected, removing
>    some forms of syntactic ambiguity.  However, this would be the only place
>    in Python where a variable's scope is encoded into its name, making
>    refactoring harder.  This syntax is quite viable, and could be promoted to
>    become the current recommendation if its advantages are found to outweigh
>    its cost.
>
> 6. Allowing ``(EXPR as NAME)`` to assign to any form of name.
>
>    This is exactly the same as the promoted proposal, save that the name is
>    bound in the same scope that it would otherwise have. Any expression can
>    assign to any name, just as it would if the ``=`` operator had been used.
>
>
> Discrepancies in the current implementation
> ===========================================
>
> 1. SLNBs are implemented using a special (and mostly-invisible) name
>    mangling.  They may sometimes appear in globals() and/or locals() with
>    their simple or mangled names (but buggily and unreliably). They should
>    be suppressed as though they were guinea pigs.
>
>
> References
> ==========
>
> .. [1] Proof of concept / reference implementation
>    (https://github.com/Rosuav/cpython/tree/statement-local-variables)
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
> ..
>    Local Variables:
>    mode: indented-text
>    indent-tabs-mode: nil
>    sentence-end-double-space: t
>    fill-column: 70
>    coding: utf-8
>    End:
