[Python-Dev] PEP 572: Now with 25% less reference implementation!

Dmitry Malinovsky damalinov at gmail.com
Fri Apr 20 00:45:37 EDT 2018


Hello Chris, and thank you for working on this PEP!

What do you think about using variable type hints with this syntax?
I tried to search through python-dev and couldn't find a single post
discussing that question.
If I missed it somehow, could you please include its conclusions into the PEP?

For instance, as I understand now the parser will fail on this snippet:

    while data: bytes := stream.read():
        print("Received data:", data)

Do brackets help?

    while (data: bytes := stream.read()):
        print("Received data:", data)

IIUC, in 3.7 It is invalid syntax to specify a type hint for a for loop item;
should brackets help? Currently they don't:

    Python 3.7.0b3+ (heads/3.7:7dcfd6c, Mar 30 2018, 21:30:34)
    [Clang 9.0.0 (clang-900.0.39.2)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> for (x: int) in [1,2,3]:
      File "<stdin>", line 1
        for (x: int) in [1,2,3]:
          ^
    SyntaxError: invalid syntax

Thanks,

> On 20 Apr 2018, at 09:10, Chris Angelico <rosuav at gmail.com> wrote:
> 
> Working on the reference implementation for PEP 572 is turning out to
> be a massive time sink, both on my personal schedule and on the PEP's
> discussion. I can't just hold off all discussion on a topic until I
> figure out whether something is possible or not, because that could
> take me several days, even a week or more. And considering the massive
> backlash against the proposal, it seems like a poor use of my time to
> try to prove that something's impossible, find that I don't know
> enough about grammar editing to be able to say anything more than
> "well, I couldn't do it, but someone else might be able to", and then
> try to resume the discussion with no more certainty than we had
> before.
> 
> So here's the PEP again, simplified. I'm fairly sure it's just going
> to be another on a growing list of rejected PEPs to my name, and I'm
> done with trying to argue some of these points. Either the rules get
> simplified, or they don't. Trying to simplify the rules and maintain
> perfect backward compatibility is just making the rules even more
> complicated.
> 
> PEP 572, if accepted, *will* change the behaviour of certain
> constructs inside comprehensions, mainly due to interactions with
> class scope that make no sense unless you know how they're implemented
> internally. The official tutorial pretends that comprehensions are
> "equivalent to" longhand:
> 
> https://docs.python.org/3/tutorial/datastructures.html?highlight=equivalent#list-comprehensions
> https://docs.python.org/3/howto/functional.html?highlight=equivalent#generator-expressions-and-list-comprehensions
> 
> and this is an inaccuracy for the sake of simplicity. PEP 572 will
> make this far more accurate; the only difference is that the
> comprehension is inside a function. Current semantics are far more
> bizarre than that.
> 
> Do you want absolutely 100% backward compatibility? Then reject this
> PEP. Or better still, keep using Python 3.7, and don't upgrade to 3.8,
> in case something breaks. Do you want list comprehensions that make
> better sense? Then accept that some code will need to change, if it
> tried to use the same name in multiple scopes, or tried to use ancient
> Python 2 semantics with a yield expression in the outermost iterable.
> 
> I'm pretty much ready for pronouncement.
> 
> https://www.python.org/dev/peps/pep-0572/
> 
> ChrisA
> 
> PEP: 572
> Title: Assignment Expressions
> Author: Chris Angelico <rosuav at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 28-Feb-2018
> Python-Version: 3.8
> Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018
> 
> 
> Abstract
> ========
> 
> This is a proposal for creating a way to assign to variables within an
> expression. Additionally, the precise scope of comprehensions is adjusted, to
> maintain consistency and follow expectations.
> 
> 
> Rationale
> =========
> 
> Naming the result of an expression is an important part of programming,
> allowing a descriptive name to be used in place of a longer expression,
> and permitting reuse.  Currently, this feature is available only in
> statement form, making it unavailable in list comprehensions and other
> expression contexts.  Merely introducing a way to assign as an expression
> would create bizarre edge cases around comprehensions, though, and to avoid
> the worst of the confusions, we change the definition of comprehensions,
> causing some edge cases to be interpreted differently, but maintaining the
> existing behaviour in the majority of situations.
> 
> 
> Syntax and semantics
> ====================
> 
> In any context where arbitrary Python expressions can be used, a **named
> expression** can appear. This is of the form ``target := expr`` where
> ``expr`` is any valid Python expression, and ``target`` is any valid
> assignment target.
> 
> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that the target is assigned
> that value::
> 
>    # Handle a matched regex
>    if (match := pattern.search(data)) is not None:
>        ...
> 
>    # A more explicit alternative to the 2-arg form of iter() invocation
>    while (value := read_next_item()) is not None:
>        ...
> 
>    # Share a subexpression between a comprehension filter clause and its output
>    filtered_data = [y for x in data if (y := f(x)) is not None]
> 
> 
> Differences from regular assignment statements
> ----------------------------------------------
> 
> Most importantly, since ``:=`` is an expression, it can be used in contexts
> where statements are illegal, including lambda functions and comprehensions.
> 
> An assignment statement can assign to multiple targets, left-to-right::
> 
>    x = y = z = 0
> 
> The equivalent assignment expression is parsed as separate binary operators,
> and is therefore processed right-to-left, as if it were spelled thus::
> 
>    assert 0 == (x := (y := (z := 0)))
> 
> Augmented assignment is not supported in expression form::
> 
>>>> x +:= 1
>      File "<stdin>", line 1
>        x +:= 1
>            ^
>    SyntaxError: invalid syntax
> 
> Otherwise, the semantics of assignment are identical in statement and
> expression forms.
> 
> 
> Alterations to comprehensions
> -----------------------------
> 
> The current behaviour of list/set/dict comprehensions and generator
> expressions has some edge cases that would behave strangely if an assignment
> expression were to be used. Therefore the proposed semantics are changed,
> removing the current edge cases, and instead altering their behaviour *only*
> in a class scope.
> 
> As of Python 3.7, the outermost iterable of any comprehension is evaluated
> in the surrounding context, and then passed as an argument to the implicit
> function that evaluates the comprehension.
> 
> Under this proposal, the entire body of the comprehension is evaluated in
> its implicit function. Names not assigned to within the comprehension are
> located in the surrounding scopes, as with normal lookups. As one special
> case, a comprehension at class scope will **eagerly bind** any name which
> is already defined in the class scope.
> 
> A list comprehension can be unrolled into an equivalent function. With
> Python 3.7 semantics::
> 
>    numbers = [x + y for x in range(3) for y in range(4)]
>    # Is approximately equivalent to
>    def <listcomp>(iterator):
>        result = []
>        for x in iterator:
>            for y in range(4):
>                result.append(x + y)
>        return result
>    numbers = <listcomp>(iter(range(3)))
> 
> Under the new semantics, this would instead be equivalent to::
> 
>    def <listcomp>():
>        result = []
>        for x in range(3):
>            for y in range(4):
>                result.append(x + y)
>        return result
>    numbers = <listcomp>()
> 
> When a class scope is involved, a naive transformation into a function would
> prevent name lookups (as the function would behave like a method)::
> 
>    class X:
>        names = ["Fred", "Barney", "Joe"]
>        prefix = "> "
>        prefixed_names = [prefix + name for name in names]
> 
> With Python 3.7 semantics, this will evaluate the outermost iterable at class
> scope, which will succeed; but it will evaluate everything else in a function::
> 
>    class X:
>        names = ["Fred", "Barney", "Joe"]
>        prefix = "> "
>        def <listcomp>(iterator):
>            result = []
>            for name in iterator:
>                result.append(prefix + name)
>            return result
>        prefixed_names = <listcomp>(iter(names))
> 
> The name ``prefix`` is thus searched for at global scope, ignoring the class
> name. Under the proposed semantics, this name will be eagerly bound; and the
> same early binding then handles the outermost iterable as well. The list
> comprehension is thus approximately equivalent to::
> 
>    class X:
>        names = ["Fred", "Barney", "Joe"]
>        prefix = "> "
>        def <listcomp>(names=names, prefix=prefix):
>            result = []
>            for name in names:
>                result.append(prefix + name)
>            return result
>        prefixed_names = <listcomp>()
> 
> With list comprehensions, this is unlikely to cause any confusion. With
> generator expressions, this has the potential to affect behaviour, as the
> eager binding means that the name could be rebound between the creation of
> the genexp and the first call to ``next()``. It is, however, more closely
> aligned to normal expectations.  The effect is ONLY seen with names that
> are looked up from class scope; global names (eg ``range()``) will still
> be late-bound as usual.
> 
> One consequence of this change is that certain bugs in genexps will not
> be detected until the first call to ``next()``, where today they would be
> caught upon creation of the generator.
> 
> 
> Recommended use-cases
> =====================
> 
> Simplifying list comprehensions
> -------------------------------
> 
> A list comprehension can map and filter efficiently by capturing
> the condition::
> 
>    results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
> 
> Similarly, a subexpression can be reused within the main expression, by
> giving it a name on first use::
> 
>    stuff = [[y := f(x), x/y] for x in range(5)]
> 
>    # There are a number of less obvious ways to spell this in current
>    # versions of Python, such as:
> 
>    # Inline helper function
>    stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
> 
>    # Extra 'for' loop - potentially could be optimized internally
>    stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
> 
>    # Using a mutable cache object (various forms possible)
>    c = {}
>    stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
> 
> In all cases, the name is local to the comprehension; like iteration variables,
> it cannot leak out into the surrounding context.
> 
> 
> Capturing condition values
> --------------------------
> 
> Assignment expressions can be used to good effect in the header of
> an ``if`` or ``while`` statement::
> 
>    # Proposed syntax
>    while (command := input("> ")) != "quit":
>        print("You entered:", command)
> 
>    # Capturing regular expression match objects
>    # See, for instance, Lib/pydoc.py, which uses a multiline spelling
>    # of this effect
>    if match := re.search(pat, text):
>        print("Found:", match.group(0))
> 
>    # Reading socket data until an empty string is returned
>    while data := sock.read():
>        print("Received data:", data)
> 
>    # Equivalent in current Python, not caring about function return value
>    while input("> ") != "quit":
>        print("You entered a command.")
> 
>    # To capture the return value in current Python demands a four-line
>    # loop header.
>    while True:
>        command = input("> ");
>        if command == "quit":
>            break
>        print("You entered:", command)
> 
> Particularly with the ``while`` loop, this can remove the need to have an
> infinite loop, an assignment, and a condition. It also creates a smooth
> parallel between a loop which simply uses a function call as its condition,
> and one which uses that as its condition but also uses the actual value.
> 
> 
> Rejected alternative proposals
> ==============================
> 
> Proposals broadly similar to this one have come up frequently on python-ideas.
> Below are a number of alternative syntaxes, some of them specific to
> comprehensions, which have been rejected in favour of the one given above.
> 
> 
> Alternative spellings
> ---------------------
> 
> Broadly the same semantics as the current proposal, but spelled differently.
> 
> 1. ``EXPR as NAME``::
> 
>       stuff = [[f(x) as y, x/y] for x in range(5)]
> 
>   Since ``EXPR as NAME`` already has meaning in ``except`` and ``with``
>   statements (with different semantics), this would create unnecessary
>   confusion or require special-casing (eg to forbid assignment within the
>   headers of these statements).
> 
> 2. ``EXPR -> NAME``::
> 
>       stuff = [[f(x) -> y, x/y] for x in range(5)]
> 
>   This syntax is inspired by languages such as R and Haskell, and some
>   programmable calculators. (Note that a left-facing arrow ``y <- f(x)`` is
>   not possible in Python, as it would be interpreted as less-than and unary
>   minus.) This syntax has a slight advantage over 'as' in that it does not
>   conflict with ``with`` and ``except`` statements, but otherwise is
>   equivalent.
> 
> 3. Adorning statement-local names with a leading dot::
> 
>       stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
>       stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
> 
>   This has the advantage that leaked usage can be readily detected, removing
>   some forms of syntactic ambiguity.  However, this would be the only place
>   in Python where a variable's scope is encoded into its name, making
>   refactoring harder.
> 
> 4. Adding a ``where:`` to any statement to create local name bindings::
> 
>       value = x**2 + 2*x where:
>           x = spam(1, 4, 7, q)
> 
>   Execution order is inverted (the indented body is performed first, followed
>   by the "header").  This requires a new keyword, unless an existing keyword
>   is repurposed (most likely ``with:``).  See PEP 3150 for prior discussion
>   on this subject (with the proposed keyword being ``given:``).
> 
> 5. ``TARGET from EXPR``::
> 
>       stuff = [[y from f(x), x/y] for x in range(5)]
> 
>   This syntax has fewer conflicts than ``as`` does (conflicting only with the
>   ``raise Exc from Exc`` notation), but is otherwise comparable to it. Instead
>   of paralleling ``with expr as target:`` (which can be useful but can also be
>   confusing), this has no parallels, but is evocative.
> 
> 
> Special-casing conditional statements
> -------------------------------------
> 
> One of the most popular use-cases is ``if`` and ``while`` statements.  Instead
> of a more general solution, this proposal enhances the syntax of these two
> statements to add a means of capturing the compared value::
> 
>    if re.search(pat, text) as match:
>        print("Found:", match.group(0))
> 
> This works beautifully if and ONLY if the desired condition is based on the
> truthiness of the captured value.  It is thus effective for specific
> use-cases (regex matches, socket reads that return `''` when done), and
> completely useless in more complicated cases (eg where the condition is
> ``f(x) < 0`` and you want to capture the value of ``f(x)``).  It also has
> no benefit to list comprehensions.
> 
> Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction
> of possible use-cases, even in ``if``/``while`` statements.
> 
> 
> Special-casing comprehensions
> -----------------------------
> 
> Another common use-case is comprehensions (list/set/dict, and genexps). As
> above, proposals have been made for comprehension-specific solutions.
> 
> 1. ``where``, ``let``, or ``given``::
> 
>       stuff = [(y, x/y) where y = f(x) for x in range(5)]
>       stuff = [(y, x/y) let y = f(x) for x in range(5)]
>       stuff = [(y, x/y) given y = f(x) for x in range(5)]
> 
>   This brings the subexpression to a location in between the 'for' loop and
>   the expression. It introduces an additional language keyword, which creates
>   conflicts. Of the three, ``where`` reads the most cleanly, but also has the
>   greatest potential for conflict (eg SQLAlchemy and numpy have ``where``
>   methods, as does ``tkinter.dnd.Icon`` in the standard library).
> 
> 2. ``with NAME = EXPR``::
> 
>       stuff = [(y, x/y) with y = f(x) for x in range(5)]
> 
>   As above, but reusing the `with` keyword. Doesn't read too badly, and needs
>   no additional language keyword. Is restricted to comprehensions, though,
>   and cannot as easily be transformed into "longhand" for-loop syntax. Has
>   the C problem that an equals sign in an expression can now create a name
>   binding, rather than performing a comparison. Would raise the question of
>   why "with NAME = EXPR:" cannot be used as a statement on its own.
> 
> 3. ``with EXPR as NAME``::
> 
>       stuff = [(y, x/y) with f(x) as y for x in range(5)]
> 
>   As per option 2, but using ``as`` rather than an equals sign. Aligns
>   syntactically with other uses of ``as`` for name binding, but a simple
>   transformation to for-loop longhand would create drastically different
>   semantics; the meaning of ``with`` inside a comprehension would be
>   completely different from the meaning as a stand-alone statement, while
>   retaining identical syntax.
> 
> Regardless of the spelling chosen, this introduces a stark difference between
> comprehensions and the equivalent unrolled long-hand form of the loop.  It is
> no longer possible to unwrap the loop into statement form without reworking
> any name bindings.  The only keyword that can be repurposed to this task is
> ``with``, thus giving it sneakily different semantics in a comprehension than
> in a statement; alternatively, a new keyword is needed, with all the costs
> therein.
> 
> 
> Lowering operator precedence
> ----------------------------
> 
> There are two logical precedences for the ``:=`` operator. Either it should
> bind as loosely as possible, as does statement-assignment; or it should bind
> more tightly than comparison operators. Placing its precedence between the
> comparison and arithmetic operators (to be precise: just lower than bitwise
> OR) allows most uses inside ``while`` and ``if`` conditions to be spelled
> without parentheses, as it is most likely that you wish to capture the value
> of something, then perform a comparison on it::
> 
>    pos = -1
>    while pos := buffer.find(search_term, pos + 1) >= 0:
>        ...
> 
> Once find() returns -1, the loop terminates. If ``:=`` binds as loosely as
> ``=`` does, this would capture the result of the comparison (generally either
> ``True`` or ``False``), which is less useful.
> 
> While this behaviour would be convenient in many situations, it is also harder
> to explain than "the := operator behaves just like the assignment statement",
> and as such, the precedence for ``:=`` has been made as close as possible to
> that of ``=``.
> 
> 
> Migration path
> ==============
> 
> The semantic changes to list/set/dict comprehensions, and more so to generator
> expressions, may potentially require migration of code. In many cases, the
> changes simply make legal what used to raise an exception, but there are some
> edge cases that were previously legal and now are not, and a few corner cases
> with altered semantics.
> 
> 
> The Outermost Iterable
> ----------------------
> 
> As of Python 3.7, the outermost iterable in a comprehension is special: it is
> evaluated in the surrounding context, instead of inside the comprehension.
> Thus it is permitted to contain a ``yield`` expression, to use a name also
> used elsewhere, and to reference names from class scope. Also, in a genexp,
> the outermost iterable is pre-evaluated, but the rest of the code is not
> touched until the genexp is first iterated over. Class scope is now handled
> more generally (see above), but if other changes require the old behaviour,
> the iterable must be explicitly elevated from the comprehension::
> 
>    # Python 3.7
>    def f(x):
>        return [x for x in x if x]
>    def g():
>        return [x for x in [(yield 1)]]
>    # With PEP 572
>    def f(x):
>        return [y for y in x if y]
>    def g():
>        sent_item = (yield 1)
>        return [x for x in [sent_item]]
> 
> This more clearly shows that it is g(), not the comprehension, which is able
> to yield values (and is thus a generator function). The entire comprehension
> is consistently in a single scope.
> 
> The following expressions would, in Python 3.7, raise exceptions immediately.
> With the removal of the outermost iterable's special casing, they are now
> equivalent to the most obvious longhand form::
> 
>    gen = (x for x in rage(10)) # NameError
>    gen = (x for x in 10) # TypeError (not iterable)
>    gen = (x for x in range(1/0)) # ZeroDivisionError
> 
>    def <genexp>():
>        for x in rage(10):
>            yield x
>    gen = <genexp>() # No exception yet
>    tng = next(gen) # NameError
> 
> 
> Open questions
> ==============
> 
> Importing names into comprehensions
> -----------------------------------
> 
> A list comprehension can use and update local names, and they will retain
> their values from one iteration to another. It would be convenient to use
> this feature to create rolling or self-effecting data streams::
> 
>    progressive_sums = [total := total + value for value in data]
> 
> This will fail with UnboundLocalError due to ``total`` not being initalized.
> Simply initializing it outside of the comprehension is insufficient - unless
> the comprehension is in class scope::
> 
>    class X:
>        total = 0
>        progressive_sums = [total := total + value for value in data]
> 
> At other scopes, it may be beneficial to have a way to fetch a value from the
> surrounding scope. Should this be automatic? Should it be controlled with a
> keyword? Hypothetically (and using no new keywords), this could be written::
> 
>    total = 0
>    progressive_sums = [total := total + value
>        import nonlocal total
>        for value in data]
> 
> Translated into longhand, this would become::
> 
>    total = 0
>    def <listcomp>(total=total):
>        result = []
>        for value in data:
>            result.append(total := total + value)
>        return result
>    progressive_sums = <listcomp>()
> 
> ie utilizing the same early-binding technique that is used at class scope.
> 
> 
> Frequently Raised Objections
> ============================
> 
> Why not just turn existing assignment into an expression?
> ---------------------------------------------------------
> 
> C and its derivatives define the ``=`` operator as an expression, rather than
> a statement as is Python's way.  This allows assignments in more contexts,
> including contexts where comparisons are more common.  The syntactic similarity
> between ``if (x == y)`` and ``if (x = y)`` belies their drastically different
> semantics.  Thus this proposal uses ``:=`` to clarify the distinction.
> 
> 
> This could be used to create ugly code!
> ---------------------------------------
> 
> So can anything else.  This is a tool, and it is up to the programmer to use it
> where it makes sense, and not use it where superior constructs can be used.
> 
> 
> With assignment expressions, why bother with assignment statements?
> -------------------------------------------------------------------
> 
> The two forms have different flexibilities.  The ``:=`` operator can be used
> inside a larger expression; the ``=`` statement can be augmented to ``+=`` and
> its friends. The assignment statement is a clear declaration of intent: this
> value is to be assigned to this target, and that's it.
> 
> 
> Why not use a sublocal scope and prevent namespace pollution?
> -------------------------------------------------------------
> 
> Previous revisions of this proposal involved sublocal scope (restricted to a
> single statement), preventing name leakage and namespace pollution.  While a
> definite advantage in a number of situations, this increases complexity in
> many others, and the costs are not justified by the benefits. In the interests
> of language simplicity, the name bindings created here are exactly equivalent
> to any other name bindings, including that usage at class or module scope will
> create externally-visible names.  This is no different from ``for`` loops or
> other constructs, and can be solved the same way: ``del`` the name once it is
> no longer needed, or prefix it with an underscore.
> 
> Names bound within a comprehension are local to that comprehension, even in
> the outermost iterable, and can thus be used freely without polluting the
> surrounding namespace.
> 
> (The author wishes to thank Guido van Rossum and Christoph Groth for their
> suggestions to move the proposal in this direction. [2]_)
> 
> 
> Style guide recommendations
> ===========================
> 
> As this adds another way to spell some of the same effects as can already be
> done, it is worth noting a few broad recommendations. These could be included
> in PEP 8 and/or other style guides.
> 
> 1. If either assignment statements or assignment expressions can be
>   used, prefer statements; they are a clear declaration of intent.
> 
> 2. If using assignment expressions would lead to ambiguity about
>   execution order, restructure it to use statements instead.
> 
> 
> Acknowledgements
> ================
> 
> The author wishes to thank Guido van Rossum and Nick Coghlan for their
> considerable contributions to this proposal, and to members of the
> core-mentorship mailing list for assistance with implementation.
> 
> 
> References
> ==========
> 
> .. [1] Proof of concept / reference implementation
>   (https://github.com/Rosuav/cpython/tree/assignment-expressions)
> .. [2] Pivotal post regarding inline assignment semantics
>   (https://mail.python.org/pipermail/python-ideas/2018-March/049409.html)
> 
> 
> Copyright
> =========
> 
> This document has been placed in the public domain.
> 
> 
> ..
>   Local Variables:
>   mode: indented-text
>   indent-tabs-mode: nil
>   sentence-end-double-space: t
>   fill-column: 70
>   coding: utf-8
>   End:
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/damalinov%40gmail.com



More information about the Python-Dev mailing list