Working on the reference implementation for PEP 572 is turning out to be a massive time sink, both on my personal schedule and on the PEP's discussion. I can't just hold off all discussion on a topic until I figure out whether something is possible or not, because that could take me several days, even a week or more. And considering the massive backlash against the proposal, it seems like a poor use of my time to try to prove that something's impossible, find that I don't know enough about grammar editing to be able to say anything more than "well, I couldn't do it, but someone else might be able to", and then try to resume the discussion with no more certainty than we had before.
So here's the PEP again, simplified. I'm fairly sure it's just going to be another on a growing list of rejected PEPs to my name, and I'm done with trying to argue some of these points. Either the rules get simplified, or they don't. Trying to simplify the rules and maintain perfect backward compatibility is just making the rules even more complicated.
PEP 572, if accepted, will change the behaviour of certain constructs inside comprehensions, mainly due to interactions with class scope that make no sense unless you know how they're implemented internally. The official tutorial pretends that comprehensions are "equivalent to" longhand:
https://docs.python.org/3/tutorial/datastructures.html?highlight=equivalent#... https://docs.python.org/3/howto/functional.html?highlight=equivalent#generat...
and this is an inaccuracy for the sake of simplicity. PEP 572 will make this far more accurate; the only difference is that the comprehension is inside a function. Current semantics are far more bizarre than that.
Do you want absolutely 100% backward compatibility? Then reject this PEP. Or better still, keep using Python 3.7, and don't upgrade to 3.8, in case something breaks. Do you want list comprehensions that make better sense? Then accept that some code will need to change, if it tried to use the same name in multiple scopes, or tried to use ancient Python 2 semantics with a yield expression in the outermost iterable.
I'm pretty much ready for pronouncement.
https://www.python.org/dev/peps/pep-0572/
ChrisA
PEP: 572 Title: Assignment Expressions Author: Chris Angelico rosuav@gmail.com Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 28-Feb-2018 Python-Version: 3.8 Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018
This is a proposal for creating a way to assign to variables within an expression. Additionally, the precise scope of comprehensions is adjusted, to maintain consistency and follow expectations.
Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.
In any context where arbitrary Python expressions can be used, a named
expression can appear. This is of the form target := expr
where
expr
is any valid Python expression, and target
is any valid
assignment target.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value::
# Handle a matched regex
if (match := pattern.search(data)) is not None:
...
# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None:
...
# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
Most importantly, since :=
is an expression, it can be used in contexts
where statements are illegal, including lambda functions and comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
x = y = z = 0
The equivalent assignment expression is parsed as separate binary operators, and is therefore processed right-to-left, as if it were spelled thus::
assert 0 == (x := (y := (z := 0)))
Augmented assignment is not supported in expression form::
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
Otherwise, the semantics of assignment are identical in statement and expression forms.
The current behaviour of list/set/dict comprehensions and generator expressions has some edge cases that would behave strangely if an assignment expression were to be used. Therefore the proposed semantics are changed, removing the current edge cases, and instead altering their behaviour only in a class scope.
As of Python 3.7, the outermost iterable of any comprehension is evaluated in the surrounding context, and then passed as an argument to the implicit function that evaluates the comprehension.
Under this proposal, the entire body of the comprehension is evaluated in its implicit function. Names not assigned to within the comprehension are located in the surrounding scopes, as with normal lookups. As one special case, a comprehension at class scope will eagerly bind any name which is already defined in the class scope.
A list comprehension can be unrolled into an equivalent function. With Python 3.7 semantics::
numbers = [x + y for x in range(3) for y in range(4)]
# Is approximately equivalent to
def <listcomp>(iterator):
result = []
for x in iterator:
for y in range(4):
result.append(x + y)
return result
numbers = <listcomp>(iter(range(3)))
Under the new semantics, this would instead be equivalent to::
def <listcomp>():
result = []
for x in range(3):
for y in range(4):
result.append(x + y)
return result
numbers = <listcomp>()
When a class scope is involved, a naive transformation into a function would prevent name lookups (as the function would behave like a method)::
class X:
names = ["Fred", "Barney", "Joe"]
prefix = "> "
prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics, this will evaluate the outermost iterable at class scope, which will succeed; but it will evaluate everything else in a function::
class X:
names = ["Fred", "Barney", "Joe"]
prefix = "> "
def <listcomp>(iterator):
result = []
for name in iterator:
result.append(prefix + name)
return result
prefixed_names = <listcomp>(iter(names))
The name prefix
is thus searched for at global scope, ignoring the class
name. Under the proposed semantics, this name will be eagerly bound; and the
same early binding then handles the outermost iterable as well. The list
comprehension is thus approximately equivalent to::
class X:
names = ["Fred", "Barney", "Joe"]
prefix = "> "
def <listcomp>(names=names, prefix=prefix):
result = []
for name in names:
result.append(prefix + name)
return result
prefixed_names = <listcomp>()
With list comprehensions, this is unlikely to cause any confusion. With
generator expressions, this has the potential to affect behaviour, as the
eager binding means that the name could be rebound between the creation of
the genexp and the first call to next()
. It is, however, more closely
aligned to normal expectations. The effect is ONLY seen with names that
are looked up from class scope; global names (eg range()
) will still
be late-bound as usual.
One consequence of this change is that certain bugs in genexps will not
be detected until the first call to next()
, where today they would be
caught upon creation of the generator.
A list comprehension can map and filter efficiently by capturing the condition::
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
Similarly, a subexpression can be reused within the main expression, by giving it a name on first use::
stuff = [[y := f(x), x/y] for x in range(5)]
# There are a number of less obvious ways to spell this in current
# versions of Python, such as:
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
# Extra 'for' loop - potentially could be optimized internally
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Using a mutable cache object (various forms possible)
c = {}
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
In all cases, the name is local to the comprehension; like iteration variables, it cannot leak out into the surrounding context.
Assignment expressions can be used to good effect in the header of
an if
or while
statement::
# Proposed syntax
while (command := input("> ")) != "quit":
print("You entered:", command)
# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text):
print("Found:", match.group(0))
# Reading socket data until an empty string is returned
while data := sock.read():
print("Received data:", data)
# Equivalent in current Python, not caring about function return value
while input("> ") != "quit":
print("You entered a command.")
# To capture the return value in current Python demands a four-line
# loop header.
while True:
command = input("> ");
if command == "quit":
break
print("You entered:", command)
Particularly with the while
loop, this can remove the need to have an
infinite loop, an assignment, and a condition. It also creates a smooth
parallel between a loop which simply uses a function call as its condition,
and one which uses that as its condition but also uses the actual value.
Proposals broadly similar to this one have come up frequently on python-ideas. Below are a number of alternative syntaxes, some of them specific to comprehensions, which have been rejected in favour of the one given above.
Broadly the same semantics as the current proposal, but spelled differently.
EXPR as NAME
::
stuff = [[f(x) as y, x/y] for x in range(5)]
Since EXPR as NAME
already has meaning in except
and
with
statements (with different semantics), this would create unnecessary
confusion or require special-casing (eg to forbid assignment within the
headers of these statements).
EXPR -> NAME
::
stuff = [[f(x) -> y, x/y] for x in range(5)]
This syntax is inspired by languages such as R and Haskell, and some
programmable calculators. (Note that a left-facing arrow y <- f(x)
is
not possible in Python, as it would be interpreted as less-than and unary
minus.) This syntax has a slight advantage over 'as' in that it does not
conflict with with
and except
statements, but otherwise is
equivalent.
Adorning statement-local names with a leading dot::
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
This has the advantage that leaked usage can be readily detected, removing some forms of syntactic ambiguity. However, this would be the only place in Python where a variable's scope is encoded into its name, making refactoring harder.
Adding a where:
to any statement to create local name bindings::
value = x**2 + 2*x where:
x = spam(1, 4, 7, q)
Execution order is inverted (the indented body is performed first, followed
by the "header"). This requires a new keyword, unless an existing keyword
is repurposed (most likely with:
). See PEP 3150 for prior discussion
on this subject (with the proposed keyword being given:
).
TARGET from EXPR
::
stuff = [[y from f(x), x/y] for x in range(5)]
This syntax has fewer conflicts than as
does (conflicting only with the
raise Exc from Exc
notation), but is otherwise comparable to it. Instead
of paralleling with expr as target:
(which can be useful but can also be
confusing), this has no parallels, but is evocative.
One of the most popular use-cases is if
and while
statements.
Instead
of a more general solution, this proposal enhances the syntax of these two
statements to add a means of capturing the compared value::
if re.search(pat, text) as match:
print("Found:", match.group(0))
This works beautifully if and ONLY if the desired condition is based on the
truthiness of the captured value. It is thus effective for specific
use-cases (regex matches, socket reads that return ''
when done), and
completely useless in more complicated cases (eg where the condition is
f(x) < 0
and you want to capture the value of f(x)
). It also
has
no benefit to list comprehensions.
Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction
of possible use-cases, even in if
/while
statements.
Another common use-case is comprehensions (list/set/dict, and genexps). As above, proposals have been made for comprehension-specific solutions.
where
, let
, or given
::
stuff = [(y, x/y) where y = f(x) for x in range(5)]
stuff = [(y, x/y) let y = f(x) for x in range(5)]
stuff = [(y, x/y) given y = f(x) for x in range(5)]
This brings the subexpression to a location in between the 'for' loop and
the expression. It introduces an additional language keyword, which creates
conflicts. Of the three, where
reads the most cleanly, but also has the
greatest potential for conflict (eg SQLAlchemy and numpy have where
methods, as does tkinter.dnd.Icon
in the standard library).
with NAME = EXPR
::
stuff = [(y, x/y) with y = f(x) for x in range(5)]
As above, but reusing the with
keyword. Doesn't read too badly, and needs
no additional language keyword. Is restricted to comprehensions, though,
and cannot as easily be transformed into "longhand" for-loop syntax. Has
the C problem that an equals sign in an expression can now create a name
binding, rather than performing a comparison. Would raise the question of
why "with NAME = EXPR:" cannot be used as a statement on its own.
with EXPR as NAME
::
stuff = [(y, x/y) with f(x) as y for x in range(5)]
As per option 2, but using as
rather than an equals sign. Aligns
syntactically with other uses of as
for name binding, but a simple
transformation to for-loop longhand would create drastically different
semantics; the meaning of with
inside a comprehension would be
completely different from the meaning as a stand-alone statement, while
retaining identical syntax.
Regardless of the spelling chosen, this introduces a stark difference between
comprehensions and the equivalent unrolled long-hand form of the loop. It is
no longer possible to unwrap the loop into statement form without reworking
any name bindings. The only keyword that can be repurposed to this task is
with
, thus giving it sneakily different semantics in a comprehension than
in a statement; alternatively, a new keyword is needed, with all the costs
therein.
There are two logical precedences for the :=
operator. Either it should
bind as loosely as possible, as does statement-assignment; or it should bind
more tightly than comparison operators. Placing its precedence between the
comparison and arithmetic operators (to be precise: just lower than bitwise
OR) allows most uses inside while
and if
conditions to be
spelled
without parentheses, as it is most likely that you wish to capture the value
of something, then perform a comparison on it::
pos = -1
while pos := buffer.find(search_term, pos + 1) >= 0:
...
Once find() returns -1, the loop terminates. If :=
binds as loosely as
=
does, this would capture the result of the comparison (generally either
True
or False
), which is less useful.
While this behaviour would be convenient in many situations, it is also harder
to explain than "the := operator behaves just like the assignment statement",
and as such, the precedence for :=
has been made as close as possible to
that of =
.
The semantic changes to list/set/dict comprehensions, and more so to generator expressions, may potentially require migration of code. In many cases, the changes simply make legal what used to raise an exception, but there are some edge cases that were previously legal and now are not, and a few corner cases with altered semantics.
As of Python 3.7, the outermost iterable in a comprehension is special: it is
evaluated in the surrounding context, instead of inside the comprehension.
Thus it is permitted to contain a yield
expression, to use a name also
used elsewhere, and to reference names from class scope. Also, in a genexp,
the outermost iterable is pre-evaluated, but the rest of the code is not
touched until the genexp is first iterated over. Class scope is now handled
more generally (see above), but if other changes require the old behaviour,
the iterable must be explicitly elevated from the comprehension::
# Python 3.7
def f(x):
return [x for x in x if x]
def g():
return [x for x in [(yield 1)]]
# With PEP 572
def f(x):
return [y for y in x if y]
def g():
sent_item = (yield 1)
return [x for x in [sent_item]]
This more clearly shows that it is g(), not the comprehension, which is able to yield values (and is thus a generator function). The entire comprehension is consistently in a single scope.
The following expressions would, in Python 3.7, raise exceptions immediately. With the removal of the outermost iterable's special casing, they are now equivalent to the most obvious longhand form::
gen = (x for x in rage(10)) # NameError
gen = (x for x in 10) # TypeError (not iterable)
gen = (x for x in range(1/0)) # ZeroDivisionError
def <genexp>():
for x in rage(10):
yield x
gen = <genexp>() # No exception yet
tng = next(gen) # NameError
A list comprehension can use and update local names, and they will retain their values from one iteration to another. It would be convenient to use this feature to create rolling or self-effecting data streams::
progressive_sums = [total := total + value for value in data]
This will fail with UnboundLocalError due to total
not being initalized.
Simply initializing it outside of the comprehension is insufficient - unless
the comprehension is in class scope::
class X:
total = 0
progressive_sums = [total := total + value for value in data]
At other scopes, it may be beneficial to have a way to fetch a value from the surrounding scope. Should this be automatic? Should it be controlled with a keyword? Hypothetically (and using no new keywords), this could be written::
total = 0
progressive_sums = [total := total + value
import nonlocal total
for value in data]
Translated into longhand, this would become::
total = 0
def <listcomp>(total=total):
result = []
for value in data:
result.append(total := total + value)
return result
progressive_sums = <listcomp>()
ie utilizing the same early-binding technique that is used at class scope.
C and its derivatives define the =
operator as an expression, rather than
a statement as is Python's way. This allows assignments in more contexts,
including contexts where comparisons are more common. The syntactic similarity
between if (x == y)
and if (x = y)
belies their drastically
different
semantics. Thus this proposal uses :=
to clarify the distinction.
So can anything else. This is a tool, and it is up to the programmer to use it where it makes sense, and not use it where superior constructs can be used.
The two forms have different flexibilities. The :=
operator can be used
inside a larger expression; the =
statement can be augmented to
+=
and
its friends. The assignment statement is a clear declaration of intent: this
value is to be assigned to this target, and that's it.
Previous revisions of this proposal involved sublocal scope (restricted to a
single statement), preventing name leakage and namespace pollution. While a
definite advantage in a number of situations, this increases complexity in
many others, and the costs are not justified by the benefits. In the interests
of language simplicity, the name bindings created here are exactly equivalent
to any other name bindings, including that usage at class or module scope will
create externally-visible names. This is no different from for
loops or
other constructs, and can be solved the same way: del
the name once it is
no longer needed, or prefix it with an underscore.
Names bound within a comprehension are local to that comprehension, even in the outermost iterable, and can thus be used freely without polluting the surrounding namespace.
(The author wishes to thank Guido van Rossum and Christoph Groth for their suggestions to move the proposal in this direction. [2]_)
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
The author wishes to thank Guido van Rossum and Nick Coghlan for their considerable contributions to this proposal, and to members of the core-mentorship mailing list for assistance with implementation.
.. [1] Proof of concept / reference implementation (https://github.com/Rosuav/cpython/tree/assignment-expressions) .. [2] Pivotal post regarding inline assignment semantics (https://mail.python.org/pipermail/python-ideas/2018-March/049409.html)
This document has been placed in the public domain.
.. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End:
Hello Chris, and thank you for working on this PEP!
What do you think about using variable type hints with this syntax? I tried to search through python-dev and couldn't find a single post discussing that question. If I missed it somehow, could you please include its conclusions into the PEP?
For instance, as I understand now the parser will fail on this snippet:
while data: bytes := stream.read():
print("Received data:", data)
Do brackets help?
while (data: bytes := stream.read()):
print("Received data:", data)
IIUC, in 3.7 It is invalid syntax to specify a type hint for a for loop item; should brackets help? Currently they don't:
Python 3.7.0b3+ (heads/3.7:7dcfd6c, Mar 30 2018, 21:30:34)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> for (x: int) in [1,2,3]:
File "<stdin>", line 1
for (x: int) in [1,2,3]:
^
SyntaxError: invalid syntax
Thanks,
On 20 Apr 2018, at 09:10, Chris Angelico rosuav@gmail.com wrote:
Working on the reference implementation for PEP 572 is turning out to be a massive time sink, both on my personal schedule and on the PEP's discussion. I can't just hold off all discussion on a topic until I figure out whether something is possible or not, because that could take me several days, even a week or more. And considering the massive backlash against the proposal, it seems like a poor use of my time to try to prove that something's impossible, find that I don't know enough about grammar editing to be able to say anything more than "well, I couldn't do it, but someone else might be able to", and then try to resume the discussion with no more certainty than we had before.
So here's the PEP again, simplified. I'm fairly sure it's just going to be another on a growing list of rejected PEPs to my name, and I'm done with trying to argue some of these points. Either the rules get simplified, or they don't. Trying to simplify the rules and maintain perfect backward compatibility is just making the rules even more complicated.
PEP 572, if accepted, will change the behaviour of certain constructs inside comprehensions, mainly due to interactions with class scope that make no sense unless you know how they're implemented internally. The official tutorial pretends that comprehensions are "equivalent to" longhand:
https://docs.python.org/3/tutorial/datastructures.html?highlight=equivalent#... https://docs.python.org/3/howto/functional.html?highlight=equivalent#generat...
and this is an inaccuracy for the sake of simplicity. PEP 572 will make this far more accurate; the only difference is that the comprehension is inside a function. Current semantics are far more bizarre than that.
Do you want absolutely 100% backward compatibility? Then reject this PEP. Or better still, keep using Python 3.7, and don't upgrade to 3.8, in case something breaks. Do you want list comprehensions that make better sense? Then accept that some code will need to change, if it tried to use the same name in multiple scopes, or tried to use ancient Python 2 semantics with a yield expression in the outermost iterable.
I'm pretty much ready for pronouncement.
https://www.python.org/dev/peps/pep-0572/
ChrisA
PEP: 572 Title: Assignment Expressions Author: Chris Angelico rosuav@gmail.com Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 28-Feb-2018 Python-Version: 3.8 Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018
This is a proposal for creating a way to assign to variables within an expression. Additionally, the precise scope of comprehensions is adjusted, to maintain consistency and follow expectations.
Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.
In any context where arbitrary Python expressions can be used, a named
expression can appear. This is of the form target := expr
where
expr
is any valid Python expression, and target
is any valid
assignment target.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value::
# Handle a matched regex
if (match := pattern.search(data)) is not None: ...
# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None: ...
# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
Most importantly, since :=
is an expression, it can be used in contexts
where statements are illegal, including lambda functions and comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
x = y = z = 0
The equivalent assignment expression is parsed as separate binary operators, and is therefore processed right-to-left, as if it were spelled thus::
assert 0 == (x := (y := (z := 0)))
Augmented assignment is not supported in expression form::
x +:= 1 File "<stdin>", line 1 x +:= 1 ^ SyntaxError: invalid syntax
Otherwise, the semantics of assignment are identical in statement and expression forms.
The current behaviour of list/set/dict comprehensions and generator expressions has some edge cases that would behave strangely if an assignment expression were to be used. Therefore the proposed semantics are changed, removing the current edge cases, and instead altering their behaviour only in a class scope.
As of Python 3.7, the outermost iterable of any comprehension is evaluated in the surrounding context, and then passed as an argument to the implicit function that evaluates the comprehension.
Under this proposal, the entire body of the comprehension is evaluated in its implicit function. Names not assigned to within the comprehension are located in the surrounding scopes, as with normal lookups. As one special case, a comprehension at class scope will eagerly bind any name which is already defined in the class scope.
A list comprehension can be unrolled into an equivalent function. With Python 3.7 semantics::
numbers = [x + y for x in range(3) for y in range(4)]
# Is approximately equivalent to
def <listcomp>(iterator): result = [] for x in iterator: for y in range(4): result.append(x + y) return result numbers = <listcomp>(iter(range(3)))
Under the new semantics, this would instead be equivalent to::
def <listcomp>(): result = [] for x in range(3): for y in range(4): result.append(x + y) return result numbers = <listcomp>()
When a class scope is involved, a naive transformation into a function would prevent name lookups (as the function would behave like a method)::
class X: names = ["Fred", "Barney", "Joe"] prefix = "> " prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics, this will evaluate the outermost iterable at class scope, which will succeed; but it will evaluate everything else in a function::
class X: names = ["Fred", "Barney", "Joe"] prefix = "> " def <listcomp>(iterator): result = [] for name in iterator: result.append(prefix + name) return result prefixed_names = <listcomp>(iter(names))
The name prefix
is thus searched for at global scope, ignoring the class
name. Under the proposed semantics, this name will be eagerly bound; and the
same early binding then handles the outermost iterable as well. The list
comprehension is thus approximately equivalent to::
class X: names = ["Fred", "Barney", "Joe"] prefix = "> " def <listcomp>(names=names, prefix=prefix): result = [] for name in names: result.append(prefix + name) return result prefixed_names = <listcomp>()
With list comprehensions, this is unlikely to cause any confusion. With
generator expressions, this has the potential to affect behaviour, as the
eager binding means that the name could be rebound between the creation of
the genexp and the first call to next()
. It is, however, more closely
aligned to normal expectations. The effect is ONLY seen with names that
are looked up from class scope; global names (eg range()
) will still
be late-bound as usual.
One consequence of this change is that certain bugs in genexps will not
be detected until the first call to next()
, where today they would be
caught upon creation of the generator.
A list comprehension can map and filter efficiently by capturing the condition::
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
Similarly, a subexpression can be reused within the main expression, by giving it a name on first use::
stuff = [[y := f(x), x/y] for x in range(5)]
# There are a number of less obvious ways to spell this in current
# versions of Python, such as:
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
# Extra 'for' loop - potentially could be optimized internally
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Using a mutable cache object (various forms possible)
c = {} stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
In all cases, the name is local to the comprehension; like iteration variables, it cannot leak out into the surrounding context.
Assignment expressions can be used to good effect in the header of
an if
or while
statement::
# Proposed syntax
while (command := input("> ")) != "quit": print("You entered:", command)
# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text): print("Found:", match.group(0))
# Reading socket data until an empty string is returned
while data := sock.read(): print("Received data:", data)
# Equivalent in current Python, not caring about function return value
while input("> ") != "quit": print("You entered a command.")
# To capture the return value in current Python demands a four-line
# loop header.
while True: command = input("> "); if command == "quit": break print("You entered:", command)
Particularly with the while
loop, this can remove the need to have an
infinite loop, an assignment, and a condition. It also creates a smooth
parallel between a loop which simply uses a function call as its condition,
and one which uses that as its condition but also uses the actual value.
Proposals broadly similar to this one have come up frequently on python-ideas. Below are a number of alternative syntaxes, some of them specific to comprehensions, which have been rejected in favour of the one given above.
Broadly the same semantics as the current proposal, but spelled differently.
EXPR as NAME
::
stuff = [[f(x) as y, x/y] for x in range(5)]
Since EXPR as NAME
already has meaning in except
and
with
statements (with different semantics), this would create unnecessary
confusion or require special-casing (eg to forbid assignment within the
headers of these statements).
EXPR -> NAME
::
stuff = [[f(x) -> y, x/y] for x in range(5)]
This syntax is inspired by languages such as R and Haskell, and some
programmable calculators. (Note that a left-facing arrow y <- f(x)
is
not possible in Python, as it would be interpreted as less-than and unary
minus.) This syntax has a slight advantage over 'as' in that it does not
conflict with with
and except
statements, but otherwise is
equivalent.
Adorning statement-local names with a leading dot::
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as" stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
This has the advantage that leaked usage can be readily detected, removing some forms of syntactic ambiguity. However, this would be the only place in Python where a variable's scope is encoded into its name, making refactoring harder.
Adding a where:
to any statement to create local name bindings::
value = x*2 + 2x where:
x = spam(1, 4, 7, q)
Execution order is inverted (the indented body is performed first, followed
by the "header"). This requires a new keyword, unless an existing keyword
is repurposed (most likely with:
). See PEP 3150 for prior discussion
on this subject (with the proposed keyword being given:
).
TARGET from EXPR
::
stuff = [[y from f(x), x/y] for x in range(5)]
This syntax has fewer conflicts than as
does (conflicting only with the
raise Exc from Exc
notation), but is otherwise comparable to it. Instead
of paralleling with expr as target:
(which can be useful but can also be
confusing), this has no parallels, but is evocative.
One of the most popular use-cases is if
and while
statements.
Instead
of a more general solution, this proposal enhances the syntax of these two
statements to add a means of capturing the compared value::
if re.search(pat, text) as match: print("Found:", match.group(0))
This works beautifully if and ONLY if the desired condition is based on the
truthiness of the captured value. It is thus effective for specific
use-cases (regex matches, socket reads that return ''
when done), and
completely useless in more complicated cases (eg where the condition is
f(x) < 0
and you want to capture the value of f(x)
). It also
has
no benefit to list comprehensions.
Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction
of possible use-cases, even in if
/while
statements.
Another common use-case is comprehensions (list/set/dict, and genexps). As above, proposals have been made for comprehension-specific solutions.
where
, let
, or given
::
stuff = [(y, x/y) where y = f(x) for x in range(5)] stuff = [(y, x/y) let y = f(x) for x in range(5)] stuff = [(y, x/y) given y = f(x) for x in range(5)]
This brings the subexpression to a location in between the 'for' loop and
the expression. It introduces an additional language keyword, which creates
conflicts. Of the three, where
reads the most cleanly, but also has the
greatest potential for conflict (eg SQLAlchemy and numpy have where
methods, as does tkinter.dnd.Icon
in the standard library).
with NAME = EXPR
::
stuff = [(y, x/y) with y = f(x) for x in range(5)]
As above, but reusing the with
keyword. Doesn't read too badly, and needs
no additional language keyword. Is restricted to comprehensions, though,
and cannot as easily be transformed into "longhand" for-loop syntax. Has
the C problem that an equals sign in an expression can now create a name
binding, rather than performing a comparison. Would raise the question of
why "with NAME = EXPR:" cannot be used as a statement on its own.
with EXPR as NAME
::
stuff = [(y, x/y) with f(x) as y for x in range(5)]
As per option 2, but using as
rather than an equals sign. Aligns
syntactically with other uses of as
for name binding, but a simple
transformation to for-loop longhand would create drastically different
semantics; the meaning of with
inside a comprehension would be
completely different from the meaning as a stand-alone statement, while
retaining identical syntax.
Regardless of the spelling chosen, this introduces a stark difference between
comprehensions and the equivalent unrolled long-hand form of the loop. It is
no longer possible to unwrap the loop into statement form without reworking
any name bindings. The only keyword that can be repurposed to this task is
with
, thus giving it sneakily different semantics in a comprehension than
in a statement; alternatively, a new keyword is needed, with all the costs
therein.
There are two logical precedences for the :=
operator. Either it should
bind as loosely as possible, as does statement-assignment; or it should bind
more tightly than comparison operators. Placing its precedence between the
comparison and arithmetic operators (to be precise: just lower than bitwise
OR) allows most uses inside while
and if
conditions to be
spelled
without parentheses, as it is most likely that you wish to capture the value
of something, then perform a comparison on it::
pos = -1 while pos := buffer.find(search_term, pos + 1) >= 0: ...
Once find() returns -1, the loop terminates. If :=
binds as loosely as
=
does, this would capture the result of the comparison (generally either
True
or False
), which is less useful.
While this behaviour would be convenient in many situations, it is also harder
to explain than "the := operator behaves just like the assignment statement",
and as such, the precedence for :=
has been made as close as possible to
that of =
.
The semantic changes to list/set/dict comprehensions, and more so to generator expressions, may potentially require migration of code. In many cases, the changes simply make legal what used to raise an exception, but there are some edge cases that were previously legal and now are not, and a few corner cases with altered semantics.
As of Python 3.7, the outermost iterable in a comprehension is special: it is
evaluated in the surrounding context, instead of inside the comprehension.
Thus it is permitted to contain a yield
expression, to use a name also
used elsewhere, and to reference names from class scope. Also, in a genexp,
the outermost iterable is pre-evaluated, but the rest of the code is not
touched until the genexp is first iterated over. Class scope is now handled
more generally (see above), but if other changes require the old behaviour,
the iterable must be explicitly elevated from the comprehension::
# Python 3.7
def f(x): return [x for x in x if x] def g(): return [x for x in [(yield 1)]]
# With PEP 572
def f(x): return [y for y in x if y] def g(): sent_item = (yield 1) return [x for x in [sent_item]]
This more clearly shows that it is g(), not the comprehension, which is able to yield values (and is thus a generator function). The entire comprehension is consistently in a single scope.
The following expressions would, in Python 3.7, raise exceptions immediately. With the removal of the outermost iterable's special casing, they are now equivalent to the most obvious longhand form::
gen = (x for x in rage(10)) # NameError gen = (x for x in 10) # TypeError (not iterable) gen = (x for x in range(1/0)) # ZeroDivisionError
def <genexp>(): for x in rage(10): yield x gen = <genexp>() # No exception yet tng = next(gen) # NameError
A list comprehension can use and update local names, and they will retain their values from one iteration to another. It would be convenient to use this feature to create rolling or self-effecting data streams::
progressive_sums = [total := total + value for value in data]
This will fail with UnboundLocalError due to total
not being initalized.
Simply initializing it outside of the comprehension is insufficient - unless
the comprehension is in class scope::
class X: total = 0 progressive_sums = [total := total + value for value in data]
At other scopes, it may be beneficial to have a way to fetch a value from the surrounding scope. Should this be automatic? Should it be controlled with a keyword? Hypothetically (and using no new keywords), this could be written::
total = 0 progressive_sums = [total := total + value import nonlocal total for value in data]
Translated into longhand, this would become::
total = 0 def <listcomp>(total=total): result = [] for value in data: result.append(total := total + value) return result progressive_sums = <listcomp>()
ie utilizing the same early-binding technique that is used at class scope.
C and its derivatives define the =
operator as an expression, rather than
a statement as is Python's way. This allows assignments in more contexts,
including contexts where comparisons are more common. The syntactic similarity
between if (x == y)
and if (x = y)
belies their drastically
different
semantics. Thus this proposal uses :=
to clarify the distinction.
So can anything else. This is a tool, and it is up to the programmer to use it where it makes sense, and not use it where superior constructs can be used.
The two forms have different flexibilities. The :=
operator can be used
inside a larger expression; the =
statement can be augmented to
+=
and
its friends. The assignment statement is a clear declaration of intent: this
value is to be assigned to this target, and that's it.
Previous revisions of this proposal involved sublocal scope (restricted to a
single statement), preventing name leakage and namespace pollution. While a
definite advantage in a number of situations, this increases complexity in
many others, and the costs are not justified by the benefits. In the interests
of language simplicity, the name bindings created here are exactly equivalent
to any other name bindings, including that usage at class or module scope will
create externally-visible names. This is no different from for
loops or
other constructs, and can be solved the same way: del
the name once it is
no longer needed, or prefix it with an underscore.
Names bound within a comprehension are local to that comprehension, even in the outermost iterable, and can thus be used freely without polluting the surrounding namespace.
(The author wishes to thank Guido van Rossum and Christoph Groth for their suggestions to move the proposal in this direction. [2]_)
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
The author wishes to thank Guido van Rossum and Nick Coghlan for their considerable contributions to this proposal, and to members of the core-mentorship mailing list for assistance with implementation.
.. [1] Proof of concept / reference implementation (https://github.com/Rosuav/cpython/tree/assignment-expressions) .. [2] Pivotal post regarding inline assignment semantics (https://mail.python.org/pipermail/python-ideas/2018-March/049409.html)
This document has been placed in the public domain.
.. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End:
Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/damalinov%40gmail.com
On Fri, Apr 20, 2018 at 2:45 PM, Dmitry Malinovsky damalinov@gmail.com wrote:
Hello Chris, and thank you for working on this PEP!
What do you think about using variable type hints with this syntax? I tried to search through python-dev and couldn't find a single post discussing that question. If I missed it somehow, could you please include its conclusions into the PEP?
I'm ignoring them for the sake of the PEP, because it'd complicate the grammar for little benefit. If someone wants to add an enhancement later, that's fine; but the proposal can stand without it, and with it, it'll make for even more noise in a line full of colons.
For instance, as I understand now the parser will fail on this snippet:
while data: bytes := stream.read():
print("Received data:", data)
Do brackets help?
while (data: bytes := stream.read()):
print("Received data:", data)
IIUC, in 3.7 It is invalid syntax to specify a type hint for a for loop item; should brackets help? Currently they don't:
Python 3.7.0b3+ (heads/3.7:7dcfd6c, Mar 30 2018, 21:30:34)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> for (x: int) in [1,2,3]:
File "<stdin>", line 1
for (x: int) in [1,2,3]:
^
SyntaxError: invalid syntax
And that's another good reason not to bother, at least for now. I'm not sure whether you can use a Py2-style type hint comment on a for loop, but if so, you should also be able to do it on a while loop or anything. Or, of course, you can just annotate the variable before the loop, if you want to.
ChrisA
Maybe annotations should get a brief mention in the Rejected Ideas section, with your explanation here added. (And maybe my response.)
On Thu, Apr 19, 2018 at 11:31 PM, Chris Angelico rosuav@gmail.com wrote:
On Fri, Apr 20, 2018 at 2:45 PM, Dmitry Malinovsky damalinov@gmail.com wrote:
Hello Chris, and thank you for working on this PEP!
What do you think about using variable type hints with this syntax? I tried to search through python-dev and couldn't find a single post discussing that question. If I missed it somehow, could you please include its conclusions into the PEP?
I'm ignoring them for the sake of the PEP, because it'd complicate the grammar for little benefit. If someone wants to add an enhancement later, that's fine; but the proposal can stand without it, and with it, it'll make for even more noise in a line full of colons.
For instance, as I understand now the parser will fail on this snippet:
while data: bytes := stream.read():
print("Received data:", data)
Do brackets help?
while (data: bytes := stream.read()):
print("Received data:", data)
IIUC, in 3.7 It is invalid syntax to specify a type hint for a for loop item; should brackets help? Currently they don't:
Python 3.7.0b3+ (heads/3.7:7dcfd6c, Mar 30 2018, 21:30:34)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more
information.
>>> for (x: int) in [1,2,3]:
File "<stdin>", line 1
for (x: int) in [1,2,3]:
^
SyntaxError: invalid syntax
And that's another good reason not to bother, at least for now. I'm not sure whether you can use a Py2-style type hint comment on a for loop, but if so, you should also be able to do it on a while loop or anything. Or, of course, you can just annotate the variable before the loop, if you want to.
ChrisA
Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ guido%40python.org
-- --Guido van Rossum (python.org/~guido)
On Sat, Apr 21, 2018 at 1:50 AM, Guido van Rossum guido@python.org wrote:
Maybe annotations should get a brief mention in the Rejected Ideas section, with your explanation here added. (And maybe my response.)
Good idea. Actually, since it's a syntactic difference, I've promoted it to the "Differences" section.
ChrisA
If you want type hints you can use a variable annotation without initialization before the statement:
data: bytes while data := stream.read(): print("Got", data)
On Thu, Apr 19, 2018 at 9:45 PM, Dmitry Malinovsky damalinov@gmail.com wrote:
Hello Chris, and thank you for working on this PEP!
What do you think about using variable type hints with this syntax? I tried to search through python-dev and couldn't find a single post discussing that question. If I missed it somehow, could you please include its conclusions into the PEP?
For instance, as I understand now the parser will fail on this snippet:
while data: bytes := stream.read():
print("Received data:", data)
Do brackets help?
while (data: bytes := stream.read()):
print("Received data:", data)
IIUC, in 3.7 It is invalid syntax to specify a type hint for a for loop item; should brackets help? Currently they don't:
Python 3.7.0b3+ (heads/3.7:7dcfd6c, Mar 30 2018, 21:30:34)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> for (x: int) in [1,2,3]:
File "<stdin>", line 1
for (x: int) in [1,2,3]:
^
SyntaxError: invalid syntax
Thanks,
On 20 Apr 2018, at 09:10, Chris Angelico rosuav@gmail.com wrote:
Working on the reference implementation for PEP 572 is turning out to be a massive time sink, both on my personal schedule and on the PEP's discussion. I can't just hold off all discussion on a topic until I figure out whether something is possible or not, because that could take me several days, even a week or more. And considering the massive backlash against the proposal, it seems like a poor use of my time to try to prove that something's impossible, find that I don't know enough about grammar editing to be able to say anything more than "well, I couldn't do it, but someone else might be able to", and then try to resume the discussion with no more certainty than we had before.
So here's the PEP again, simplified. I'm fairly sure it's just going to be another on a growing list of rejected PEPs to my name, and I'm done with trying to argue some of these points. Either the rules get simplified, or they don't. Trying to simplify the rules and maintain perfect backward compatibility is just making the rules even more complicated.
PEP 572, if accepted, will change the behaviour of certain constructs inside comprehensions, mainly due to interactions with class scope that make no sense unless you know how they're implemented internally. The official tutorial pretends that comprehensions are "equivalent to" longhand:
https://docs.python.org/3/tutorial/datastructures.html? highlight=equivalent#list-comprehensions https://docs.python.org/3/howto/functional.html?highlight=equivalent# generator-expressions-and-list-comprehensions
and this is an inaccuracy for the sake of simplicity. PEP 572 will make this far more accurate; the only difference is that the comprehension is inside a function. Current semantics are far more bizarre than that.
Do you want absolutely 100% backward compatibility? Then reject this PEP. Or better still, keep using Python 3.7, and don't upgrade to 3.8, in case something breaks. Do you want list comprehensions that make better sense? Then accept that some code will need to change, if it tried to use the same name in multiple scopes, or tried to use ancient Python 2 semantics with a yield expression in the outermost iterable.
I'm pretty much ready for pronouncement.
https://www.python.org/dev/peps/pep-0572/
ChrisA
PEP: 572 Title: Assignment Expressions Author: Chris Angelico rosuav@gmail.com Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 28-Feb-2018 Python-Version: 3.8 Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018
This is a proposal for creating a way to assign to variables within an expression. Additionally, the precise scope of comprehensions is adjusted, to maintain consistency and follow expectations.
Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.
In any context where arbitrary Python expressions can be used, a named
expression can appear. This is of the form target := expr
where
expr
is any valid Python expression, and target
is any valid
assignment target.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value::
# Handle a matched regex
if (match := pattern.search(data)) is not None: ...
# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None: ...
# Share a subexpression between a comprehension filter clause and its
output filtered_data = [y for x in data if (y := f(x)) is not None]
Most importantly, since :=
is an expression, it can be used in
contexts
where statements are illegal, including lambda functions and
comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
x = y = z = 0
The equivalent assignment expression is parsed as separate binary operators, and is therefore processed right-to-left, as if it were spelled thus::
assert 0 == (x := (y := (z := 0)))
Augmented assignment is not supported in expression form::
x +:= 1 File "<stdin>", line 1 x +:= 1 ^ SyntaxError: invalid syntax
Otherwise, the semantics of assignment are identical in statement and expression forms.
The current behaviour of list/set/dict comprehensions and generator expressions has some edge cases that would behave strangely if an assignment expression were to be used. Therefore the proposed semantics are changed, removing the current edge cases, and instead altering their behaviour only in a class scope.
As of Python 3.7, the outermost iterable of any comprehension is evaluated in the surrounding context, and then passed as an argument to the implicit function that evaluates the comprehension.
Under this proposal, the entire body of the comprehension is evaluated in its implicit function. Names not assigned to within the comprehension are located in the surrounding scopes, as with normal lookups. As one special case, a comprehension at class scope will eagerly bind any name which is already defined in the class scope.
A list comprehension can be unrolled into an equivalent function. With Python 3.7 semantics::
numbers = [x + y for x in range(3) for y in range(4)]
# Is approximately equivalent to
def <listcomp>(iterator): result = [] for x in iterator: for y in range(4): result.append(x + y) return result numbers = <listcomp>(iter(range(3)))
Under the new semantics, this would instead be equivalent to::
def <listcomp>(): result = [] for x in range(3): for y in range(4): result.append(x + y) return result numbers = <listcomp>()
When a class scope is involved, a naive transformation into a function would prevent name lookups (as the function would behave like a method)::
class X: names = ["Fred", "Barney", "Joe"] prefix = "> " prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics, this will evaluate the outermost iterable at class scope, which will succeed; but it will evaluate everything else in a function::
class X: names = ["Fred", "Barney", "Joe"] prefix = "> " def <listcomp>(iterator): result = [] for name in iterator: result.append(prefix + name) return result prefixed_names = <listcomp>(iter(names))
The name prefix
is thus searched for at global scope, ignoring the
class
name. Under the proposed semantics, this name will be eagerly bound; and
the
same early binding then handles the outermost iterable as well. The list
comprehension is thus approximately equivalent to::
class X: names = ["Fred", "Barney", "Joe"] prefix = "> " def <listcomp>(names=names, prefix=prefix): result = [] for name in names: result.append(prefix + name) return result prefixed_names = <listcomp>()
With list comprehensions, this is unlikely to cause any confusion. With
generator expressions, this has the potential to affect behaviour, as the
eager binding means that the name could be rebound between the creation
of
the genexp and the first call to next()
. It is, however, more closely
aligned to normal expectations. The effect is ONLY seen with names that
are looked up from class scope; global names (eg range()
) will still
be late-bound as usual.
One consequence of this change is that certain bugs in genexps will not
be detected until the first call to next()
, where today they would be
caught upon creation of the generator.
A list comprehension can map and filter efficiently by capturing the condition::
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
Similarly, a subexpression can be reused within the main expression, by giving it a name on first use::
stuff = [[y := f(x), x/y] for x in range(5)]
# There are a number of less obvious ways to spell this in current
# versions of Python, such as:
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
# Extra 'for' loop - potentially could be optimized internally
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Using a mutable cache object (various forms possible)
c = {} stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
In all cases, the name is local to the comprehension; like iteration variables, it cannot leak out into the surrounding context.
Assignment expressions can be used to good effect in the header of
an if
or while
statement::
# Proposed syntax
while (command := input("> ")) != "quit": print("You entered:", command)
# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text): print("Found:", match.group(0))
# Reading socket data until an empty string is returned
while data := sock.read(): print("Received data:", data)
# Equivalent in current Python, not caring about function return value
while input("> ") != "quit": print("You entered a command.")
# To capture the return value in current Python demands a four-line
# loop header.
while True: command = input("> "); if command == "quit": break print("You entered:", command)
Particularly with the while
loop, this can remove the need to have an
infinite loop, an assignment, and a condition. It also creates a smooth
parallel between a loop which simply uses a function call as its
condition,
and one which uses that as its condition but also uses the actual value.
Proposals broadly similar to this one have come up frequently on python-ideas. Below are a number of alternative syntaxes, some of them specific to comprehensions, which have been rejected in favour of the one given above.
Broadly the same semantics as the current proposal, but spelled differently.
EXPR as NAME
::
stuff = [[f(x) as y, x/y] for x in range(5)]
Since EXPR as NAME
already has meaning in except
and
with
statements (with different semantics), this would create unnecessary
confusion or require special-casing (eg to forbid assignment within the
headers of these statements).
EXPR -> NAME
::
stuff = [[f(x) -> y, x/y] for x in range(5)]
This syntax is inspired by languages such as R and Haskell, and some
programmable calculators. (Note that a left-facing arrow y <- f(x)
is
not possible in Python, as it would be interpreted as less-than and
unary
minus.) This syntax has a slight advantage over 'as' in that it does
not
conflict with with
and except
statements, but otherwise is
equivalent.
Adorning statement-local names with a leading dot::
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as" stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
This has the advantage that leaked usage can be readily detected, removing some forms of syntactic ambiguity. However, this would be the only place in Python where a variable's scope is encoded into its name, making refactoring harder.
Adding a where:
to any statement to create local name bindings::
value = x*2 + 2x where:
x = spam(1, 4, 7, q)
Execution order is inverted (the indented body is performed first,
followed
by the "header"). This requires a new keyword, unless an existing
keyword
is repurposed (most likely with:
). See PEP 3150 for prior
discussion
on this subject (with the proposed keyword being given:
).
TARGET from EXPR
::
stuff = [[y from f(x), x/y] for x in range(5)]
This syntax has fewer conflicts than as
does (conflicting only
with the
raise Exc from Exc
notation), but is otherwise comparable to it.
Instead
of paralleling with expr as target:
(which can be useful but can
also be
confusing), this has no parallels, but is evocative.
One of the most popular use-cases is if
and while
statements.
Instead
of a more general solution, this proposal enhances the syntax of these
two
statements to add a means of capturing the compared value::
if re.search(pat, text) as match: print("Found:", match.group(0))
This works beautifully if and ONLY if the desired condition is based on
the
truthiness of the captured value. It is thus effective for specific
use-cases (regex matches, socket reads that return ''
when done), and
completely useless in more complicated cases (eg where the condition is
f(x) < 0
and you want to capture the value of f(x)
). It also
has
no benefit to list comprehensions.
Advantages: No syntactic ambiguities. Disadvantages: Answers only a
fraction
of possible use-cases, even in if
/while
statements.
Another common use-case is comprehensions (list/set/dict, and genexps). As above, proposals have been made for comprehension-specific solutions.
where
, let
, or given
::
stuff = [(y, x/y) where y = f(x) for x in range(5)] stuff = [(y, x/y) let y = f(x) for x in range(5)] stuff = [(y, x/y) given y = f(x) for x in range(5)]
This brings the subexpression to a location in between the 'for' loop
and
the expression. It introduces an additional language keyword, which
creates
conflicts. Of the three, where
reads the most cleanly, but also
has the
greatest potential for conflict (eg SQLAlchemy and numpy have where
methods, as does tkinter.dnd.Icon
in the standard library).
with NAME = EXPR
::
stuff = [(y, x/y) with y = f(x) for x in range(5)]
As above, but reusing the with
keyword. Doesn't read too badly, and
needs
no additional language keyword. Is restricted to comprehensions,
though,
and cannot as easily be transformed into "longhand" for-loop syntax.
Has
the C problem that an equals sign in an expression can now create a
name
binding, rather than performing a comparison. Would raise the question
of
why "with NAME = EXPR:" cannot be used as a statement on its own.
with EXPR as NAME
::
stuff = [(y, x/y) with f(x) as y for x in range(5)]
As per option 2, but using as
rather than an equals sign. Aligns
syntactically with other uses of as
for name binding, but a simple
transformation to for-loop longhand would create drastically different
semantics; the meaning of with
inside a comprehension would be
completely different from the meaning as a stand-alone statement, while
retaining identical syntax.
Regardless of the spelling chosen, this introduces a stark difference
between
comprehensions and the equivalent unrolled long-hand form of the loop.
It is
no longer possible to unwrap the loop into statement form without
reworking
any name bindings. The only keyword that can be repurposed to this task
is
with
, thus giving it sneakily different semantics in a comprehension
than
in a statement; alternatively, a new keyword is needed, with all the
costs
therein.
There are two logical precedences for the :=
operator. Either it
should
bind as loosely as possible, as does statement-assignment; or it should
bind
more tightly than comparison operators. Placing its precedence between
the
comparison and arithmetic operators (to be precise: just lower than
bitwise
OR) allows most uses inside while
and if
conditions to be
spelled
without parentheses, as it is most likely that you wish to capture the
value
of something, then perform a comparison on it::
pos = -1 while pos := buffer.find(search_term, pos + 1) >= 0: ...
Once find() returns -1, the loop terminates. If :=
binds as loosely
as
=
does, this would capture the result of the comparison (generally
either
True
or False
), which is less useful.
While this behaviour would be convenient in many situations, it is also
harder
to explain than "the := operator behaves just like the assignment
statement",
and as such, the precedence for :=
has been made as close as
possible to
that of =
.
The semantic changes to list/set/dict comprehensions, and more so to generator expressions, may potentially require migration of code. In many cases, the changes simply make legal what used to raise an exception, but there are some edge cases that were previously legal and now are not, and a few corner cases with altered semantics.
As of Python 3.7, the outermost iterable in a comprehension is special:
it is
evaluated in the surrounding context, instead of inside the
comprehension.
Thus it is permitted to contain a yield
expression, to use a name
also
used elsewhere, and to reference names from class scope. Also, in a
genexp,
the outermost iterable is pre-evaluated, but the rest of the code is not
touched until the genexp is first iterated over. Class scope is now
handled
more generally (see above), but if other changes require the old
behaviour,
the iterable must be explicitly elevated from the comprehension::
# Python 3.7
def f(x): return [x for x in x if x] def g(): return [x for x in [(yield 1)]]
# With PEP 572
def f(x): return [y for y in x if y] def g(): sent_item = (yield 1) return [x for x in [sent_item]]
This more clearly shows that it is g(), not the comprehension, which is able to yield values (and is thus a generator function). The entire comprehension is consistently in a single scope.
The following expressions would, in Python 3.7, raise exceptions immediately. With the removal of the outermost iterable's special casing, they are now equivalent to the most obvious longhand form::
gen = (x for x in rage(10)) # NameError gen = (x for x in 10) # TypeError (not iterable) gen = (x for x in range(1/0)) # ZeroDivisionError
def <genexp>(): for x in rage(10): yield x gen = <genexp>() # No exception yet tng = next(gen) # NameError
A list comprehension can use and update local names, and they will retain their values from one iteration to another. It would be convenient to use this feature to create rolling or self-effecting data streams::
progressive_sums = [total := total + value for value in data]
This will fail with UnboundLocalError due to total
not being
initalized.
Simply initializing it outside of the comprehension is insufficient -
unless
the comprehension is in class scope::
class X: total = 0 progressive_sums = [total := total + value for value in data]
At other scopes, it may be beneficial to have a way to fetch a value from the surrounding scope. Should this be automatic? Should it be controlled with a keyword? Hypothetically (and using no new keywords), this could be written::
total = 0 progressive_sums = [total := total + value import nonlocal total for value in data]
Translated into longhand, this would become::
total = 0 def <listcomp>(total=total): result = [] for value in data: result.append(total := total + value) return result progressive_sums = <listcomp>()
ie utilizing the same early-binding technique that is used at class scope.
C and its derivatives define the =
operator as an expression, rather
than
a statement as is Python's way. This allows assignments in more
contexts,
including contexts where comparisons are more common. The syntactic
similarity
between if (x == y)
and if (x = y)
belies their drastically
different
semantics. Thus this proposal uses :=
to clarify the distinction.
So can anything else. This is a tool, and it is up to the programmer to use it where it makes sense, and not use it where superior constructs can be used.
The two forms have different flexibilities. The :=
operator can be
used
inside a larger expression; the =
statement can be augmented to
+=
and
its friends. The assignment statement is a clear declaration of intent:
this
value is to be assigned to this target, and that's it.
Previous revisions of this proposal involved sublocal scope (restricted
to a
single statement), preventing name leakage and namespace pollution.
While a
definite advantage in a number of situations, this increases complexity
in
many others, and the costs are not justified by the benefits. In the
interests
of language simplicity, the name bindings created here are exactly
equivalent
to any other name bindings, including that usage at class or module
scope will
create externally-visible names. This is no different from for
loops or
other constructs, and can be solved the same way: del
the name once
it is
no longer needed, or prefix it with an underscore.
Names bound within a comprehension are local to that comprehension, even in the outermost iterable, and can thus be used freely without polluting the surrounding namespace.
(The author wishes to thank Guido van Rossum and Christoph Groth for their suggestions to move the proposal in this direction. [2]_)
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
The author wishes to thank Guido van Rossum and Nick Coghlan for their considerable contributions to this proposal, and to members of the core-mentorship mailing list for assistance with implementation.
.. [1] Proof of concept / reference implementation (https://github.com/Rosuav/cpython/tree/assignment-expressions) .. [2] Pivotal post regarding inline assignment semantics (https://mail.python.org/pipermail/python-ideas/2018-March/049409.html )
This document has been placed in the public domain.
.. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End:
Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ damalinov%40gmail.com
Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ guido%40python.org
-- --Guido van Rossum (python.org/~guido)