Having survived four rounds in the boxing ring at python-ideas, PEP 572 is now ready to enter the arena of python-dev. I'll let the proposal speak for itself. Be aware that the reference implementation currently has a few test failures, which I'm still working on, but to my knowledge nothing will prevent the proposal itself from being successfully implemented.
For those who have seen the most recent iteration on -ideas, the only actual change to the core proposal is that chaining is fully supported now.
Formatted version: https://www.python.org/dev/peps/pep-0572/
ChrisA
PEP: 572
Title: Assignment Expressions
Author: Chris Angelico rosuav@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Feb-2018
Python-Version: 3.8
Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018
This is a proposal for creating a way to assign to names within an expression. Additionally, the precise scope of comprehensions is adjusted, to maintain consistency and follow expectations.
Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.
In any context where arbitrary Python expressions can be used, a named expression can appear. This is of the form target := expr, where expr is any valid Python expression, and target is any valid assignment target.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value::
# Handle a matched regex
if (match := pattern.search(data)) is not None:
    ...

# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None:
    ...

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
Most importantly, since := is an expression, it can be used in contexts where statements are illegal, including lambda functions and comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
x = y = z = 0
The equivalent assignment expression is parsed as separate binary operators, and is therefore processed right-to-left, as if it were spelled thus::
assert 0 == (x := (y := (z := 0)))
Augmented assignment is not supported in expression form::
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
Otherwise, the semantics of assignment are identical in statement and expression forms.
The current behaviour of list/set/dict comprehensions and generator expressions has some edge cases that would behave strangely if an assignment expression were to be used. Therefore the proposed semantics are changed, removing the current edge cases, and instead altering their behaviour only in a class scope.
As of Python 3.7, the outermost iterable of any comprehension is evaluated in the surrounding context, and then passed as an argument to the implicit function that evaluates the comprehension.
Under this proposal, the entire body of the comprehension is evaluated in its implicit function. Names not assigned to within the comprehension are located in the surrounding scopes, as with normal lookups. As one special case, a comprehension at class scope will eagerly bind any name which is already defined in the class scope.
A list comprehension can be unrolled into an equivalent function. With Python 3.7 semantics::
numbers = [x + y for x in range(3) for y in range(4)]
# Is approximately equivalent to
def <listcomp>(iterator):
    result = []
    for x in iterator:
        for y in range(4):
            result.append(x + y)
    return result

numbers = <listcomp>(iter(range(3)))
Under the new semantics, this would instead be equivalent to::
def <listcomp>():
    result = []
    for x in range(3):
        for y in range(4):
            result.append(x + y)
    return result

numbers = <listcomp>()
When a class scope is involved, a naive transformation into a function would prevent name lookups (as the function would behave like a method)::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics, this will evaluate the outermost iterable at class scope, which will succeed; but it will evaluate everything else in a function::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    def <listcomp>(iterator):
        result = []
        for name in iterator:
            result.append(prefix + name)
        return result
    prefixed_names = <listcomp>(iter(names))
The name prefix is thus searched for at global scope, ignoring the class name. Under the proposed semantics, this name will be eagerly bound; and the same early binding then handles the outermost iterable as well. The list comprehension is thus approximately equivalent to::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    def <listcomp>(names=names, prefix=prefix):
        result = []
        for name in names:
            result.append(prefix + name)
        return result
    prefixed_names = <listcomp>()
With list comprehensions, this is unlikely to cause any confusion. With generator expressions, this has the potential to affect behaviour, as the eager binding means that the name could be rebound between the creation of the genexp and the first call to next(). It is, however, more closely aligned to normal expectations. The effect is ONLY seen with names that are looked up from class scope; global names (eg range()) will still be late-bound as usual.
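As an illustrative sketch of that eager binding, in the same unrolled style used above (the class body and names here are invented for the example)::

class X:
    prefix = "> "
    names = ["Fred", "Barney", "Joe"]
    labelled = (prefix + name for name in names)

# Approximately equivalent, under this proposal, to:
class X:
    prefix = "> "
    names = ["Fred", "Barney", "Joe"]
    def <genexp>(prefix=prefix, names=names):  # class-scope names bound eagerly
        for name in names:
            yield prefix + name
    labelled = <genexp>()
# Rebinding X.prefix after the genexp is created, but before next() is first
# called, therefore has no effect on its output; a global such as range()
# would still be looked up late, as usual.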
One consequence of this change is that certain bugs in genexps will not be detected until the first call to next(), where today they would be caught upon creation of the generator. See 'open questions' below.
These list comprehensions are all approximately equivalent::
stuff = [[y := f(x), x/y] for x in range(5)]
# There are a number of less obvious ways to spell this in current
# versions of Python.
# External helper function
def pair(x, value): return [value, x/value]
stuff = [pair(x, f(x)) for x in range(5)]
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
# Extra 'for' loop - potentially could be optimized internally
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Iterating over a genexp
stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
# Expanding the comprehension into a loop
stuff = []
for x in range(5):
    y = f(x)
    stuff.append([y, x/y])

# Wrapping the loop in a generator function
def g():
    for x in range(5):
        y = f(x)
        yield [y, x/y]
stuff = list(g())
# Using a mutable cache object (various forms possible)
c = {}
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
If calling f(x) is expensive or has side effects, the clean operation of the list comprehension gets muddled. Using a short-duration name binding retains the simplicity; while the extra for loop does achieve this, it does so at the cost of dividing the expression visually, putting the named part at the end of the comprehension instead of the beginning.
Similarly, a list comprehension can map and filter efficiently by capturing the condition::
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
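For comparison, a sketch of the statement-form equivalent (using the same hypothetical f and input_data)::

results = []
for x in input_data:
    y = f(x)
    if y > 0:
        results.append((x, y, x/y))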
Assignment expressions can be used to good effect in the header of an if or while statement::
# Proposed syntax
while (command := input("> ")) != "quit":
    print("You entered:", command)

# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text):
    print("Found:", match.group(0))

# Reading socket data until an empty string is returned
while data := sock.read():
    print("Received data:", data)

# Equivalent in current Python, not caring about function return value
while input("> ") != "quit":
    print("You entered a command.")

# To capture the return value in current Python demands a four-line
# loop header.
while True:
    command = input("> ")
    if command == "quit":
        break
    print("You entered:", command)
Particularly with the while loop, this can remove the need to have an infinite loop, an assignment, and a condition. It also creates a smooth parallel between a loop which simply uses a function call as its condition, and one which uses that as its condition but also uses the actual value.
Proposals broadly similar to this one have come up frequently on python-ideas. Below are a number of alternative syntaxes, some of them specific to comprehensions, which have been rejected in favour of the one given above.
Broadly the same semantics as the current proposal, but spelled differently.
EXPR as NAME, with or without parentheses::
stuff = [[f(x) as y, x/y] for x in range(5)]
Omitting the parentheses in this form of the proposal introduces many syntactic ambiguities. Requiring them in all contexts leaves open the option to make them optional in specific situations where the syntax is unambiguous (cf generator expressions as sole parameters in function calls), but there is no plausible way to make them optional everywhere.
With the parentheses, this becomes a viable option, with its own tradeoffs in syntactic ambiguity. Since EXPR as NAME already has meaning in except and with statements (with different semantics), this would create unnecessary confusion or require special-casing (most notably of with and except statements, where a nearly-identical syntax has different semantics).
EXPR -> NAME::
stuff = [[f(x) -> y, x/y] for x in range(5)]
This syntax is inspired by languages such as R and Haskell, and some programmable calculators. (Note that a left-facing arrow y <- f(x) is not possible in Python, as it would be interpreted as less-than and unary minus.) This syntax has a slight advantage over 'as' in that it does not conflict with with and except statements, but otherwise is equivalent.
Adorning statement-local names with a leading dot::
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
This has the advantage that leaked usage can be readily detected, removing some forms of syntactic ambiguity. However, this would be the only place in Python where a variable's scope is encoded into its name, making refactoring harder. This syntax is quite viable, and could be promoted to become the current recommendation if its advantages are found to outweigh its cost.
Adding a where: to any statement to create local name bindings::
value = x**2 + 2*x where:
    x = spam(1, 4, 7, q)
Execution order is inverted (the indented body is performed first, followed by the "header"). This requires a new keyword, unless an existing keyword is repurposed (most likely with:). See PEP 3150 for prior discussion on this subject (with the proposed keyword being given:).
TARGET from EXPR::
stuff = [[y from f(x), x/y] for x in range(5)]
This syntax has fewer conflicts than as does (conflicting only with the raise Exc from Exc notation), but is otherwise comparable to it. Instead of paralleling with expr as target: (which can be useful but can also be confusing), this has no parallels, but is evocative.
One of the most popular use-cases is if and while statements. Instead of a more general solution, this proposal enhances the syntax of these two statements to add a means of capturing the compared value::
if re.search(pat, text) as match:
    print("Found:", match.group(0))
This works beautifully if and ONLY if the desired condition is based on the truthiness of the captured value. It is thus effective for specific use-cases (regex matches, socket reads that return '' when done), and completely useless in more complicated cases (eg where the condition is f(x) < 0 and you want to capture the value of f(x)). It also has no benefit to list comprehensions.
Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction of possible use-cases, even in if/while statements.
Another common use-case is comprehensions (list/set/dict, and genexps). As above, proposals have been made for comprehension-specific solutions.
where, let, or given::
stuff = [(y, x/y) where y = f(x) for x in range(5)]
stuff = [(y, x/y) let y = f(x) for x in range(5)]
stuff = [(y, x/y) given y = f(x) for x in range(5)]
This brings the subexpression to a location in between the 'for' loop and the expression. It introduces an additional language keyword, which creates conflicts. Of the three, where reads the most cleanly, but also has the greatest potential for conflict (eg SQLAlchemy and numpy have where methods, as does tkinter.dnd.Icon in the standard library).
with NAME = EXPR::
stuff = [(y, x/y) with y = f(x) for x in range(5)]
As above, but reusing the with keyword. Doesn't read too badly, and needs no additional language keyword. Is restricted to comprehensions, though, and cannot as easily be transformed into "longhand" for-loop syntax. Has the C problem that an equals sign in an expression can now create a name binding, rather than performing a comparison. Would raise the question of why "with NAME = EXPR:" cannot be used as a statement on its own.
with EXPR as NAME::
stuff = [(y, x/y) with f(x) as y for x in range(5)]
As per option 2, but using as rather than an equals sign. Aligns syntactically with other uses of as for name binding, but a simple transformation to for-loop longhand would create drastically different semantics; the meaning of with inside a comprehension would be completely different from the meaning as a stand-alone statement, while retaining identical syntax.
Regardless of the spelling chosen, this introduces a stark difference between comprehensions and the equivalent unrolled long-hand form of the loop. It is no longer possible to unwrap the loop into statement form without reworking any name bindings. The only keyword that can be repurposed to this task is with, thus giving it sneakily different semantics in a comprehension than in a statement; alternatively, a new keyword is needed, with all the costs therein.
There are two logical precedences for the := operator. Either it should bind as loosely as possible, as does statement-assignment; or it should bind more tightly than comparison operators. Placing its precedence between the comparison and arithmetic operators (to be precise: just lower than bitwise OR) allows most uses inside while and if conditions to be spelled without parentheses, as it is most likely that you wish to capture the value of something, then perform a comparison on it::
pos = -1
while pos := buffer.find(search_term, pos + 1) >= 0:
    ...
Once find() returns -1, the loop terminates. If := binds as loosely as = does, this would capture the result of the comparison (generally either True or False), which is less useful.
While this behaviour would be convenient in many situations, it is also harder to explain than "the := operator behaves just like the assignment statement", and as such, the precedence for := has been made as close as possible to that of =.
The semantic changes to list/set/dict comprehensions, and more so to generator expressions, may potentially require migration of code. In many cases, the changes simply make legal what used to raise an exception, but there are some edge cases that were previously legal and now are not, and a few corner cases with altered semantics.
As of Python 3.7, the outermost iterable in a comprehension is permitted to contain a 'yield' expression. If this is required, the iterable (or at least the yield) must be explicitly elevated from the comprehension::
# Python 3.7
def g():
    return [x for x in [(yield 1)]]

# With PEP 572
def g():
    sent_item = (yield 1)
    return [x for x in [sent_item]]
This more clearly shows that it is g(), not the comprehension, which is able to yield values (and is thus a generator function). The entire comprehension is consistently in a single scope.
If the same name is used in the outermost iterable and also as an iteration variable, this will now raise UnboundLocalError when previously it referred to the name in the surrounding scope. Example::
# Lib/typing.py
tvars = []
for t in types:
    if isinstance(t, TypeVar) and t not in tvars:
        tvars.append(t)
    if isinstance(t, _GenericAlias) and not t._special:
        tvars.extend([ty for ty in t.__parameters__ if ty not in tvars])
If the list comprehension uses the name t rather than ty, this will work in Python 3.7 but not with this proposal. As with other unwanted name shadowing, the solution is to use distinct names.
A comprehension inside a class previously was able to 'see' class members ONLY from the outermost iterable. Other name lookups would ignore the class and potentially locate a name at an outer scope::
pattern = "<%d>"
class X:
    pattern = "[%d]"
    numbers = [pattern % n for n in range(5)]
In Python 3.7, X.numbers would show angle brackets; with PEP 572, it would show square brackets. Maintaining the current behaviour here is best done by using distinct names for the different forms of pattern, as would be the case with functions.
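One possible sketch of that renaming (the new attribute name is invented for the example); with no shadowing, both Python 3.7 and this proposal would show angle brackets::

pattern = "<%d>"

class X:
    item_pattern = "[%d]"   # renamed so it no longer shadows the module-level name
    numbers = [pattern % n for n in range(5)]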
Certain types of bugs in genexps were previously caught more quickly. Some are now detected only at first iteration::
gen = (x for x in rage(10)) # NameError
gen = (x for x in 10) # TypeError (not iterable)
gen = (x for x in range(1/0)) # Exception raised during evaluation
This brings such generator expressions in line with a simple translation to function form::
def <genexp>():
    for x in rage(10):
        yield x

gen = <genexp>()  # No exception yet
tng = next(gen)   # NameError
Detecting these errors more quickly is nontrivial. It is, however, the exact same problem as generator functions currently suffer from, and this proposal brings the genexp in line with the most natural longhand form.
As of Python 3.7, the outermost iterable in a genexp is evaluated early, and the result passed to the implicit function as an argument. With PEP 572, this would no longer be the case. Can we still, somehow, evaluate it before moving on? One possible implementation would be::
gen = (x for x in rage(10))
# translates to
def <genexp>():
    iterable = iter(rage(10))
    yield None
    for x in iterable:
        yield x
gen = <genexp>()
next(gen)
This would pump the iterable up to just before the loop starts, evaluating exactly as much as is evaluated outside the generator function in Py3.7. This would result in it being possible to call gen.send() immediately, unlike with most generators, and may incur unnecessary overhead in the common case where the iterable is pumped immediately (perhaps as part of a larger expression).
A list comprehension can use and update local names, and they will retain their values from one iteration to another. It would be convenient to use this feature to create rolling or self-effecting data streams::
progressive_sums = [total := total + value for value in data]
This will fail with UnboundLocalError due to total not being initialized. Simply initializing it outside of the comprehension is insufficient - unless the comprehension is in class scope::
class X:
    total = 0
    progressive_sums = [total := total + value for value in data]
At other scopes, it may be beneficial to have a way to fetch a value from the surrounding scope. Should this be automatic? Should it be controlled with a keyword? Hypothetically (and using no new keywords), this could be written::
total = 0
progressive_sums = [total := total + value
                    import nonlocal total
                    for value in data]
Translated into longhand, this would become::
total = 0
def <listcomp>(total=total):
    result = []
    for value in data:
        result.append(total := total + value)
    return result
progressive_sums = <listcomp>()
ie utilizing the same early-binding technique that is used at class scope.
C and its derivatives define the = operator as an expression, rather than a statement as is Python's way. This allows assignments in more contexts, including contexts where comparisons are more common. The syntactic similarity between if (x == y) and if (x = y) belies their drastically different semantics. Thus this proposal uses := to clarify the distinction.
So can anything else. This is a tool, and it is up to the programmer to use it where it makes sense, and not use it where superior constructs can be used.
The two forms have different flexibilities. The := operator can be used inside a larger expression; the = statement can be augmented to += and its friends. The assignment statement is a clear declaration of intent: this value is to be assigned to this target, and that's it.
Previous revisions of this proposal involved sublocal scope (restricted to a single statement), preventing name leakage and namespace pollution. While a definite advantage in a number of situations, this increases complexity in many others, and the costs are not justified by the benefits. In the interests of language simplicity, the name bindings created here are exactly equivalent to any other name bindings, including that usage at class or module scope will create externally-visible names. This is no different from for loops or other constructs, and can be solved the same way: del the name once it is no longer needed, or prefix it with an underscore.
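For example, a minimal sketch at module scope (handle() is a stand-in for whatever consumes the match)::

if (match := pattern.search(data)) is not None:
    handle(match)
del match  # drop the binding once it is no longer needed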
Names bound within a comprehension are local to that comprehension, even in the outermost iterable, and can thus be used freely without polluting the surrounding namespace.
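A short sketch of that rule, under this draft's comprehension scoping (f and data are placeholders)::

filtered = [y for x in data if (y := f(x)) is not None]
# y is local to the comprehension's implicit function, so it is not
# defined here and the surrounding namespace is untouched.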
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
Chaining multiple assignment expressions should generally be avoided. More than one assignment per expression can detract from readability.
The author wishes to thank Guido van Rossum and Nick Coghlan for their considerable contributions to this proposal, and to members of the core-mentorship mailing list for assistance with implementation.
.. [1] Proof of concept / reference implementation (https://github.com/Rosuav/cpython/tree/assignment-expressions)
This document has been placed in the public domain.
On Tue, Apr 17, 2018 at 12:46 AM, Chris Angelico rosuav@gmail.com wrote: >
Having survived four rounds in the boxing ring at python-ideas, PEP 572 is now ready to enter the arena of python-dev. I'll let the proposal speak for itself. Be aware that the reference implementation currently has a few test failures, which I'm still working on, but to my knowledge nothing will prevent the proposal itself from being successfully implemented.
Very interesting / exciting, thanks!
Augmented assignment is not supported in expression form::
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
Can you include in the PEP a brief rationale for not accepting this form? In particular, is the intent never to support it, or is the intent to expressly allow adding it at a later date (e.g. after getting experience with the simpler form, etc)?
--Chris
On Tue, Apr 17, 2018 at 6:26 PM, Chris Jerdonek chris.jerdonek@gmail.com wrote:
Can you include in the PEP a brief rationale for not accepting this form? In particular, is the intent never to support it, or is the intent to expressly allow adding it at a later date (e.g. after getting experience with the simpler form, etc)?
Sure. There are a few reasons, and maybe the best place to explain them is in the rejected-alternatives section.
Augmented assignment is currently all of these:
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' | '<<=' | '>>=' | '**=' | '//=')
I'm actually not sure whether the augmented-assignment-expression operators should be "+:=" or ":+=", but either way, it'd be another thirteen tokens, some of which would be four character tokens. That's an awful lot of extra tokens. There's a lot less value in chaining on top of a "+=" than a simple assignment, and it doesn't seem right to have tons of code to support that. (Consider how much extra code there is to support async functions, and then imagine that you have a similarly large amount of effort for something that's of far lesser significance.)
Another reason is that it can be confusing (to a human, not to the parser) that it would be the result of the augmented assignment, not the original expression, that would carry through. With regular assignment (whether it's to a simple name or to a subscript/attribute), removing the "target :=" part will leave you with the same value - the value of "x := 1" is 1. With augmented, the only logical way to do it would be to re-capture the left operand. Consider:
x = 5
print(x +:= 2)
Logically, this has to set x to 7, then print 7. But in more complicated situations, it can get confusing more easily than direct assignment does. Consider, for instance, augmented addition on a list, or similar.
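A sketch of that point using today's statement forms (the "+:=" spelling itself is hypothetical and not being proposed):

x = 5
x += 2         # the value that would have to carry through is 7, not the 2 on the right
lst = [1, 2]
lst += [3, 4]  # in-place extension: the carried-through value would be the whole
               # mutated list, not the [3, 4] written on the right-hand side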
It could potentially be added later, but I'm not working hard to keep the option open or anything. Basically it's just "outside the scope of this document".
ChrisA
On Tue, Apr 17, 2018 at 2:23 AM, Chris Angelico rosuav@gmail.com wrote:
Augmented assignment is currently all of these:
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' | '<<=' | '>>=' | '**=' | '//=')
I'm actually not sure whether the augmented-assignment-expression operators should be "+:=" or ":+=", but either way, it'd be another thirteen tokens, some of which would be four character tokens.
Or simply rework the augmented assignment's semantics to become expression operators without any syntactic changes. Since there's no bug magnet arising in the usual context where '=' and '==' get confused:
if x += 1 < 2:
On 17 April 2018 at 17:46, Chris Angelico rosuav@gmail.com wrote:
In any context where arbitrary Python expressions can be used, a named expression can appear. This is of the form target := expr, where expr is any valid Python expression, and target is any valid assignment target.
The "assignment expressions should be restricted to names only" subthread from python-ideas finally crystallised for me (thanks in part to your own comment that 'With regular assignment (whether it's to a simple name or to a subscript/attribute), removing the "target :=" part will leave you with the same value - the value of "x := 1" is 1.'), and I now have a concrete argument for why I think we want to restrict the assignment targets to names only: all complex assignment targets create inherent ambiguity around the type of the expression result, and exactly which operations are performed as part of the assignment.
Initially I thought the problem was specific to tuple unpacking syntax, but attempting to explain why subscript assignment and attribute assignments were OK made me realise that they're actually even worse off (since they can execute arbitrary code on both setting and retrieval, whereas tuple unpacking only iterates over iterables).
Tackling those in order...
Tuple unpacking:
What's the result type for "a, b, c := range(3)"? Is it a range() object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, 3)" or "(a, b, range(3))"? Once you have your answer, what about "a, b, c := iter(range(3))" or "a, b, *c := range(10)"?
Whichever answers we chose would be surprising at least some of the time, so it seems simplest to disallow such ambiguous constructs, such that the only possible interpretation is as "(a, b, range(3))"
Subscript assignment:
What's the final value of "result" in "seq = list(); result = (seq[:] := range(3))"? Is it "range(3)"? Or is it "[1, 2, 3]"? As for tuple unpacking, does your preferred answer change for the case of "seq[:] := iter(range(3))"?
More generally, if I write "container[k] := value", does only "type(container).__setitem__" get called, or does "type(container).__getitem__" get called as well?
Again, this seems inherently ambiguous to me, and hence best avoided (at least for now), such that the result is always unambiguously "range(3)".
Attribute assignment:
If I write "obj.attr := value", does only "type(obj).__setattr__" get called, or does "type(obj).__getattribute__" get called as well?
While I can't think of a simple obviously ambiguous example using builtins or the standard library, result ambiguity exists even for the attribute access case, since type or value coercion may occur either when setting the attribute, or when retrieving it, so it makes a difference as to whether a reference to the right hand side is passed through directly as the assignment expression result, or if the attribute is stored and then retrieved again.
If all these constructs are prohibited, then a simple design principle serves to explain both their absence and the absence of the augmented assignment variants: "allowing the more complex forms of assignment as expressions makes the order of operations (as well as exactly which operations are executed) inherently ambiguous".
That ambiguity generally doesn't exist with simple name bindings (I'm excluding execution namespaces with exotic binding behaviour from consideration here, as the consequences of trying to work with those are clearly on the folks defining and using them).
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value::
# Handle a matched regex
if (match := pattern.search(data)) is not None:
    ...

# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None:
    ...

# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
[snip]
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
Chaining multiple assignment expressions should generally be avoided. More than one assignment per expression can detract from readability.
Given the many different uses for ":" identified on python-ideas, I'm inclined to suggest making these proposed style guidelines more prescriptive (at least initially) by either:
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Strongly agree with Nick that only simple name targets should be permitted (at least initially). NONE of the motivating cases use more complex targets, and allowing them encourages obscurity and code golf.
On 17 April 2018 at 14:01, David Mertz mertz@gnosis.cx wrote:
Strongly agree with Nick that only simple name targets should be permitted (at least initially). NONE of the motivating cases use more complex targets, and allowing them encourages obscurity and code golf.
I also agree. Originally I would have said why not allow them, it's a potentially useful generalisation. But Nick's examples pretty clearly demonstrate that there are a lot of unclear edge cases involved, and even though "prevent people writing ugly code" is explicitly stated as a non-goal in the PEP, that doesn't mean it's OK to allow an obvious bug magnet with no clear use cases.
Paul
On 17 April 2018 at 14:07, Paul Moore p.f.moore@gmail.com wrote:
On 17 April 2018 at 14:01, David Mertz mertz@gnosis.cx wrote:
Strongly agree with Nick that only simple name targets should be permitted (at least initially). NONE of the motivating cases use more complex targets, and allowing them encourages obscurity and code golf.
I also agree. Originally I would have said why not allow them, it's a potentially useful generalisation. But Nick's examples pretty clearly demonstrate that there are a lot of unclear edge cases involved, and even though "prevent people writing ugly code" is explicitly stated as a non-goal in the PEP, that doesn't mean it's OK to allow an obvious bug magnet with no clear use cases.
I should also point out that I remain -0 on this proposal (I'd already said this on python-ideas, but I should probably repeat it here). For me, the use cases are mostly marginal, and the major disadvantage is in having two forms of assignment. Explaining to a beginner why we use a := b in an expression, but a = b in a statement is going to be a challenge.
The fact that the PEP needs a section covering all the style guide warnings we feel are needed seems like it's a warning bell, too.
Paul
Agree with Paul. The PEP is well thought out and well presented, but I really don’t think we need this in Python (and I say this as someone who uses it regularly in C/C#).
-1 on the idea; no disrespect intended toward to people who did a lot of work on it.
On 04/17/2018 06:26 AM, Paul Moore wrote:
I should also point out that I remain -0 on this proposal (I'd already said this on python-ideas, but I should probably repeat it here). For me, the use cases are mostly marginal, and the major disadvantage is in having two forms of assignment. Explaining to a beginner why we use a := b in an expression, but a = b in a statement is going to be a challenge.
I don't see the challenge: They are different because '=' came first, but using '=' in an expression is a common source of bugs, so there we use ':=' instead.
-- ~Ethan~
On 17 April 2018 at 16:12, Ethan Furman ethan@stoneleaf.us wrote:
On 04/17/2018 06:26 AM, Paul Moore wrote:
I should also point out that I remain -0 on this proposal (I'd already said this on python-ideas, but I should probably repeat it here). For me, the use cases are mostly marginal, and the major disadvantage is in having two forms of assignment. Explaining to a beginner why we use a := b in an expression, but a = b in a statement is going to be a challenge.
I don't see the challenge: They are different because '=' came first, but using '=' in an expression is a common source of bugs, so there we use ':=' instead.
I fully expect the question "so if I want to assign to a variable, why shouldn't I just use x := 12 (as a statement)?" I don't have a good answer for that. And if I can't come up with a convincing reply to that, the next question will likely be "so why does = exist at all?"
If we want to change Python so that assignments are expressions, and assignment statements only remain for backward compatibility, then fine - we should propose that (I'd probably be against it, BTW, but I'd reserve judgement until I saw a proper proposal) but this PEP would need a lot of changes if it were to go down that route. This half-way house of having both seems like it will just confuse people.
Paul
[Paul Moore]
the next question will likely be "so why does = exist at all?"
[Greg Ewing greg.ewing@canterbury.ac.nz]
And if we decide to make ':=' the official assignment operator and deprecate '=', the next question will be "Why do we have '==' instead of '='?"
Which would be a fine question! In Python's very early days, it didn't have "==" at all: plain "=" was used for both assignment and equality testing.
From the HISTORY file:
""" New features in 0.9.6: ...
That script crawled a source tree and replaced instances of "=" used for equality testing with the new-fangled "==". We can obviously do something similar to replace instances of "=" used for assignment when that's removed, and I'm sure nobody will complain about that either ;-)
On 2018-04-17 22:44, Greg Ewing wrote:
Paul Moore wrote:
the next question will likely be "so why does = exist at all?"
And if we decide to make ':=' the official assignment operator and deprecate '=', the next question will be "Why do we have '==' instead of '='?"
Some languages use '=' for assignment, others for equality, but do you know of a language that uses ':=' for equality, or '==' for assignment?
If Python saw '=' it could ask "Do you mean assignment ':=' or equality '=='?".
On Apr 18, 2018, at 10:43, MRAB python@mrabarnett.plus.com wrote:
Some languages use '=' for assignment, others for equality, but do you know of a language that uses ':=' for equality, or '==' for assignment?
Clearly we should take a page from the ternary operator and make the assignment expression operator just ugly enough that people won’t overuse it. Since I can’t have ‘>>’ or ‘<>’ back, I propose ‘=======‘.
go-ahead-count-‘em-every-time-ly y’rs, -Barry
On Wed, Apr 18, 2018 at 11:04 AM Barry Warsaw barry@python.org wrote:
On Apr 18, 2018, at 10:43, MRAB python@mrabarnett.plus.com wrote:
Some languages use '=' for assignment, others for equality, but do you know of a language that uses ':=' for equality, or '==' for assignment?
Clearly we should take a page from the ternary operator and make the assignment expression operator just ugly enough that people won’t overuse it. Since I can’t have ‘>>’ or ‘<>’ back, I propose ‘=======‘.
go-ahead-count-‘em-every-time-ly y’rs,
8 of course. to "match" what merge conflict markers look like. ;)
php already uses === for something, we should just use =========== so we can say "it goes to eleven", ending the operator war once and for all. :P
-gps
On Wed, Apr 18, 2018 at 09:26:17PM +0000, "Gregory P. Smith" greg@krypto.org wrote:
On Wed, Apr 18, 2018 at 11:04 AM Barry Warsaw barry@python.org wrote:
Since I can't have '>>' or '<>' back, I propose '======='.
8 of course. to "match" what merge conflict markers look like. ;)
Sorry for being pedantic, but git conflict markers are 7 in length.
Oleg Broytman http://phdru.name/ phd@phdru.name
Programmers don't die, they just GOSUB without RETURN.
MRAB wrote:
Some languages use '=' for assignment, others for equality, but do you know of a language that uses ':=' for equality, or '==' for assignment?
No, but the only sane reason to use "==" for equality testing seems to be if you're already using "=" for something else. So maybe we should just implement "from __future__ import pascal" and be done with it. :-)
-- Greg
On Tue, Apr 17, 2018 at 10:17 PM, Nick Coghlan ncoghlan@gmail.com wrote:
Initially I thought the problem was specific to tuple unpacking syntax, but attempting to explain why subscript assignment and attribute assignments were OK made me realise that they're actually even worse off (since they can execute arbitrary code on both setting and retrieval, whereas tuple unpacking only iterates over iterables).
Tackling those in order...
Tuple unpacking:
What's the result type for "a, b, c := range(3)"? Is it a range() object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, 3)" or "(a, b, range(3))"? Once you have your answer, what about "a, b, c := iter(range(3))" or "a, b, *c := range(10)"?
This is one that I didn't originally think about, and when I first tried it out a couple of weeks ago, I decided against mentioning it either way, because I've no idea what _should_ be done. But here's what my reference implementation does:
>>> x = (a,b,c := range(3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
In other words, it's being parsed as:
x = (a, b, (c := range(3)))
Forcing the interpretation does work:
x = ((a,b,c) := range(3))
And then the value of the expression is the same object that just got assigned (or in this case, unpacked):
>>> x
range(0, 3)
That's true even if it's an exhausted iterator:
>>> x = ((a,b,c) := iter(range(3)))
>>> x
<range_iterator object at 0x7fb8e52c7030>
>>> next(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
The way it works is that the RHS gets evaluated, then that gets put on ice for a moment, and the assignment done with a copy of it. (In the CPython reference implementation, "put on ice" is done with DUP_TOP.) So the value that got assigned - even if that assignment involved unpacking a sequence - is still passed through the assignment and out the other side.
>>> dis.dis("x = ((a,b,c) := range(3))")
### evaluate RHS
  1           0 LOAD_NAME                0 (range)
              2 LOAD_CONST               0 (3)
              4 CALL_FUNCTION            1
### grab another reference to the range
              6 DUP_TOP
### do the assignment
              8 UNPACK_SEQUENCE          3
             10 STORE_NAME               1 (a)
             12 STORE_NAME               2 (b)
             14 STORE_NAME               3 (c)
### carry on with the rest of the expression
             16 STORE_NAME               4 (x)
             18 LOAD_CONST               1 (None)
             20 RETURN_VALUE
Whichever answers we chose would be surprising at least some of the time, so it seems simplest to disallow such ambiguous constructs, such that the only possible interpretation is as "(a, b, range(3))"
Yeah, that's what happens. Tuple display is defined as "test , test" and a 'test' can be 'target := value', so each element of a tuple can have an assignment expression in it. If you actually want to unpack inside an assignment expression, you need parentheses.
Subscript assignment:
What's the final value of "result" in "seq = list(); result = (seq[:] := range(3))"? Is it "range(3)"? Or is it "[1, 2, 3]"? As for tuple unpacking, does your preferred answer change for the case of "seq[:] := iter(range(3))"?
It's range(3), and no, my preferred answer doesn't change. It'll probably never be useful to unpack an iterator in an assignment expression (since you'll always get an exhausted iterator at the end of it), but I'm sure there'll be uses for unpacking iterables.
More generally, if I write "container[k] := value", does only "type(container).__setitem__" get called, or does "type(container).__getitem__" get called as well?
Only setitem. If you like, imagine the := operator as a tee junction: you "tap off" the pipe and snag it with assignment, and also keep using it just as if the assignment hadn't been there.
Again, this seems inherently ambiguous to me, and hence best avoided (at least for now), such that the result is always unambiguously "range(3)".
Attribute assignment:
If I write "obj.attr := value", does only "type(obj).__setattr__" get called, or does "type(obj).__getattribute__" get called as well?
I didn't change anything about how assignment actually works, so I would expect it to be exactly the same semantics as statement-assignment has. Let's test.
>>> class X:
...     @property
...     def spam(self): return 42
...     @spam.setter
...     def spam(self, val): print("Setting spam to", val)
...
>>> x = X()
>>> dis.dis("(x.spam := 7)")
  1           0 LOAD_CONST               0 (7)
              2 DUP_TOP
              4 LOAD_NAME                0 (x)
              6 STORE_ATTR               1 (spam)
              8 RETURN_VALUE
>>> (x.spam := 7)
Setting spam to 7
7
Looks good to me. If I had to choose semantics, I don't think this would be a bad choice; and for something that derived naturally from a basic "let's just copy in the code for assignment", it's looking consistent and usable.
While I can't think of a simple obviously ambiguous example using builtins or the standard library, result ambiguity exists even for the attribute access case, since type or value coercion may occur either when setting the attribute, or when retrieving it, so it makes a difference as to whether a reference to the right hand side is passed through directly as the assignment expression result, or if the attribute is stored and then retrieved again.
Agreed.
That ambiguity generally doesn't exist with simple name bindings (I'm excluding execution namespaces with exotic binding behaviour from consideration here, as the consequences of trying to work with those are clearly on the folks defining and using them).
The cool thing about the simple and naive code is that even those should work. I don't have an example ready for demo, but I fully expect that it would 'just work' the exact same way; the namespace would never be retrieved from, only set to.
Hmm. I don't know what the consequences would be on class namespace with a non-vanilla dict. Probably functionally identical. But there might be some extremely weird cases if the namespace dict accepts setitem and then raises KeyError for that key.
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
Chaining multiple assignment expressions should generally be avoided. More than one assignment per expression can detract from readability.
Given the many different uses for ":" identified on python-ideas, I'm inclined to suggest making these proposed style guidelines more prescriptive (at least initially) by either:
I'm actually dubious about the third point as it stands. It's either too broad or too narrow, but I'm not sure which; there are plenty of legitimate uses for multiple colons in an expression without confusion, but there are also plenty of ways that even a single assignexp could be pretty bad for readability. So I'm hoping that we can get some people to test this out well before 3.8 lands, and refine the style recommendations before this feature hits release.
ChrisA
On 04/17/2018 07:01 AM, Chris Angelico wrote:
On Tue, Apr 17, 2018 at 10:17 PM, Nick Coghlan wrote:
That ambiguity generally doesn't exist with simple name bindings (I'm excluding execution namespaces with exotic binding behaviour from consideration here, as the consequences of trying to work with those are clearly on the folks defining and using them).
The cool thing about the simple and naive code is that even those should work. I don't have an example ready for demo, but I fully expect that it would 'just work' the exact same way; the namespace would never be retrieved from, only set to.
Hmm. I don't know what the consequences would be on class namespace with a non-vanilla dict. Probably functionally identical. But there might be some extremely weird cases if the namespace dict accepts setitem and then raises KeyError for that key.
If you want to play with a non-standard (okay, weird) class namespace, you can try using the assignment expression in an Enum class.
-- ~Ethan~
On 17/04/2018 15:01, Chris Angelico wrote:
On Tue, Apr 17, 2018 at 10:17 PM, Nick Coghlan ncoghlan@gmail.com wrote:
As this adds another way to spell some of the same effects as can already be done, it is worth noting a few broad recommendations. These could be included in PEP 8 and/or other style guides.
If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.
Chaining multiple assignment expressions should generally be avoided. More than one assignment per expression can detract from readability. Given the many different uses for ":" identified on python-ideas, I'm inclined to suggest making these proposed style guidelines more prescriptive (at least initially) by either:
I'll channel that Guido would be happiest if this rule were followed:
Given an assignment statement using "=", the meaning is the same if "=" is replaced with ":=".
In particular, the expression at the far right is evaluated once, and
Otherwise the semantics of "=" and ":=" can be very different indeed.
So, then, e.g., and assuming the rule above always applies:
[Nick]
Tuple unpacking:
What's the result type for "a, b, c := range(3)"? Is it a range() object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, 3)" or "(a, b, range(3))"?
It's the range object range(3). Same as in:

    x = a, b, c = range(3)

x is bound to the range object range(3).
Once you have your answer, what about "a, b, c := iter(range(3))"

A range_iterator object, same as what x is bound to in:

    x = a, b, c = iter(range(3))

However, list(x) then returns an empty list, because iter(range(3)) was evaluated only once, and the iterator was run to exhaustion when unpacking it for the a, b, c target.
or "a, b, *c := range(10)"?
The range object range(10).
Whichever answers we chose would be surprising at least some of the time, so it seems simplest to disallow such ambiguous constructs, such that the only possible interpretation is as "(a, b, range(3))"
That's why Guido would be happiest with the rule at the top. "The answers" can already be surprising at times with current assignment statements, but they are well defined. It would be mondo bonkers to make up entirely different subtle answers ;-)
Subscript assignment:
What's the final value of "result" in "seq = list(); result = (seq[:] := range(3))"? Is it "range(3)"? Or is it "[1, 2, 3]"?
As above, it's range(3).
As for tuple unpacking, does your preferred answer change for the case of "seq[:] := iter(range(3))"?
As above, a range_iterator object, but one that's already been run to exhaustion.
More generally, if I write "container[k] := value", does only "type(container).__setitem__" get called, or does "type(container).__getitem__" get called as well?
The rule at the top implies __setitem_ is called once, and __getitem__
not at all. The value of the assignment is the object value
was
bound to at the start, regardless of how tricky __setitem__ may be.
And in
k := container[k] := value
k is bound to value before container[k] is evaluated. Why? Because that's how assignment _statements_ have always worked.
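That statement behaviour is easy to confirm in current Python with a dict subclass that logs its hooks (the class and names below are purely illustrative):

class LoggingDict(dict):
    def __getitem__(self, key):
        print("__getitem__:", key)
        return super().__getitem__(key)
    def __setitem__(self, key, value):
        print("__setitem__:", key, "->", value)
        super().__setitem__(key, value)

container = LoggingDict()
k = "old"
k = container[k] = "new"
# Prints only "__setitem__: new -> new": __getitem__ never runs, and the
# subscript already sees the rebound k, because chained targets are
# assigned left to right.

The attribute case works the same way today: obj.attr = value goes straight to __setattr__, with no __getattribute__ call.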
Attribute assignment:
If I write "obj.attr := value", does only "type(obj).__setattr__"
get called, or does "type(obj).__getattribute__" get called as well?
As above, only __setattr__.
While I can't think of a simple obviously ambiguous example using builtins or the standard library, result ambiguity exists even for the attribute access case, since type or value coercion may occur either when setting the attribute, or when retrieving it, so it makes a difference as to whether a reference to the right hand side is passed through directly as the assignment expression result, or if the attribute is stored and then retrieved again.
This is already defined for assignment statements. While the PEP doesn't say "and the same for assignment expressions", my guess is that it won't be accepted unless it does.
Or, indeed, the target is limited to a name. But Guido wasn't keen on that.
In short, I think the PEP's chance of acceptance increases the _more_ assignment expressions act like assignment statements, not the less, and is highest if they act exactly the same (except for returning a value; e.g., while "a = 3" at a shell displays nothing, "a := 3" should display 3).
On Wed, Apr 18, 2018 at 5:28 AM, Tim Peters tim.peters@gmail.com wrote:
I'll channel that Guido would be happiest if this rule were followed:
Given an assignment statement using "=", the meaning is the same if "=" is replaced with ":=".
That's broadly the intention. At the moment, there are two exceptions:
1) Augmented assignment isn't a thing
2) Chained assignment isn't a thing, which means that the assignments operate right-to-left
2a) Assignment
In particular, the expression at the far right is evaluated once, and its value is bound to each target from left to right.
I'll toy with this and see if I can implement it sanely. If so, that'll eliminate one more distinction.
Otherwise the semantics of "=" and ":=" can be very different indeed.
TBH, the common cases won't actually be much affected. You give this example:
k := container[k] := value
but that's not going to be more common. What I'm more likely to see is something like this:
k, container[k] = new_key(), new_value()
which can instead be written:
container[k := new_key()] = new_value()
and is, IMO, clearer that way.
So, then, e.g., and assuming the rule above always applies:
[Nick]
Tuple unpacking:
What's the result type for "a, b, c := range(3)"? Is it a range() object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, 3)" or "(a, b, range(3))"?
It's the range object range(3). Same as in:
x = a, b, c = range(3)
x is bound to the range object range(3).
At the moment, "x = a, b, c := range(3)" will set c to range(3), then build a tuple of that with the existing values of a and b. You can, however, parenthesize the (a, b, c) part, and then it'll behave as you say.
Whichever answers we chose would be surprising at least some of the time, so it seems simplest to disallow such ambiguous constructs, such that the only possible interpretation is as "(a, b, range(3))"
That's why Guido would be happiest with the rule at the top. "The answers" can already be surprising at times with current assignment statements, but they are well defined. It would be mondo bonkers to make up entirely different subtle answers ;-)
Wholeheartedly agreed.
ChrisA
[Tim]
I'll channel that Guido would be happiest if this rule were followed:
Given an assignment statement using "=", the meaning is the same if "=" is replaced with ":=".
[Chris]
That's broadly the intention. At the moment, there are two exceptions:
1) Augmented assignment isn't a thing
Doesn't have to be :-) "Augmented assignment statement" is already a different thing than "assignment statement" (for example, in an augmented assignment statement, there is no chaining, and the sole target can't be, e.g., a slice or any form of unpacking syntax).
2) Chained assignment isn't a thing, which means that the assignments operate right-to-left
In particular, the expression at the far right is evaluated once, and its value is bound to each target from left to right.
I'll toy with this and see if I can implement it sanely. If so, that'll eliminate one more distinction.
Otherwise the semantics of "=" and ":=" can be very different indeed.
TBH, the common cases won't actually be much affected.
Or at all! That's not the point here, though: if making assignment expressions work as exactly like assignment statements as possible is what's needed for the PEP to pass, it's the _annoying_ cases that have to be looked at.
Personally, after considerable staring at my own code, I would be perfectly happy to settle for assignment expressions no fancier than
identifier ":=" expression
That alone covers over 99% of the cases I'd be tempted to use the new
feature at all, and then gobs of general-case assignment-statement
difficulties go away, including the "right-to-left or left-to-right?"
distinction (there's no way to tell which order bindings happen in "x := y := z := 3" short of staring at the generated code).
But so far I haven't gotten the impression that Guido is fond of that. He should be, though ;-)
You give this example:
k := container[k] := value
but that's not going to be more common. What I'm more likely to see is something like this:
Not about what's common, but about the full range of what's possible to express.
...
[Nick]
Tuple unpacking:
What's the result type for "a, b, c := range(3)"? Is it a range() object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, 3)" or "(a, b, range(3))"?
It's the range object range(3). Same as in:
x = a, b, c = range(3)
x is bound to the range object range(3).
At the moment, "x = a, b, c := range(3)" will set c to range(3), then build a tuple of that with the existing values of a and b. You can, however, parenthesize the (a, b, c) part, and then it'll behave as you say.
Which would be really annoying to "repair".
Whichever answers we chose would be surprising at least some of the time, so it seems simplest to disallow such ambiguous constructs, such that the only possible interpretation is as "(a, b, range(3))"
That's why Guido would be happiest with the rule at the top. "The answers" can already be surprising at times with current assignment statements, but they are well defined. It would be mondo bonkers to make up entirely different subtle answers ;-)
Wholeheartedly agreed.
I'd like Guido to chime in again, because I'm pretty sure he won't accept what's currently on the table. There are two plausible ways to repair that:
Continue down the road of making assignment expressions "exactly like" assignment statements in their full generality.
Back off and limit assignment expressions to what appears to be the overwhelmingly most common case motivated by looking at real code (as opposed to constructing examples to illustrate pitfalls & obscurities):
identifier ":=" expression
On Tue, Apr 17, 2018 at 3:23 PM, Tim Peters tim.peters@gmail.com wrote:
[Tim]
I'll channel that Guido would be happiest if this rule were followed:
Given an assignment statement using "=", the meaning is the same if "=" is replaced with ":=".
Thanks for channeling me. :=)
I'd like Guido to chime in again, because I'm pretty sure he won't
accept what's currently on the table. There are two plausible ways to repair that:
Continue down the road of making assignment expressions "exactly like" assignment statements in their full generality.
Back off and limit assignment expressions to what appears to be the overwhelmingly most common case motivated by looking at real code (as opposed to constructing examples to illustrate pitfalls & obscurities):
identifier ":=" expression
I haven't had the time to follow this thread in detail; fortunately I don't have to because of Tim's excellent channeling.
I am fine with this, it certainly seems the easiest to implement, with the fewest corner cases, and the easiest restriction to explain.
(I was thinking there would be a use case for basic tuple unpacking, like seen a lot in for-loop, but the only examples I tried to come up with were pretty sub-optimal, so I don't worry about that any more.)
-- --Guido van Rossum (python.org/~guido)
[Guido, makes peace with identifier := expression]
... I am fine with this, it certainly seems the easiest to implement, with the fewest corner cases, and the easiest restriction to explain.
(I was thinking there would be a use case for basic tuple unpacking, like seen a lot in for-loop, but the only examples I tried to come up with were pretty sub-optimal, so I don't worry about that any more.)
Chris's pain threshold appears to be higher than ours ;-)
So I would really like to see if anyone has plausibly realistic uses for fancier forms of assignment expression.
I have plenty of code that does stuff like this:
while True:
    x, y = func_returning_tuple()
    if y is None:
        break
    ...
Maybe it's just that I'm used to it, but I find that very easy to understand now. If we had fancy assignment expressions, my first thought was I could write it like so instead:
while ((x, y) := func_returning_tuple()) and y is not None:
...
and pray that I put in enough parens to get the intended meaning.
And maybe it's just that I'm _not_ used to that, but I do find it harder to understand. Contributing factor: I don't really want "and" there - what the context requires is really more like C's comma operator (take only the last value from a sequence of expressions). As is, I'm relying on the fact that a 2-tuple is truthy regardless of its content (so that "and" always goes on to evaluate its RHS).
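(That truthiness assumption does hold, and is quick to check:

>>> bool((None, None))
True

)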
And, for some reason, I find this even worse:
while ((x, y) := func_returning_tuple())[1] is not None:
...
The rub there: I gave y a name but can't use it in the test?!
And those are the same kinds of headaches I saw over & over in my own "fancier" code: stuff that's already perfectly clear would become more obscure instead.
Tuple unpacking works great in for-loops because the only effect there is to give names to the tuple components, none of which are needed _in_ the for statement itself. But in a while or if statement, I would typically _also_ want to use the names _in_ the while or if tests. But, as in C, that's what the comma operator is for, not the assignment operator.
while (s = function_returning_struct_with_x_and_y_members(), s.y != NULL) {
...
}
In contrast, many plausible uses I saw for identifier := expression in a while or if statement would have been improvements, and most of the rest neutral: I'm still wondering whether this one is better or worse ;-):
def newton(f, fprime, x):
    import math
    while not math.isclose((next_x := x - f(x) / fprime(x)), x):
        x = next_x
    return next_x
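(A hypothetical use, assuming the proposed syntax lands: computing a square root.)

print(newton(lambda x: x*x - 2, lambda x: 2*x, 1.0))   # ~1.4142135623730951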
On 18 April 2018 at 11:35, Tim Peters tim.peters@gmail.com wrote:
And, for some reason, I find this even worse:
while ((x, y) := func_returning_tuple())[1] is not None:
...
The rub there: I gave y a name but can't use it in the test?!
And those are the same kinds of headaches I saw over & over in my own "fancier" code: stuff that's already perfectly clear would become more obscure instead.
Whereas I think:
while (s := func_returning_tuple())[1] is not None:
    s = x, y
    ...
compares favourably with the loop-and-a-half version.
It does make the guarantee that "y is not None" harder to spot than it is in the loop-and-a-half version, though.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
[Tim]
And, for some reason, I find this even worse:
while ((x, y) := func_returning_tuple())[1] is not None:
...
The rub there: I gave y a name but can't use it in the test?!
And those are the same kinds of headaches I saw over & over in my own "fancier" code: stuff that's already perfectly clear would become more obscure instead.
[Nick]
Whereas I think:
while (s := func_returning_tuple())[1] is not None:
    s = x, y
    ...
compares favourably with the loop-and-a-half version.
Obviously not, since it really needs to be
x, y = s
instead ;-)
In context, I was looking for realistic cases in which assignment expressions _fancier than_
identifier ":=" expression
is a real improvement. You found an improvement instead by _replacing_ a "fancier than" instance with a plain-single-name target. I already have lots of examples from real code where plain-single-name target reads better to me. I don't have any yet from real code where something fancier does.
In this specific case, I find your rewriting about as readable as the loop-and-a-half, except for the obvious drawback of the former:
It does make the guarantee that "y is not None" harder to spot than it is in the loop-and-a-half version, though.
Over time, the functions in the real codes from which the example was synthesized change, sometimes carrying more or less state in tuples. When that happens, the original
x, y = s
will helpfully blow up (size mismatch in unpacking). But, if the tuple length increased, is it still the case that I want to test the 1'th component? The test is now divorced from the unpacking. I do know that I'll still want to test the component I think of as being "the 'y' component", and the loop-and-a-half version accommodates that naturally.
Then again, I could switch to new-fangled namedtuples instead, and do
while (s := func_returning_tuple()).y is not None:
to get the best of all worlds.
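A minimal sketch of that namedtuple version, with made-up names (Pair, _feed, func_returning_tuple) and assuming the proposed := syntax:

from collections import namedtuple

Pair = namedtuple("Pair", ["x", "y"])
_feed = iter([Pair(1, "a"), Pair(2, "b"), Pair(3, None)])

def func_returning_tuple():
    return next(_feed)

# The test names the field it cares about, and the components stay
# addressable by name inside the loop:
while (s := func_returning_tuple()).y is not None:
    print(s.x, s.y)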
On 18 April 2018 at 10:20, Guido van Rossum guido@python.org wrote: [Tim Peters wrote]
Back off and limit assignment expressions to what appears to be the overwhelmingly most common case motivated by looking at real code (as opposed to constructing examples to illustrate pitfalls & obscurities):
identifier ":=" expression
I haven't had the time to follow this thread in detail; fortunately I don't have to because of Tim's excellent channeling.
I am fine with this, it certainly seems the easiest to implement, with the fewest corner cases, and the easiest restriction to explain.
(I was thinking there would be a use case for basic tuple unpacking, like seen a lot in for-loop, but the only examples I tried to come up with were pretty sub-optimal, so I don't worry about that any more.)
In the other direction I was thinking about the question "Then why do I think tuple unpacking is OK in comprehensions?", and realised that it's because in that situation there are keywords as delimiters on both sides (i.e. "... for name [, name]* in ..."), so it's harder for the unpacking operation to get confused with other uses of commas as separators. Similarly, in regular assignments, the unpacking target is always either between two "=" or else from the start of the line to the first "=".
By contrast, for assignment expressions, the only potential explicit opening delimiter is "(", and that's also the case for tuple literals.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 04/17/2018 12:46 AM, Chris Angelico wrote:
PEP: 572 Title: Assignment Expressions Author: Chris Angelico rosuav@gmail.com
+1
Thanks for all the hard work, Chris! Hoping this PEP finally breaks your streak! ;)
-- ~Ethan~
On 4/17/2018 12:46 AM, Chris Angelico wrote:
gen = (x for x in rage(10)) # NameError
gen = (x for x in 10) # TypeError (not iterable)
gen = (x for x in range(1/0)) # Exception raised during evaluation
This brings such generator expressions in line with a simple translation to function form::
def <genexp>():
    for x in rage(10):
        yield x
gen = <genexp>() # No exception yet
tng = next(gen) # NameError
Detecting these errors more quickly is nontrivial. It is, however, the exact same problem as generator functions currently suffer from, and this proposal brings the genexp in line with the most natural longhand form.
As of Python 3.7, the outermost iterable in a genexp is evaluated early, and the result passed to the implicit function as an argument. With PEP 572, this would no longer be the case. Can we still, somehow, evaluate it before moving on? One possible implementation would be::
gen = (x for x in rage(10))
# translates to
def <genexp>():
    iterable = iter(rage(10))
    yield None
    for x in iterable:
        yield x
gen = <genexp>()
next(gen)
I think "rage" is supposed to be "range" in four places in the quoted block.
Hi Chris,
Thank you for working on this PEP! Inline assignments is a long requested feature and this seems to be the first serious attempt at adding it.
That said I'm very -1 on the idea.
JavaScript has inline assignments and they are useful to get the match object after applying a regex. You use the same example in your PEP. But in my experience, this is the only common pattern in JavaScript. I don't see people using inline assignments for anything else, at least it's not a common pattern.
C is low-level and has no exceptions. It uses function return values to signal if there was an error or not. It's a popular pattern to call a function from an 'if' statement like this: "if ((int ret = func()))" to save a line of code. If we ignore this particular pattern, we see that inline assignment isn't used that often.
In your PEP you use comprehensions and regex match object to show how inline assignment can simplify the code. In my experience, comprehensions that are a little more complex than "(f(x) for x in something)" are always better being rewritten to an expanded form. I don't find "stuff = [[y := f(x), x/y] for x in range(5)]" very readable, and yes, I think that the simple expanded version of this comprehension is better.
Using inline assignments in "while" statements is neat, but how often do we use "while" statements?
I simply don't see a very compelling use case to have two forms of assignment in Python. It does complicate the grammar by adding a new operator, it invites people to write more complex code, and it has only a couple good use cases.
Yury
On 4/17/2018 1:03 PM, Yury Selivanov wrote:
Using inline assignments in "while" statements is neat, but how often do we use "while" statements?
Beginners commonly write game and entry loops and commonly stumble over the need to write an infinite loop and a half. The evidence is on both python-list and StackOverflow. Reducing 4 lines to 1 here reduces complexity and, I think, increases clarity.
There is currently no way, not one way, to bind a name to an expression for use within the same statement. I recommended to Chris that he remove from the PEP everything but this. No other targets. No chaining. (And no redefining class-scope comprehensions.) This would remove all but the unavoidable overlap and reduce the added complexity of what people could write.
I simply don't see a very compelling use case to have two forms of assignment in Python.
We already have more than two ;-).
-- Terry Jan Reedy
On 4/17/2018 3:46 AM, Chris Angelico wrote:
This is a proposal for creating a way to assign to names within an expression.
I started at -something as this is nice but not necessary. I migrated to +something for the specific, limited proposal you wrote above: expressions of the form "name := expression".
Additionally, the precise scope of comprehensions is adjusted, to maintain consistency and follow expectations.
We fiddled with comprehension scopes, and broke some code, in 3.0. I oppose doing so again. People expect their 3.x code to continue working in future versions. Breaking that expectation should require deprecation for at least 2 versions.
Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse.
Right. In other words, "name := expression".
Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.
If it is really true that introducing 'name := expression' requires such a side-effect, then I might oppose it.
In any context where arbitrary Python expressions can be used, a named expression can appear. This is of the form target := expr where expr is any valid Python expression, and target is any valid assignment target.
This generalization is different from what you said in the abstract and rationale. No rationale is given. After reading Nick's examination of the generalization, and your response, -1.
The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value::
As someone else noted, you only use names as targets, thus providing no rationale for anything else.
# Handle a matched regex
if (match := pattern.search(data)) is not None:
...
# A more explicit alternative to the 2-arg form of iter() invocation
while (value := read_next_item()) is not None:
...
To me, being able to name and test expressions fits with Python names not being typed. To me, usage such as the above is the justification for the limited proposal.
# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
And this is secondary.
Most importantly, since := is an expression, it can be used in contexts where statements are illegal, including lambda functions and comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
x = y = z = 0
This is a bad example as there is very seldom a reason to assign multiple names, as opposed to multiple targets. Here is a typical real example.
self.x = x = expression
# Use local x in the rest of the method.
In "x = y = 0", x and y likely represent two different concepts (variables) that happen to be initialized with the same value. One could instead write "x,y = 0,0".
The equivalent assignment expression
should be a syntax error.
is parsed as separate binary operators,
':=' is not a binary operator, any more than '=' is, as names, and targets in general, are not objects. Neither fetches and operates on the current value, if any, of the name or target. Therefore neither has an 'associativity'.
and is therefore processed right-to-left, as if it were spelled thus::
assert 0 == (x := (y := (z := 0)))
Parentheses should be required, to maintain the syntax "name := expression".
Augmented assignment is not supported in expression form::
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
I would have expected :+=, but agree with the omission.
Otherwise, the semantics of assignment are identical in statement and expression forms.
Mostly replacing '=' with ':=' is a different proposal and a different goal than naming expressions within an expression for reuse (primarily) within the expression (including compound expressions).
Proposing an un-augmentable, un-chainable, name_only := expression expression would not be duplicating assignment statements.
The current behaviour of list/set/dict comprehensions and generator expressions has some edge cases that would behave strangely if an assignment expression were to be used.
You have not shown this. Your examples do not involve assignment expressions, and adding them should make no difference. Changing the scoping of comprehensions should be a separate PEP.
Therefore the proposed semantics are changed,
removing the current edge cases, and instead altering their behaviour only in a class scope.
As of Python 3.7, the outermost iterable of any comprehension is evaluated in the surrounding context, and then passed as an argument to the implicit function that evaluates the comprehension.
Under this proposal, the entire body of the comprehension is evaluated in its implicit function. Names not assigned to within the comprehension are located in the surrounding scopes, as with normal lookups. As one special case, a comprehension at class scope will eagerly bind any name which is already defined in the class scope.
A list comprehension can be unrolled into an equivalent function. With Python 3.7 semantics::
numbers = [x + y for x in range(3) for y in range(4)]
# Is approximately equivalent to
def <listcomp>(iterator):
    result = []
    for x in iterator:
        for y in range(4):
            result.append(x + y)
    return result
numbers = <listcomp>(iter(range(3)))
Under the new semantics, this would instead be equivalent to::
def <listcomp>():
    result = []
    for x in range(3):
        for y in range(4):
            result.append(x + y)
    return result
numbers = <listcomp>()
Why make the change?
When a class scope is involved, a naive transformation into a function would prevent name lookups (as the function would behave like a method)::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics,
I believe in all of 3.x ..
this will evaluate the outermost iterable at class scope, which will succeed; but it will evaluate everything else in a function::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    def <listcomp>(iterator):
        result = []
        for name in iterator:
            result.append(prefix + name)
        return result
    prefixed_names = <listcomp>(iter(names))
The name prefix is thus searched for at global scope, ignoring the class name.
And today it fails. This has nothing to do with adding name assignment expressions.
Under the proposed semantics, this name will be eagerly bound; and the same early binding then handles the outermost iterable as well. The list comprehension is thus approximately equivalent to::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    def <listcomp>(names=names, prefix=prefix):
        result = []
        for name in names:
            result.append(prefix + name)
        return result
    prefixed_names = <listcomp>()
With list comprehensions, this is unlikely to cause any confusion. With generator expressions, this has the potential to affect behaviour, as the eager binding means that the name could be rebound between the creation of the genexp and the first call to next(). It is, however, more closely aligned to normal expectations. The effect is ONLY seen with names that are looked up from class scope; global names (eg range()) will still be late-bound as usual.
One consequence of this change is that certain bugs in genexps will not be detected until the first call to next(), where today they would be caught upon creation of the generator. See 'open questions' below.
I consider this secondary and would put it second.
I would put this first, as you did above.
Assignment expressions can be used to good effect in the header of an if or while statement::
# Proposed syntax
while (command := input("> ")) != "quit":
    print("You entered:", command)

# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text):
    print("Found:", match.group(0))

# Reading socket data until an empty string is returned
while data := sock.read():
    print("Received data:", data)

# Equivalent in current Python, not caring about function return value
while input("> ") != "quit":
    print("You entered a command.")

# To capture the return value in current Python demands a four-line
# loop header.
while True:
    command = input("> ")
    if command == "quit":
        break
    print("You entered:", command)
This idiom is not obvious to beginners and is awkward at best, so I consider eliminating this the biggest gain. Beginners commonly write little games and entry loops and get tripped up trying to do so.
Particularly with the while loop, this can remove the need to have an infinite loop, an assignment, and a condition. It also creates a smooth parallel between a loop which simply uses a function call as its condition, and one which uses that as its condition but also uses the actual value.
...
Bottom line: I suggest rewriting again, as indicated, changing title to 'Name Assignment Expressions'.
-- Terry Jan Reedy
On 2018-04-17 22:53, Terry Reedy wrote:
On 4/17/2018 3:46 AM, Chris Angelico wrote: [snip]
Augmented assignment is not supported in expression form::
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
I would have expected :+=, but agree with the omission.
x = x op 1 is abbreviated to x op= 1, so x := x op 1 would be abbreviated to x op:= 1. That's what's used in the Icon language.
[snip]
On Wed, Apr 18, 2018 at 7:53 AM, Terry Reedy tjreedy@udel.edu wrote:
On 4/17/2018 3:46 AM, Chris Angelico wrote:
This is a proposal for creating a way to assign to names within an expression.
I started at -something as this is nice but not necessary. I migrated to +something for the specific, limited proposal you wrote above: expressions of the form "name := expression".
Additionally, the precise scope of comprehensions is adjusted, to maintain consistency and follow expectations.
We fiddled with comprehension scopes, and broke some code, in 3.0. I oppose doing so again. People expect their 3.x code to continue working in future versions. Breaking that expectation should require deprecation for at least 2 versions.
The changes here are only to edge and corner cases, other than as they specifically relate to assignment expressions. The current behaviour is intended to "do the right thing" according to people's expectations, and it largely does so; those cases are not changing. For list comprehensions at global or function scope, the ONLY case that can change (to my knowledge) is where you reuse a variable name:
[t for t in t.__parameters__ if t not in tvars]
This works in 3.7 but will fail easily and noisily (UnboundLocalError) with PEP 572. IMO this is a poor way to write a loop, and the fact that it "happened to work" is on par with code that depended on dict iteration order in Python 3.2 and earlier. Yes, the implementation is well defined, but since you can achieve exactly the same thing by picking a different variable name, it's better to be clear.
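A rough sketch of why it fails under the proposal (this is an approximation of the translation described in the PEP, not the exact generated code; the helper name is made up):

def _listcomp():
    result = []
    for t in t.__parameters__:   # UnboundLocalError: 't' is now a single
        if t not in tvars:       # local name, unbound when the iterable
            result.append(t)     # expression is evaluated
    return result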
Note that the second of the open questions would actually return this to current behaviour, by importing the name 't' into the local scope.
The biggest semantic change is to the way names are looked up at class scope. Currently, the behaviour is somewhat bizarre unless you think in terms of unrolling a loop as a function; there is no way to reference names from the current scope, and you will instead ignore the surrounding class and "reach out" into the next scope outwards (probably global scope).
Out of all the code in the stdlib, the only one that needed changing was in Lib/typing.py, where the above comprehension was found. (Not counting a couple of unit tests whose specific job is to verify this behaviour.) The potential for breakage is extremely low. Non-zero, but far lower than the cost of introducing a new keyword, for instance, which is done without deprecation cycles.
Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.
If it is really true that introducing 'name := expression' requires such a side-effect, then I might oppose it.
It's that comprehensions/genexps are currently bizarre, only people don't usually notice it because there aren't many ways to recognize the situation. Introducing assignment expressions will make the existing weirdnesses more visible.
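A small, self-contained example of the existing class-scope weirdness, using toy names (Python 3.7 behaviour):

class X:
    values = [1, 2, 3]
    offset = 10
    doubled = [v * 2 for v in values]        # works: the outermost iterable is
                                             # evaluated at class scope
    shifted = [v + offset for v in values]   # NameError: 'offset' is looked up
                                             # inside the implicit function,
                                             # which skips the class scope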
In any context where arbitrary Python expressions can be used, a named expression can appear. This is of the form target := expr where expr is any valid Python expression, and target is any valid assignment target.
This generalization is different from what you said in the abstract and rationale. No rationale is given. After reading Nick's examination of the generalization, and your response, -1.
Without trying it or looking up any reference documentation, can you tell me whether these statements are legal?
with open(f) as self.file: pass
try: pass
except Exception as self.exc: pass
The rationale for assignment to arbitrary targets is the same as for assigning to names: it's useful to be able to assign as an expression.
Most importantly, since := is an expression, it can be used in contexts where statements are illegal, including lambda functions and comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
x = y = z = 0
This is a bad example as there is very seldom a reason to assign multiple names, as opposed to multiple targets. Here is a typical real example.
self.x = x = expression
# Use local x in the rest of the method.
In "x = y = 0", x and y likely represent two different concepts (variables) that happen to be initialized with the same value. One could instead write "x,y = 0,0".
Personally, if I need to quickly set a bunch of things to zero or None, I'll use chained assignment. But sure. If you want to, you can repeat the zero. Don't forget that adding or removing a target then also requires that you update the tuple, and that it's not a syntax error to fail to do so.
The equivalent assignment expression
should be a syntax error.
is parsed as separate binary operators,
':=' is not a binary operator, any more than '=' is, as names, and targets in general, are not objects. Neither fetches and operates on the current value, if any, of the name or target. Therefore neither has an 'associativity'.
What would you call it then? I need some sort of word to use.
When a class scope is involved, a naive transformation into a function would prevent name lookups (as the function would behave like a method)::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics,
I believe in all of 3.x ..
Probably, but that isn't my point.
this will evaluate the outermost iterable at class scope, which will succeed; but it will evaluate everything else in a function::
class X:
    names = ["Fred", "Barney", "Joe"]
    prefix = "> "
    def <listcomp>(iterator):
        result = []
        for name in iterator:
            result.append(prefix + name)
        return result
    prefixed_names = <listcomp>(iter(names))
The name prefix is thus searched for at global scope, ignoring the class name.
And today it fails. This has nothing to do with adding name assignment expressions.
Fails in what way?
Bottom line: I suggest rewriting again, as indicated, changing title to 'Name Assignment Expressions'.
You're welcome to write a competing proposal :)
I'm much happier promoting a full-featured assignment expression than something that can only be used in a limited set of situations. Is there reason to believe that extensions to the := operator might take it in a different direction? If not, there's very little to lose by permitting any assignment target, and then letting style guides frown on it if they like.
ChrisA
On Wed, Apr 18, 2018 at 10:13:58AM +1000, Chris Angelico wrote:
[regarding comprehensions]
The changes here are only to edge and corner cases, other than as they specifically relate to assignment expressions. The current behaviour is intended to "do the right thing" according to people's expectations, and it largely does so; those cases are not changing. For list comprehensions at global or function scope, the ONLY case that can change (to my knowledge) is where you reuse a variable name:
[t for t in t.__parameters__ if t not in tvars]
This works in 3.7 but will fail easily and noisily (UnboundLocalError) with PEP 572.
That's a major semantic change, and the code you show is no better or worse than:
t = ...
result = []
for t in t.parameters:
    if t not in tvars:
        result.append(t)
which is allowed. I think you need a better justification for breaking it than merely the opinion:
IMO this is a poor way to write a loop,
Reusing names is permitted in Python. If you're going to break code, that surely needs a future import or deprecation period. As you say yourself, you've already found one example in the standard library that will break.
and the fact that it "happened to work" is on par with code that depended on dict iteration order in Python 3.2 and earlier.
I don't think that's justified. As far as I can tell, the fact that it works is not a mere accident of implementation but a consequence of the promised semantics of comprehensions and Python's scoping rules.
If that's not the case, I think you need to justify exactly why it isn't guaranteed.
Yes, the implementation is well defined, but since you can achieve exactly the same thing by picking a different variable name, it's better to be clear.
Ah, but the aim of the PEP is not to prohibit ugly or unclear code.
Note that the second of the open questions would actually return this to current behaviour, by importing the name 't' into the local scope.
Indeed. Maybe this needs to stop being an open question and become a settled question.
The biggest semantic change is to the way names are looked up at class scope. Currently, the behaviour is somewhat bizarre unless you think in terms of unrolling a loop as a function; there is no way to reference names from the current scope, and you will instead ignore the surrounding class and "reach out" into the next scope outwards (probably global scope).
Out of all the code in the stdlib, the only one that needed changing was in Lib/typing.py, where the above comprehension was found. (Not counting a couple of unit tests whose specific job is to verify this behaviour.)
If there are tests which intentionally verify this behaviour, that really hurts your position that the behaviour is an accident of implementation. It sounds like the behaviour is intended and required.
The potential for breakage is extremely low. Non-zero, but far lower than the cost of introducing a new keyword, for instance, which is done without deprecation cycles.
Which new keywords are you thinking of? The most recent new keywords I can think of were "True/False", "as" and "with".
True, False became keywords in 3.x during the "breaking code is allowed" 2 -> 3 transition;
"as" became a keyword in 2.6 following a deprecation period in 2.5:
py> as = 1
<stdin>:1: Warning: 'as' will become a reserved keyword in Python 2.6
Have I missed any?
-- Steve
On Wed, Apr 18, 2018 at 11:20 AM, Steven D'Aprano steve@pearwood.info wrote:
On Wed, Apr 18, 2018 at 10:13:58AM +1000, Chris Angelico wrote:
[regarding comprehensions]
The changes here are only to edge and corner cases, other than as they specifically relate to assignment expressions. The current behaviour is intended to "do the right thing" according to people's expectations, and it largely does so; those cases are not changing. For list comprehensions at global or function scope, the ONLY case that can change (to my knowledge) is where you reuse a variable name:
[t for t in t.__parameters__ if t not in tvars]
This works in 3.7 but will fail easily and noisily (UnboundLocalError) with PEP 572.
That's a major semantic change, and the code you show is no better or worse than:
t = ...
result = []
for t in t.parameters:
    if t not in tvars:
        result.append(t)
which is allowed. I think you need a better justification for breaking it than merely the opinion:
IMO this is a poor way to write a loop,
Ah but that isn't what the list comp is equivalent to. If you want to claim that "for t in t.parameters" is legal, you first have to assume that you're overwriting t, not shadowing it. In the list comp as it is today, the "for t in" part is inside an implicit nested function, but the "t.parameters" part is outside that function.
Try this instead:
t = ...
def listcomp():
    result = []
    for t in t.parameters:
        if t not in tvars:
            result.append(t)
    return result
listcomp()
Except that it isn't that either, because the scope isn't quite that clean. It actually involves a function parameter, and the iterator is fetched before it's passed as a parameter, and then NOT fetched inside the loop. So you actually can't write perfectly equivalent longhand.
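Roughly speaking, and only roughly, per the caveat just given, what 3.7 actually does is closer to this (illustrative names):

def _listcomp(iterator):
    result = []
    for t in iterator:           # this 't' is local to the helper
        if t not in tvars:
            result.append(t)
    return result

result = _listcomp(iter(t.__parameters__))   # the outer 't' is read out here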
PEP 572 will reduce the edge cases and complexity.
Note that the second of the open questions would actually return this to current behaviour, by importing the name 't' into the local scope.
Indeed. Maybe this needs to stop being an open question and become a settled question.
Okay. The question is open if you wish to answer it. Are you happy with the extra complexity that this would entail? Discuss.
The biggest semantic change is to the way names are looked up at class scope. Currently, the behaviour is somewhat bizarre unless you think in terms of unrolling a loop as a function; there is no way to reference names from the current scope, and you will instead ignore the surrounding class and "reach out" into the next scope outwards (probably global scope).
Out of all the code in the stdlib, the only one that needed changing was in Lib/typing.py, where the above comprehension was found. (Not counting a couple of unit tests whose specific job is to verify this behaviour.)
If there are tests which intentionally verify this behaviour, that really hurts your position that the behaviour is an accident of implementation. It sounds like the behaviour is intended and required.
These changes also broke some tests of disassembly, which quote the exact bytecode created for specific pieces of code. Does that mean that we can't change anything? They're specifically verifying (asserting) the behaviour that currently exists.
I've never tried to claim that this is going to have no impact. Of course it will change things. Otherwise why have a PEP?
The potential for breakage is extremely low. Non-zero, but far lower than the cost of introducing a new keyword, for instance, which is done without deprecation cycles.
Which new keywords are you thinking of? The most recent new keywords I can think of were "True/False", "as" and "with".
async, await? They became "soft keywords" and then full keywords, but that's not exactly a deprecation period.
rosuav@sikorsky:~$ python3.5 -c "await = 1; print(await)"
1
rosuav@sikorsky:~$ python3.6 -c "await = 1; print(await)"
1
rosuav@sikorsky:~$ python3.7 -c "await = 1; print(await)"
  File "<string>", line 1
    await = 1; print(await)
          ^
SyntaxError: invalid syntax
The shift from 3.6 to 3.7 breaks any code that uses 'async' or 'await' as an identifier. And that kind of breakage CAN happen in a minor release. Otherwise, it would be virtually impossible to improve anything in the language.
PEP 572 will make changes. The result of these changes will be fewer complex or unintuitive interactions between different features in Python, most notably comprehensions/genexps and class scope. It also makes the transformation from list comp to external function more accurate and easier to understand. For normal usage, the net result will be the same, but the differences are visible if you actually probe for them.
ChrisA
On Tue, Apr 17, 2018 at 7:04 PM, Chris Angelico rosuav@gmail.com wrote:
On Wed, Apr 18, 2018 at 11:20 AM, Steven D'Aprano steve@pearwood.info wrote:
On Wed, Apr 18, 2018 at 10:13:58AM +1000, Chris Angelico wrote:
[regarding comprehensions]
The changes here are only to edge and corner cases, other than as they specifically relate to assignment expressions. The current behaviour is intended to "do the right thing" according to people's expectations, and it largely does so; those cases are not changing. For list comprehensions at global or function scope, the ONLY case that can change (to my knowledge) is where you reuse a variable name:
[t for t in t.__parameters__ if t not in tvars]
This works in 3.7 but will fail easily and noisily (UnboundLocalError) with PEP 572.
That's a major semantic change, and the code you show is no better or worse than:
t = ...
result = []
for t in t.parameters:
    if t not in tvars:
        result.append(t)
which is allowed. I think you need a better justification for breaking it than merely the opinion:
IMO this is a poor way to write a loop,
Ah but that isn't what the list comp is equivalent to. If you want to claim that "for t in t.parameters" is legal, you first have to assume that you're overwriting t, not shadowing it. In the list comp as it is today, the "for t in" part is inside an implicit nested function, but the "t.parameters" part is outside that function.
Try this instead:
t = ...
def listcomp():
    result = []
    for t in t.parameters:
        if t not in tvars:
            result.append(t)
    return result
listcomp()
Except that it isn't that either, because the scope isn't quite that clean. It actually involves a function parameter, and the iterator is fetched before it's passed as a parameter, and then NOT fetched inside the loop. So you actually can't write perfectly equivalent longhand.
PEP 572 will reduce the edge cases and complexity.
I can't tell from this what the PEP actually says should happen in that example. When I first saw it I thought "Gaah! What a horrible piece of code." But it works today, and people's code will break if we change its meaning.
However we won't have to break that. Suppose the code is (perversely)
t = range(3)
a = [t for t in t if t]
If we translate this to
t = range(3)
def listcomp(t=t):
    a = []
    for t in t:
        if t:
            a.append(t)
    return a
a = listcomp()
Then it will still work. The trick will be to recognize "imported" names that are also assigned and capture those (as well as other captures as already described in the PEP).
-- --Guido van Rossum (python.org/~guido)
On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum guido@python.org wrote:
I can't tell from this what the PEP actually says should happen in that example. When I first saw it I thought "Gaah! What a horrible piece of code." But it works today, and people's code will break if we change its meaning.
However we won't have to break that. Suppose the code is (perversely)
t = range(3)
a = [t for t in t if t]
If we translate this to
t = range(3)
def listcomp(t=t):
    a = []
    for t in t:
        if t:
            a.append(t)
    return a
a = listcomp()
Then it will still work. The trick will be to recognize "imported" names that are also assigned and capture those (as well as other captures as already described in the PEP).
That can be done. However, this form of importing will have one of two consequences:
1) Referencing an unbound name will scan to outer scopes at run time, changing the semantics of Python name lookups 2) Genexps will eagerly evaluate a lookup if it happens to be the same name as an internal iteration variable.
Of the two, #2 is definitely my preference, but it does mean more eager binding. While this won't make a difference in the outermost iterable (since that's already eagerly bound), it might make a difference with others:
t = range(3)
gen = (t for _ in range(1) for t in t if t)
t = [4, 5, 6]
print(next(gen))
print(next(gen))
Current semantics: UnboundLocalError on first next() call.
PEP 572 semantics: Either UnboundLocalError (with current reference implementation) or it yields 1 and 2 (with eager lookups).
So either we change things for the outermost iterable, or we change things for everything BUT the outermost iterable. Either way, I'm happy to eliminate the special-casing of the outermost iterable. Yes, it's a change in semantics, but a change that removes special cases is generally better than one that creates them.
ChrisA
On Wed, Apr 18, 2018 at 7:35 AM, Chris Angelico rosuav@gmail.com wrote:
On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum guido@python.org wrote:
I can't tell from this what the PEP actually says should happen in that example. When I first saw it I thought "Gaah! What a horrible piece of code." But it works today, and people's code will break if we change its meaning.
However we won't have to break that. Suppose the code is (perversely)
t = range(3)
a = [t for t in t if t]
If we translate this to
t = range(3)
def listcomp(t=t):
    a = []
    for t in t:
        if t:
            a.append(t)
    return a
a = listcomp()
Then it will still work. The trick will be to recognize "imported" names that are also assigned and capture those (as well as other captures as already described in the PEP).
That can be done. However, this form of importing will have one of two consequences:
1) Referencing an unbound name will scan to outer scopes at run time, changing the semantics of Python name lookups
I'm not even sure what this would do.
2) Genexps will eagerly evaluate a lookup if it happens to be the same name as an internal iteration variable.
I think we would have to specify this more precisely.
Let's say by "eagerly evaluate a lookup" you mean "include it in the
function parameters with a default value being the lookup (i.e. starting in
the outer scope), IOW "t=t" as I showed above. The question is when we
would do this. IIUC the PEP already does this if the "outer scope" is a
class scope for any names that a simple static analysis shows are
references to variables in the class scope. (I don't know exactly what this
static analysis should do but it could be as simple as gathering all names
that are assigned to in the class, or alternatively all names assigned to
before the point where the comprehension occurs. We shouldn't be distracted
by dynamic definitions like exec()
although we should perhaps be aware of
del
.)
My proposal is to extend this static analysis for certain loop control variables (any simple name assigned to in a for-clause in the comprehension), regardless of what kind of scope the outer scope is. If the outer scope is a function we already know how to do this. If it's a class we use the analysis referred to above. If the outer scope is the global scope we have to do something new. I propose to use the same simple static analysis we use for class scopes.
Furthermore I propose to only do this for the loop control variable(s) of the outermost for-clause, since that's the only place where without all this rigmarole we would have a clear difference in behavior with Python 3.7 in cases like [t for t in t]. Oh, and probably we only need to do this if that loop control variable is also used as an expression in the iterable (so we don't waste time doing any of this for e.g. [t for t in q]).
(But what about [t for _ in t for t in t]? That's currently an UnboundLocalError and we shouldn't try to "fix" that case.)
Since we now have once again introduced an exception for the outermost loop control variable and the outermost iterable, we can consider doing this only as a temporary measure. We could have a goal to eventually make [t for t in t] fail, and in the meantime we would deprecate it -- e.g. in 3.8 a silent deprecation, in 3.9 a noisy one, in 3.10 break it. Yes, that's a lot of new static analysis for deprecating an edge case, but it seems reasonable to want to preserve backward compatibility when breaking this edge case since it's likely not all that uncommon. Even if most occurrences are bad style written by lazy programmers, we should not break working code, if it is reasonable to expect that it's relied upon in real code.
Of the two, #2 is definitely my preference, but it does mean more eager binding. While this won't make a difference in the outermost iterable (since that's already eagerly bound), it might make a difference with others:
t = range(3)
gen = (t for _ in range(1) for t in t if t)
t = [4, 5, 6]
print(next(gen))
print(next(gen))
I don't like this particular example, because it uses an obscure bit of semantics of generator expressions. It's fine to demonstrate the finer details of how those work, but it's unlikely to see real code relying on this. (As I argued before, generator expressions are typically either fed into other code that eagerly evaluates them before reaching the next line, or returned from a function, and in the latter case intentional modification of some variable in that function's scope to affect the meaning of the generator expression would seem a remote possibility at best, and an accident waiting to happen at worst.)
Current semantics: UnboundLocalError on first next() call.
PEP 572 semantics: Either UnboundLocalError (with current reference implementation) or it yields 1 and 2 (with eager lookups).
So either we change things for the outermost iterable, or we change things for everything BUT the outermost iterable. Either way, I'm happy to eliminate the special-casing of the outermost iterable. Yes, it's a change in semantics, but a change that removes special cases is generally better than one that creates them.
Hopefully my proposal above satisfies you.
-- --Guido van Rossum (python.org/~guido)
On Thu, Apr 19, 2018 at 2:18 AM, Guido van Rossum guido@python.org wrote:
On Wed, Apr 18, 2018 at 7:35 AM, Chris Angelico rosuav@gmail.com wrote: >
On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum guido@python.org wrote:
I can't tell from this what the PEP actually says should happen in that example. When I first saw it I thought "Gaah! What a horrible piece of code." But it works today, and people's code will break if we change its meaning.
However we won't have to break that. Suppose the code is (perversely)
t = range(3)
a = [t for t in t if t]
If we translate this to
t = range(3)
def listcomp(t=t):
    a = []
    for t in t:
        if t:
            a.append(t)
    return a
a = listcomp()
Then it will still work. The trick will be to recognize "imported" names that are also assigned and capture those (as well as other captures as already described in the PEP).
That can be done. However, this form of importing will have one of two consequences:
1) Referencing an unbound name will scan to outer scopes at run time, changing the semantics of Python name lookups
I'm not even sure what this would do.
The implicit function of the listcomp would attempt to LOAD_FAST 't', and upon finding that it doesn't have a value for it, would go and look for the name 't' in a surrounding scope. (Probably LOAD_CLOSURE.)
2) Genexps will eagerly evaluate a lookup if it happens to be the same name as an internal iteration variable.
I think we would have to specify this more precisely.
Let's say by "eagerly evaluate a lookup" you mean "include it in the function parameters with a default value being the lookup (i.e. starting in the outer scope), IOW "t=t" as I showed above.
Yes. To be technically precise, there's no default argument involved, and the call to the implicit function explicitly passes all the arguments.
The question is when we would do this. IIUC the PEP already does this if the "outer scope" is a class scope for any names that a simple static analysis shows are references to variables in the class scope.
Correct.
(I don't know exactly what this static analysis should do but it could be as simple as gathering all names that are assigned to in the class, or alternatively all names assigned to before the point where the comprehension occurs. We shouldn't be distracted by dynamic definitions like exec() although we should perhaps be aware of del.)
At the moment, it isn't aware of 'del'. The analysis is simple and 100% static: If a name is in the table of names the class uses AND it's in the table of names the comprehension uses, it gets passed as a parameter. I don't want to try to be aware of del, because of this:
class X:
    x = 1
    if y:
        del x
    print(x)
    z = (q for q in x if q)
If y is true, this will eagerly look up x using the same semantics in both the print and the genexp (on construction, not when you iterate over the genexp). If y is false, it'll still eagerly look up x, and it'll still use the same semantics for print and the genexp (and it'll find an 'x' in a surrounding scope).
(The current implementation actually is a bit different from that. I'm not sure whether it's possible to do it as simply as given without an extra compilation pass. But it's close enough.)
My proposal is to extend this static analysis for certain loop control variables (any simple name assigned to in a for-clause in the comprehension), regardless of what kind of scope the outer scope is. If the outer scope is a function we already know how to do this. If it's a class we use the analysis referred to above. If the outer scope is the global scope we have to do something new. I propose to use the same simple static analysis we use for class scopes.
Furthermore I propose to only do this for the loop control variable(s) of the outermost for-clause, since that's the only place where without all this rigmarole we would have a clear difference in behavior with Python 3.7 in cases like [t for t in t]. Oh, and probably we only need to do this if that loop control variable is also used as an expression in the iterable (so we don't waste time doing any of this for e.g. [t for t in q]).
Okay. Here's something that would be doable:
If the name is written to within the comprehension, AND it is read from in the outermost iterable, it is flagged early-bind.
I'll have to try implementing that to be sure, but it should be possible I think. It would cover a lot of cases, keeping them the same as we currently have.
Since we now have once again introduced an exception for the outermost loop control variable and the outermost iterable, we can consider doing this only as a temporary measure. We could have a goal to eventually make [t for t in t] fail, and in the meantime we would deprecate it -- e.g. in 3.8 a silent deprecation, in 3.9 a noisy one, in 3.10 break it. Yes, that's a lot of new static analysis for deprecating an edge case, but it seems reasonable to want to preserve backward compatibility when breaking this edge case since it's likely not all that uncommon. Even if most occurrences are bad style written by lazy programmers, we should not break working code, if it is reasonable to expect that it's relied upon in real code.
Fair enough. So the outermost iterable remains special for a short while, with deprecation.
I'll get onto the coding side of it during my Copious Free Time, hopefully this week some time.
Here's hoping!
ChrisA
On Apr 18, 2018, at 11:17, Chris Angelico rosuav@gmail.com wrote:
At the moment, it isn't aware of 'del’.
I don’t know if it’s relevant to the current discussion, but don’t forget about implicit dels:
def foo():
    x = 1
    try:
        1/0
    except ZeroDivisionError as x:
        pass
    print(x)
This is one of my favorite Python oddities because it always makes me look like a genius when I diagnose it. :)
-Barry
On Thu, Apr 19, 2018 at 6:26 AM, Barry Warsaw barry@python.org wrote:
On Apr 18, 2018, at 11:17, Chris Angelico rosuav@gmail.com wrote:
At the moment, it isn't aware of 'del’.
I don’t know if it’s relevant to the current discussion, but don’t forget about implicit dels:
def foo():
    x = 1
    try:
        1/0
    except ZeroDivisionError as x:
        pass
    print(x)
This is one of my favorite Python oddities because it always makes me look like a genius when I diagnose it. :)
Heh, yeah. My intention is to ignore that altogether. The general policy in Python is "if ever it MIGHT be assigned to, it belongs to that scope" (so even "if 0: x = 1" will mark x as local), so sticking to that would mean treating x as local regardless of the try/except.
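A quick illustration of that existing rule, independent of this PEP (editor's example):

x = "outer"

def f():
    if 0:
        x = 1        # never runs, but still marks x as local to f
    print(x)         # raises UnboundLocalError instead of printing "outer"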
(On an unrelated subject, I'm keeping the "sublocal scope" concept from the original PEP on ice. It might be worth implementing exception name binding with a sublocal name. But that's for a completely separate PEP.)
ChrisA
On Wed, Apr 18, 2018 at 11:17 AM, Chris Angelico rosuav@gmail.com wrote:
On Thu, Apr 19, 2018 at 2:18 AM, Guido van Rossum guido@python.org wrote:
On Wed, Apr 18, 2018 at 7:35 AM, Chris Angelico rosuav@gmail.com wrote: >
On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum guido@python.org wrote:
I can't tell from this what the PEP actually says should happen in that example. When I first saw it I thought "Gaah! What a horrible piece of code." But it works today, and people's code will break if we change its meaning.
However we won't have to break that. Suppose the code is (perversely)
t = range(3)
a = [t for t in t if t]
If we translate this to
t = range(3)

def listcomp(t=t):
    a = []
    for t in t:
        if t:
            a.append(t)
    return a

a = listcomp()
Then it will still work. The trick will be to recognize "imported" names that are also assigned and capture those (as well as other captures as already described in the PEP).
That can be done. However, this form of importing will have one of two consequences:
1) Referencing an unbound name will scan to outer scopes at run time, changing the semantics of Python name lookups
I'm not even sure what this would do.
The implicit function of the listcomp would attempt to LOAD_FAST 't', and upon finding that it doesn't have a value for it, would go and look for the name 't' in a surrounding scope. (Probably LOAD_CLOSURE.)
We agree that that's too dynamic to be explainable.
2) Genexps will eagerly evaluate a lookup if it happens to be the same name as an internal iteration variable.
I think we would have to specify this more precisely.
Let's say by "eagerly evaluate a lookup" you mean "include it in the function parameters with a default value being the lookup (i.e. starting in the outer scope), IOW "t=t" as I showed above.
Yes. To be technically precise, there's no default argument involved, and the call to the implicit function explicitly passes all the arguments.
OK, and the idea is the same -- it's explicitly evaluated in the outer scope either way.
The question is when we would do this. IIUC the PEP already does this if the "outer scope" is a class scope for any names that a simple static analysis shows are references to variables in the class scope.
Correct.
(I don't know exactly what this static analysis should do but it could be as simple as gathering all names that are assigned to in the class, or alternatively all names assigned to before the point where the comprehension occurs. We shouldn't be distracted by dynamic definitions like exec(), although we should perhaps be aware of del.)
At the moment, it isn't aware of 'del'. The analysis is simple and 100% static: If a name is in the table of names the class uses AND it's in the table of names the comprehension uses, it gets passed as a parameter. I don't want to try to be aware of del, because of this:
class X:
    x = 1
    if y:
        del x
    print(x)
    z = (q for q in x if q)
If y is true, this will eagerly look up x using the same semantics in both the print and the genexp (on construction, not when you iterate over the genexp). If y is false, it'll still eagerly look up x, and it'll still use the same semantics for print and the genexp (and it'll find an 'x' in a surrounding scope).
(The current implementation actually is a bit different from that. I'm not sure whether it's possible to do it as simply as given without an extra compilation pass. But it's close enough.)
Yeah, I threw 'del' in there mostly so we wouldn't get too confident. I see a fair amount of this:
d = {}
for x, y in blah():
    d[x] = y
del x, y
My proposal is to extend this static analysis for certain loop control variables (any simple name assigned to in a for-clause in the comprehension), regardless of what kind of scope the outer scope is. If the outer scope is a function we already know how to do this. If it's a class we use the analysis referred to above. If the outer scope is the global scope we have to do something new. I propose to use the same simple static analysis we use for class scopes.
Furthermore I propose to only do this for the loop control variable(s) of the outermost for-clause, since that's the only place where without all this rigmarole we would have a clear difference in behavior with Python 3.7 in cases like [t for t in t]. Oh, and probably we only need to do this if that loop control variable is also used as an expression in the iterable (so we don't waste time doing any of this for e.g. [t for t in q]).
Okay. Here's something that would be doable:
If the name is written to within the comprehension, AND it is read from in the outermost iterable, it is flagged early-bind.
OK, that's close enough to what I am looking for that I don't think it matters.
I'll have to try implementing that to be sure, but it should be possible I think. It would cover a lot of cases, keeping them the same as we currently have.
Since we now have once again introduced an exception for the outermost loop control variable and the outermost iterable, we can consider doing this only as a temporary measure. We could have a goal to eventually make [t for t in t] fail, and in the meantime we would deprecate it -- e.g. in 3.8 a silent deprecation, in 3.9 a noisy one, in 3.10 break it. Yes, that's a lot of new static analysis for deprecating an edge case, but it seems reasonable to want to preserve backward compatibility when breaking this edge case since it's likely not all that uncommon. Even if most occurrences are bad style written by lazy programmers, we should not break working code, if it is reasonable to expect that it's relied upon in real code.
Fair enough. So the outermost iterable remains special for a short while, with deprecation.
I'll get onto the coding side of it during my Copious Free Time, hopefully this week some time.
Here's hoping!
Don't get your hopes up too high. A lot of respectable core devs have expressed a -1.
-- --Guido van Rossum (python.org/~guido)
On 19 April 2018 at 02:18, Guido van Rossum guido@python.org wrote:
On Wed, Apr 18, 2018 at 7:35 AM, Chris Angelico rosuav@gmail.com wrote: >
On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum guido@python.org wrote:
2) Genexps will eagerly evaluate a lookup if it happens to be the same name as an internal iteration variable.
I think we would have to specify this more precisely.
Let's say by "eagerly evaluate a lookup" you mean "include it in the
function parameters with a default value being the lookup (i.e. starting in
the outer scope), IOW "t=t" as I showed above. The question is when we
would do this. IIUC the PEP already does this if the "outer scope" is a
class scope for any names that a simple static analysis shows are references
to variables in the class scope. (I don't know exactly what this static
analysis should do but it could be as simple as gathering all names that are
assigned to in the class, or alternatively all names assigned to before the
point where the comprehension occurs. We shouldn't be distracted by dynamic
definitions like exec()
although we should perhaps be aware of
del
.)
My proposal is to extend this static analysis for certain loop control variables (any simple name assigned to in a for-clause in the comprehension), regardless of what kind of scope the outer scope is. If the outer scope is a function we already know how to do this. If it's a class we use the analysis referred to above. If the outer scope is the global scope we have to do something new. I propose to use the same simple static analysis we use for class scopes.
Furthermore I propose to only do this for the loop control variable(s) of the outermost for-clause, since that's the only place where without all this rigmarole we would have a clear difference in behavior with Python 3.7 in cases like [t for t in t]. Oh, and probably we only need to do this if that loop control variable is also used as an expression in the iterable (so we don't waste time doing any of this for e.g. [t for t in q]).
I'm not sure the symtable pass is currently clever enough to make these kinds of distinctions - it's pretty thoroughly block scope oriented. (Although I guess it does know enough now to treat the outermost comprehension as being part of the surrounding scope in terms of where names are referenced, so it might be feasible to adapt that logic to enable the eager binding you're describing).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Tue, Apr 17, 2018 at 6:20 PM, Steven D'Aprano steve@pearwood.info wrote:
If there are tests which intentionally verify this behaviour, that really hurts your position that the behaviour is an accident of implementation. It sounds like the behaviour is intended and required.
It is nonetheless bizarre and unexpected behavior.
>>> prefix = 'global'
>>> [prefix+c for c in 'abc']
['globala', 'globalb', 'globalc']

>>> def func():
...     prefix = 'local'
...     print([prefix+c for c in 'abc'])
>>> func()
['locala', 'localb', 'localc']

>>> class klass:
...     prefix = 'classy'
...     items = [prefix+c for c in 'abc']
>>> print(klass.items)
['globala', 'globalb', 'globalc']
In Python 2, that last one would produce 'classya' and friends, due to the "broken" comprehension scope.
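A rough sketch of why Python 3 gives the global result in that last case (an editorial approximation, not the exact translation the compiler uses): the comprehension body runs inside an implicit function, and function scopes skip the enclosing class scope when resolving free names, so prefix is found in the module scope.

prefix = 'global'

class klass:
    prefix = 'classy'

    def _implicit(it):                    # stand-in for the comprehension's implicit function
        result = []
        for c in it:
            result.append(prefix + c)     # free variable: the class scope is skipped
        return result

    items = _implicit('abc')              # ['globala', 'globalb', 'globalc']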
I'd like to break a lance for PEP 572.
I read that in the bad old days Python used to have a "=" operator in expressions that had the meaning of today's "==". Perhaps there were other reasons, but this choice alone meant that assignment (which used the same token) could not be made an expression.
When the equality comparison operator was changed to "==", nothing changed for assignments which remained statements.
Let's imagine, just for a moment, that Python is redesigned from scratch. Would it make sense to make assignment an expression? (For clarity an operator that is different from "=" would be used, say ":=".)
I would say that the answer to the above question depends on whether assignment expressions can (a) fully replace assignment statements (so that both don't need to coexist) and (b) are useful and not too prone to abuse. I will try to answer both questions separately.
It seems to me that assignment expressions can be defined in a way that is fully analogous with the assignment statements of today's Python (and that consequently is equivalent in confusion potential):
The "value" of any assignment statement from today's Python can be defined as the value that will be captured when "__value__ =" is prepended to that statement.
So what should be the value of 'result' after the following snippet?
a := list()
b := list()
result := (a[:] := b[:] := iter(range(3)))
It seems to me that it should be the same as with today's
result = a[:] = b[:] = iter(range(3))
Accepting this convention would mean that parens in assignment expressions would behave differently from C. For example
result := (a[:] := (b[:] := iter(range(3))))
would produce a different 'result'. But I think that's OK, just like it's OK that the Python expression
-1 < 0 < 1
is not equivalent to
(-1 < 0) < 1
The question remains whether assignment expressions are actually useful and not too prone to abuse? I think that the situation is similar to the "or" and "and" operators when used for their values. The snippet
value = dic.get(key) or default
could be replaced by an if clause, so there's no "one obvious way to do it" here. Still, many will agree that the above one-liner is nicer and more expressive. It wouldn't be possible if there were no boolean expressions and the use of "or" was limited to within if and while statements.
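For comparison, a minimal sketch of the if-based spelling that the one-liner above replaces:

value = dic.get(key)
if not value:            # any falsy result (not just a missing key) falls back, same as "or"
    value = default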
In a similar way assignment expressions have many uses that are not absolutely needed but make code more expressive. The danger of abuse is also similar to "and" and "or" where it's also possible to write expressions that are difficult to understand.
Here's an example from real life where the existence of assignment expressions would crucially simplify an API. I'm working on a library that has "learners" that are driven by "runners". The simplest possible synchronous runner looks like this:
def sync_runner(learner, f, static_hint):
    while True:
        points = learner.get(static_hint)
        if not points:
            break
        learner.feed(f(points))
(Note that more advanced runners supply a "hint" that dynamically depends on currently available resources for example.)
With assignment expressions the body of the above function could be simplified to
while points := learner.get(static_hint):
    learner.feed(f(points))
making it crucially simpler. Using the learner API becomes so natural that a 'sync_runner()' function seems not even necessary. This API could be even adopted by other learner-providing libraries that want to support being driven by advanced asynchronous runners but would like to remain easily usable directly.
Surely there are other uses of similar idioms.
Perhaps you agree with me that assignment should be an expression if Python was to be redesigned from scratch, but that is not going to happen. But couldn't Python simply introduce ":=" as described above and keep "=" for backwards compatibility?
New users of Python could be taught to use ":=" everywhere. Old users could either convert their code base, or ignore the addition. There's no problem in mixing both old and new style code.
Like with other expressive features, there's potential for confusion, but I think that it's limited. Wouldn't it be a pity not to liberate assignments from their boring statement existence?
Cheers, Christoph
Christoph Groth writes:
Wouldn't it be a pity not to liberate assignments from their boring statement existence?
Maybe not. While it would be nice to solve the loop-and-a-half "problem" and the loop variable initialization "problem" (not everyone agrees these are even problems, especially now that we have comprehensions and generator expressions), as a matter of taste I like the fact that this particular class of side effects is given weighty statement syntax rather than more lightweight expression syntax.
That is, I find statement syntax more readable.
Steve
-- Associate Professor Division of Policy and Planning Science http://turnbull/sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnbull@sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN
On Fri, Apr 20, 2018 at 1:30 PM, Stephen J. Turnbull
turnbull.stephen.fw@u.tsukuba.ac.jp wrote:
Christoph Groth writes:
Wouldn't it be a pity not to liberate assignments from their boring statement existence?
Maybe not. While it would be nice to solve the loop-and-a-half "problem" and the loop variable initialization "problem" (not everyone agrees these are even problems, especially now that we have comprehensions and generator expressions), as a matter of taste I like the fact that this particular class of side effects is given weighty statement syntax rather than more lightweight expression syntax.
That is, I find statement syntax more readable.
If you've read the PEP, you'll see that it encourages the use of assignment statements wherever possible. If statement syntax is generally more readable, by all means, use it. That doesn't mean there aren't situations where the expression syntax is FAR more readable.
Tell me, is this "more readable" than a loop with an actual condition in it?
def sqrt(n):
    guess, nextguess = 1, n
    while True:
        if math.isclose(guess, nextguess):
            return guess
        guess = nextguess
        nextguess = n / guess
Readable doesn't mean "corresponds closely to its disassembly", despite the way many people throw the word around. It also doesn't mean "code I like", as opposed to "code I don't like". The words for those concepts are "strongly typed" and "dynamically typed", as have been demonstrated through MANY online discussions. (But I digress.) Readable code is code which expresses an algorithm, expresses the programmer's intent. It adequately demonstrates something at a higher abstraction level. Does the algorithm demonstrated here include an infinite loop? No? Then it shouldn't have "while True:" in it.
Now, this is a pretty obvious example. I deliberately wrote it so you could simply lift the condition straight into the while header. And I hope that everyone here agrees that this would be an improvement:
def sqrt(n):
    guess, nextguess = 1, n
    while not math.isclose(guess, nextguess):
        guess = nextguess
        nextguess = n / guess
    return guess
But what if the condition were more complicated?
def read_document(file):
    doc = ""
    while (token := file.get_next_token()) != "END":
        doc += token
    return doc
The loop condition is "while the token is not END", or "while get_next_token() doesn't return END", depending on your point of view. Is it "more readable" to put that condition into the while header, or to use an infinite loop and a break statement, or to duplicate a line of code before the loop and at the bottom of the loop? Which one best expresses the programmer's intention?
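For reference, sketches of the two alternatives just mentioned (editorial illustrations, not taken from the PEP):

# Infinite loop plus break:
def read_document_break(file):
    doc = ""
    while True:
        token = file.get_next_token()
        if token == "END":
            break
        doc += token
    return doc

# Duplicated call before the loop and at the bottom:
def read_document_dup(file):
    doc = ""
    token = file.get_next_token()
    while token != "END":
        doc += token
        token = file.get_next_token()
    return doc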
ChrisA
On 20 April 2018 at 07:46, Chris Angelico rosuav@gmail.com wrote:
On Fri, Apr 20, 2018 at 1:30 PM, Stephen J. Turnbull
turnbull.stephen.fw@u.tsukuba.ac.jp wrote:
Christoph Groth writes:
Wouldn't it be a pity not to liberate assignments from their boring statement existence?
Maybe not. While it would be nice to solve the loop-and-a-half "problem" and the loop variable initialization "problem" (not everyone agrees these are even problems, especially now that we have comprehensions and generator expressions), as a matter of taste I like the fact that this particular class of side effects is given weighty statement syntax rather than more lightweight expression syntax.
That is, I find statement syntax more readable.
If you've read the PEP, you'll see that it encourages the use of assignment statements wherever possible. If statement syntax is generally more readable, by all means, use it. That doesn't mean there aren't situations where the expression syntax is FAR more readable.
Tell me, is this "more readable" than a loop with an actual condition in it?
def sqrt(n):
    guess, nextguess = 1, n
    while True:
        if math.isclose(guess, nextguess):
            return guess
        guess = nextguess
        nextguess = n / guess
Readable doesn't mean "corresponds closely to its disassembly", despite the way many people throw the word around. It also doesn't mean "code I like", as opposed to "code I don't like". The words for those concepts are "strongly typed" and "dynamically typed", as have been demonstrated through MANY online discussions. (But I digress.) Readable code is code which expresses an algorithm, expresses the programmer's intent. It adequately demonstrates something at a higher abstraction level. Does the algorithm demonstrated here include an infinite loop? No? Then it shouldn't have "while True:" in it.
Thanks Chris - this is a very good explanation of how we can (somewhat) objectively look at "readability", and not one I'd really considered before. It's also an extremely good argument (IMO) that the loop-and-a-half construct would benefit from improvement.
In my opinion, it's only partially related to the assignment expression discussion, though. Yes, assignment expressions "solve" the loop-and-a-half situation. I'm unsure how much I like the look of the resulting code, but I concede that's a "code I like" vs "code I don't like" situation. But assignment expressions are much more general than that, and as a general construct, they should be evaluated based on how many problems like this they solve, and whether the downsides justify it. We've already had the comprehension use case marginalised as no longer being a key use case for the proposal, because they weren't as "obviously" improved as some people had hoped. So overall, I think assignment expressions have proved to be a bit useful in some cases, and less so in others.
Clearly any proposal can be picked to death with enough time to look for flaws. And part of the Zen is "Now is better than never". But I think in this case, "Although never is often better than right now" applies - we've had some very productive discussions, and you've done an incredible job of managing them and capturing the results, but it feels to me that the overall result is that there's likely a better solution still out there, that needs a new intuition to solve.
Now, this is a pretty obvious example. I deliberately wrote it so you could simply lift the condition straight into the while header. And I hope that everyone here agrees that this would be an improvement:
def sqrt(n):
    guess, nextguess = 1, n
    while not math.isclose(guess, nextguess):
        guess = nextguess
        nextguess = n / guess
    return guess
But what if the condition were more complicated?
def read_document(file):
    doc = ""
    while (token := file.get_next_token()) != "END":
        doc += token
    return doc
The loop condition is "while the token is not END", or "while get_next_token() doesn't return END", depending on your point of view. Is it "more readable" to put that condition into the while header, or to use an infinite loop and a break statement, or to duplicate a line of code before the loop and at the bottom of the loop? Which one best expresses the programmer's intention?
The version that captures the value and tests it. I agree completely here. But we do have other options:
def read_document(file):
    doc = ""
    for token in token_stream(file, terminator="END"):
        doc += token
    return doc
(This point about rewriting to use for and an iterator applies to Chris Barker's fp.readline() example as well).
Sure, token_stream might need a loop-and-a-half internally[1]. But from the user's point of view that's "low level" code, so not so important (ultimately, this is all about abstracting intent). And people maybe aren't used to writing "helper" iterators quite this freely, but that's a matter of education. So agreed - assignment expressions help with loop-and-a-half constructs. But we have other ways of dealing with them, so that's a relatively manageable situation.
It's still all about cost-benefit trade-offs, with no clear winner (in my view).
Paul
[1] Although actually not - in this case, iter(file.get_next_token, 'END') is exactly what you need. But I concede that it's possible to demonstrate examples where that isn't the case.
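For completeness, a minimal sketch of that two-arg iter() spelling, assuming the same file.get_next_token() interface as in the earlier examples:

def read_document(file):
    # iter(callable, sentinel) calls the callable until it returns the sentinel
    return "".join(iter(file.get_next_token, "END"))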
On Fri, Apr 20, 2018 at 6:32 PM, Paul Moore p.f.moore@gmail.com wrote:
def read_document(file):
    doc = ""
    while (token := file.get_next_token()) != "END":
        doc += token
    return doc
The loop condition is "while the token is not END", or "while get_next_token() doesn't return END", depending on your point of view. Is it "more readable" to put that condition into the while header, or to use an infinite loop and a break statement, or to duplicate a line of code before the loop and at the bottom of the loop? Which one best expresses the programmer's intention?
The version that captures the value and tests it. I agree completely here. But we do have other options:
def read_document(file):
    doc = ""
    for token in token_stream(file, terminator="END"):
        doc += token
    return doc
(This point about rewriting to use for and an iterator applies to Chris Barker's fp.readline() example as well).
Sure, token_stream might need a loop-and-a-half internally[1]. But from the user's point of view that's "low level" code, so not so important (ultimately, this is all about abstracting intent). And people maybe aren't used to writing "helper" iterators quite this freely, but that's a matter of education. So agreed - assignment expressions help with loop-and-a-half constructs. But we have other ways of dealing with them, so that's a relatively manageable situation.
It's still all about cost-benefit trade-offs, with no clear winner (in my view).
You can always add another level of indirection. Always. Pushing something off into another function is helpful ONLY if you can usefully name that function, such that anyone who's reading the calling code can ignore the function and know everything they need to know about it. Otherwise, all you've done is force them to look elsewhere for the code. A bunch of single-use helper functions does not generally improve a module.
[1] Although actually not - in this case, iter(file.get_next_token, 'END') is exactly what you need. But I concede that it's possible to demonstrate examples where that isn't the case.
The easiest way is to pick any comparison other than equality. If you want "is" / "is not", or if there are several termination conditions (so the check is "in {x, y, z}"), iter() can't help you.
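A hedged sketch of the kind of case being described, where two-arg iter() doesn't apply; process() and the set of terminators here are invented purely for illustration:

while (token := file.get_next_token()) not in {"END", "STOP", None}:
    process(token)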
ChrisA
Stephen J. Turnbull wrote:
Christoph Groth writes:
Wouldn't it be a pity not to liberate assignments from their boring statement existence?
Maybe not. While it would be nice to solve the loop-and-a-half "problem" and the loop variable initialization "problem" (not everyone agrees these are even problems, especially now that we have comprehensions and generator expressions), as a matter of taste I like the fact that this particular class of side effects is given weighty statement syntax rather than more lightweight expression syntax.
I think that this is the crucial point. If it is indeed a design principle of Python that expressions should not have the side-effect of assigning names, then the whole discussion of PEP 572 could have been stopped early on. I didn't have this impression since core devs participated constructively in the discussion.
That is, I find statement syntax more readable.
Many people will agree that := is more readable when used in cases where it's meant to be used (as listed in the PEP). Your objection seems to refer to the potential for "clever" misuse, like
i := (a := list(iterator)).index(elem)
instead of
a := list(iterator)
i := a.index(elem)
I think that the ":=" syntax catches the eye and makes it easy to spot even hidden assignment expressions that shouldn't have been used.
Note that the proposed syntax can be actually more readable even when used as a statement, like in
equal := a == b
Personally, I even slightly prefer
a := 3
to the commonplace
a = 3
because it visually expresses the asymmetry of the operation. (And no, Turbo Pascal was not my first programming language. :-)
Christoph
On 20 April 2018 at 12:25, Christoph Groth christoph@grothesque.org wrote:
Maybe not. While it would be nice to solve the loop-and-a-half "problem" and the loop variable initialization "problem" (not everyone agrees these are even problems, especially now that we have comprehensions and generator expressions), as a matter of taste I like the fact that this particular class of side effects is given weighty statement syntax rather than more lightweight expression syntax.
I think that this is the crucial point. If it is indeed a design principle of Python that expressions should not have the side-effect of assigning names, then the whole discussion of PEP 572 could have been stopped early on. I didn't have this impression since core devs participated constructively in the discussion.
I don't think it's a "design principle" as such, but it is true that until this point, functions with side effects excluded, Python's expression syntax has not included any means of assigning names, and that's expected behaviour (in the sense that when looking for where a name could have been bound, Python programmers do not look closely at the detail of expressions, because they can't be the source of an assignment). This PEP explicitly changes that, and that's a fairly radical change from current expectations.
As to why core devs participated in the discussion, I can't speak for anyone else but for me:
Paul
On Fri, Apr 20, 2018 at 10:51 PM, Paul Moore p.f.moore@gmail.com wrote:
Depending on your definition of "assignment", a lambda function could count as a means of assigning a variable in a subexpression. But yes, there is no convenient way to assign to something in a wider scope.
ChrisA
On 21 April 2018 at 01:49, Chris Angelico rosuav@gmail.com wrote:
On Fri, Apr 20, 2018 at 10:51 PM, Paul Moore p.f.moore@gmail.com wrote:
Depending on your definition of "assignment", a lambda function could count as a means of assigning a variable in a subexpression. But yes, there is no convenient way to assign to something in a wider scope.
We used to sort of have one (Python 2 list comprehensions), and the behaviour was sufficiently unpopular that Py3 introduced an implicitly nested scope to keep the iteration variable name from leaking :)
That history is a non-trivial part of why I advocated more strongly for the original sublocal scoping version of the proposal: with tighter lexical scoping for expression level assignments, I think we'd be able to avoid most of the downsides that come from reinstating the ability for expressions to bind and rebind names.
However, given how those original discussions went, I now think the only way that option might be successfully pitched to people would be to propose a statement-level variant of it first (perhaps in the form of a heavily revised variant of PEP 3150's given clause), and then only propose "expression level name binding with implicit sublocal scopes" after the semantics of sublocal scoping were already established elsewhere.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Personally, I even slightly prefer
a := 3
to the commonplace
a = 3
because it visually expresses the asymmetry of the operation.
Careful here! That’s a fine argument for using := in a new language, but people using := when they don’t need an expression because they like the symbol better is a reason NOT to do this.
And yes, normally aspirated Pascal WAS my first programming language. :-)
-CHB
On 04/20/2018 08:07 AM, Chris Barker - NOAA Federal wrote:
On 04/20/2018 04:25 AM, Christoph Groth wrote:
Personally, I even slightly prefer
a := 3
to the commonplace
a = 3
because it visually expresses the asymmetry of the operation.
Careful here! That’s a fine argument for using := in a new language, but people using := when they don’t need an expression because they like the symbol better is a reason NOT to do this.
Unless it's a bug magnet, that doesn't strike me as a good reason NOT to do this.
-- ~Ethan~
Chris Barker - NOAA Federal wrote:
Personally, I even slightly prefer
a := 3
to the commonplace
a = 3
because it visually expresses the asymmetry of the operation.
Careful here! That’s a fine argument for using := in a new language, but people using := when they don’t need an expression because they like the symbol better is a reason NOT to do this.
Perhaps you are right and it is indeed unrealistic to expect people to (eventually) shift to using := for simple assignments after 28 years of Python...
Then I think it would be also OK to introduce a fully general ":=" but discourage its use in assignment statements. However, it seems strange to forbid the use of one expression (namely ":=") as a statement while all other expressions are allowed. (So there seems no alternative to accepting both = and := in statements, and if I understand you correctly you consider this a problem.)
One way or the other, I'd like to underline a point that I made yesterday: I believe that it's important for sanity that taking any existing assignment statement and replacing all occurrences of "=" by ":=" does not have any effect on the program.
PEP 572 currently proposes to make ":=" a binary operator that is evaluated from right to left.
Christoph
On Sat, Apr 21, 2018 at 2:17 AM, Christoph Groth christoph@grothesque.org wrote:
Chris Barker - NOAA Federal wrote:
Personally, I even slightly prefer
a := 3
to the commonplace
a = 3
because it visually expresses the asymmetry of the operation.
Careful here! That’s a fine argument for using := in a new language, but people using := when they don’t need an expression because they like the symbol better is a reason NOT to do this.
Perhaps you are right and it is indeed unrealistic to expect people to (eventually) shift to using := for simple assignments after 28 years of Python...
It's not just 28 years of Python. It's also that other languages use "=" for assignment. While this is by no means a clinching argument, it does have some weight; imagine if Python used "=" for comparison and ":=" for assignment - anyone who works simultaneously with multiple languages is going to constantly type the wrong operator. (I get this often enough with comment characters, but my editor will usually tell me straight away if I type "// blah" in Python, whereas it won't always tell me that I used "x = 1" when I wanted one of the other forms.)
One way or the other, I'd like to underline a point that I made yesterday: I believe that it's important for sanity that taking any existing assignment statement and replacing all occurrences of "=" by ":=" does not have any effect on the program.
PEP 572 currently proposes to make ":=" a binary operator that is evaluated from right to left.
This is one of the points that I was halfway through working on when I finally gave up on working on a reference implementation for a likely-doomed PEP. It might be possible to make := take an entire sequence of assignables and then set them left to right; however, this would be a lot more complicated, and I'm not even sure I want that behaviour. I don't want to encourage people to replace all "=" with ":=" just for the sake of it. The consistency is good if it can be achieved, but you shouldn't actually DO that sort of thing normally.
Consider: one of the important reasons to define the assignment order is so you can reference a subscript and also use it. For instance:
idx, items[idx] = new_idx, new_val
But you don't need that with :=, because you can:
items[idx := new_idx] = new_val
(and you can use := for the second one if you wish). And actually, this one wouldn't even change, because it's using tuple unpacking, not the assignment order of chained assignments. I cannot think of any situation where you'd want to write this:
idx = items[idx] = f()
inside an expression, and thus need to write it as:
g(items[idx] := idx := f())
So I have no problem with a style guide saying "yeah just don't do that", and the PEP saying "if you do this, the semantics won't be absolutely identical to '='". Which it now does.
Now, if someone else wants to work on the reference implementation, they're welcome to create this feature and then see whether they like it. But since I can't currently prove it's possible, I'm not going to specify it in the PEP.
ChrisA
It's horrors like this:
g(items[idx] := idx := f())
That make me maybe +0 if the PEP only allowed simple name targets, but decisively -1 for any assignment target in the current PEP.
I would much rather never have to read awful constructs like that than get the minor convenience of:
if (val := some_expensive_func()) > 0:
    x = call_something(val)
On Fri, Apr 20, 2018, 3:39 PM Chris Angelico rosuav@gmail.com wrote:
On Sat, Apr 21, 2018 at 2:17 AM, Christoph Groth christoph@grothesque.org wrote:
Chris Barker - NOAA Federal wrote:
Personally, I even slightly prefer
a := 3
to the commonplace
a = 3
because it visually expresses the asymmetry of the operation.
Careful here! That’s a fine argument for using := in a new language, but people using := when they don’t need an expression because they like the symbol better is a reason NOT to do this.
Perhaps you are right and it is indeed unrealistic to expect people to (eventually) shift to using := for simple assignments after 28 years of Python...
It's not just 28 years of Python. It's also that other languages use "=" for assignment. While this is by no means a clinching argument, it does have some weight; imagine if Python used "=" for comparison and ":=" for assignment - anyone who works simultaneously with multiple languages is going to constantly type the wrong operator. (I get this often enough with comment characters, but my editor will usually tell me straight away if I type "// blah" in Python, whereas it won't always tell me that I used "x = 1" when I wanted one of the other forms.)
One way or the other, I'd like to underline a point that I made yesterday: I believe that it's important for sanity that taking any existing assignment statement and replacing all occurrences of "=" by ":=" does not have any effect on the program.
PEP 572 currently proposes to make ":=" a binary operator that is evaluated from right to left.
This is one of the points that I was halfway through working on when I finally gave up on working on a reference implementation for a likely-doomed PEP. It might be possible to make := take an entire sequence of assignables and then set them left to right; however, this would be a lot more complicated, and I'm not even sure I want that behaviour. I don't want to encourage people to replace all "=" with ":=" just for the sake of it. The consistency is good if it can be achieved, but you shouldn't actually DO that sort of thing normally.
Consider: one of the important reasons to define the assignment order is so you can reference a subscript and also use it. For instance:
idx, items[idx] = new_idx, new_val
But you don't need that with :=, because you can:
items[idx := new_idx] = new_val
(and you can use := for the second one if you wish). And actually, this one wouldn't even change, because it's using tuple unpacking, not the assignment order of chained assignments. I cannot think of any situation where you'd want to write this:
idx = items[idx] = f()
inside an expression, and thus need to write it as:
g(items[idx] := idx := f())
So I have no problem with a style guide saying "yeah just don't do that", and the PEP saying "if you do this, the semantics won't be absolutely identical to '='". Which it now does.
Now, if someone else wants to work on the reference implementation, they're welcome to create this feature and then see whether they like it. But since I can't currently prove it's possible, I'm not going to specify it in the PEP.
ChrisA
On Sat, Apr 21, 2018 at 6:04 AM, David Mertz mertz@gnosis.cx wrote:
It's horrors like this:
g(items[idx] := idx := f())
That make me maybe +0 if the PEP only allowed simple name targets, but decisively -1 for any assignment target in the current PEP.
But that's my point: you shouldn't need to write that. Can anyone give me a situation where that kind of construct is actually useful? Much more common would be to use := inside the square brackets, which makes the whole thing a lot more sane.
You can ALWAYS write stupid code. Nobody can or will stop you.
ChrisA
Does the PEP currently propose to allow that horrible example? I thought Tim Peters successfully pleaded to only allow a single "NAME := <expr>". You don't have to implement this restriction -- we know it's possible to implement, and if specifying this alone were to pull enough people from -1 to +0 there's a lot of hope!
On Fri, Apr 20, 2018 at 1:12 PM, Chris Angelico rosuav@gmail.com wrote:
On Sat, Apr 21, 2018 at 6:04 AM, David Mertz mertz@gnosis.cx wrote:
It's horrors like this:
g(items[idx] := idx := f())
That make me maybe +0 if the PEP only allowed simple name targets, but decisively -1 for any assignment target in the current PEP.
But that's my point: you shouldn't need to write that. Can anyone give me a situation where that kind of construct is actually useful? Much more common would be to use := inside the square brackets, which makes the whole thing a lot more sane.
You can ALWAYS write stupid code. Nobody can or will stop you.
ChrisA
-- --Guido van Rossum (python.org/~guido)
On Sat, Apr 21, 2018 at 6:59 AM, Guido van Rossum guido@python.org wrote:
Does the PEP currently propose to allow that horrible example? I thought Tim Peters successfully pleaded to only allow a single "NAME := <expr>". You don't have to implement this restriction -- we know it's possible to implement, and if specifying this alone were to pull enough people from -1 to +0 there's a lot of hope!
I don't see much value in restricting the assignment target to names only, but if that's what it takes, it can be restricted, at least initially. As to chaining... well, since the entire construct (target := expr) is an expression, it can be used on the right of :=, so short of outright forbidding it, there's not a lot to be done.
ChrisA
On Fri, Apr 20, 2018 at 2:04 PM, Chris Angelico rosuav@gmail.com wrote:
On Sat, Apr 21, 2018 at 6:59 AM, Guido van Rossum guido@python.org wrote:
Does the PEP currently propose to allow that horrible example? I thought Tim Peters successfully pleaded to only allow a single "NAME := <expr>". You don't have to implement this restriction -- we know it's possible to implement, and if specifying this alone were to pull enough people from -1 to +0 there's a lot of hope!
I don't see much value in restricting the assignment target to names only, but if that's what it takes, it can be restricted, at least initially.
All of this is an exercise in listening and compromise, not in solving puzzles.
As to chaining... well, since the entire construct (target := expr) is an expression, it can be used on the right of :=, so short of outright forbidding it, there's not a lot to be done.
It would be more work but it can definitely be done (perhaps by introducing a syntactic construct of intermediate precedence). People could write "a := (b := foo())" but that way they resolve the ambiguity. Although if we restrict targets to just names there's less concern about ambiguity.
-- --Guido van Rossum (python.org/~guido)
[Chris Angelico rosuav@gmail.com]
I don't see much value in restricting the assignment target to names only, but if that's what it takes, it can be restricted, at least initially.
I believe this point was made most clearly before by Terry Reedy, but it bears repeating :-) This is from the PEP's motivation:
""" Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. """
As "head arguments" go, that's a good one! But restricting assignment expressions to
identifier ":=" expression
satisfies it. If what's of value is to name the result of an expression, that single case handles that and _only_ that. In a sense, it's "the simplest thing that could possibly work", and that's generally a good thing to aim for.
Python assignment _statements_ are way more complex than that. Besides just giving names to expression results, they can also implicitly invoke arbitrarily complex __setitem__ and __setattr__ methods on targets, rely on all sorts of side effects across chained assignments, and support funky syntax for magically iterating over an expression's iterable result.
While that can all be useful _in_ an assignment statement, the PEP's motivation doesn't say a word about why any of _that_ would also be useful buried inside an assignment expression. There doesn't appear to be a good "head argument" for why, besides "why not?". That's not enough.
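A few concrete examples of the statement-only machinery being referred to (editor's illustrations; none of these targets would be allowed under a name-only restriction):

obj.attr = value           # invokes type(obj).__setattr__
seq[i] = value             # invokes type(seq).__setitem__
a = b = c = value          # chained targets, assigned left to right
head, *rest = iterable     # iterable unpacking with a starred target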
I think it's no coincidence that every example of an _intended_ use is of the simple
identifier ":=" expression
form. There are no examples of fancier targets in the PEP, and - more importantly - also none I saw in the hundreds of mailing-list messages since this started. Except for a few of mine, where I tried to demonstrate why _trying_ fancier targets in examples derived from real code made the original "loop and a half" code _worse_. And where other people were illustrating how incomprehensibly code _could_ be written (which isn't a real interest of mine).
Short course: e.g., while a general assignment expression can "unpack" an iterable expression result, giving names to its elements, there's no clean way to _use_ the names bound by the unpacking _in_ the "if" or "while" tests. That's fine for for loops (only the _body_ of the loop needs the names), but in conditional constructs you typically want to use the names _in_ the condition being tested.

if ((a, b, c) := func_returning_triple()) and b > 0:
    process(a+b, b+c, a+c)

seems to be as good as it gets, but inherently relies on "a trick": that a 3-tuple is always truthy, regardless of content. OTOH,

if ((a, b, c) := func_returning_triple())[1] > 0:

doesn't rely on a trick, but can't use the name b in the test(!).

if [((a, b, c) := func_returning_triple()), b > 0][-1]:

manages to avoid "a trick", and to use the natural b > 0, but is ... strained ;-)

So, to my eyes, this is a clear improvement over all of those:

a, b, c = func_returning_triple()
if b > 0:
    process(a+b, b+c, a+c)
Of course I could be cherry-picking a bad example there, but that's not the intent: I'm still waiting for anyone to post an example where a "fancy" assignment-expression target would actually make code clearer. I haven't found one.
There are lots of examples when the target is a plain single name.
Why the stark difference? I don't need deep theoretical reasons to see that there _is_ one, or to conclude that - in the absence of compelling use cases - complex assignment-expression targets are probably a Poor Idea.
Tim Peters wrote:
[Chris Angelico <rosuav at gmail.com>]
I don't see much value in restricting the assignment target to names only, but if that's what it takes, it can be restricted, at least initially.
I believe this point was made most clearly before by Terry Reedy, but it bears repeating :-) This is from the PEP's motivation:
""" Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. """
As "head arguments" go, that's a good one! But restricting assignment expressions to
identifier ":=" expression
satisfies it. If what's of value is to name the result of an expression, that single case handles that and _only_ that. In a sense, it's "the simplest thing that could possibly work", and that's generally a good thing to aim for.
(...)
Tim, thanks for this clear analysis. Here's the best use case of more general assignment expressions that I can come up with (from real code I'm currently working on):
class Basis:
    def __init__(self, parent, periods=()):
        self._parent = parent
        if len(self._periods := np.asarray(periods, int)):
            ...
        else:
            # In absence of periods, treat them as an (0, n)-shaped array.
            # This avoids a special code path below.
            self._periods = np.empty((0, len(parent.periods)), int)
But since this is a weak counterexample, it actually serves to strengthen your point that
identifier ":=" expression
is all that is needed.
Such minimal assignment expressions have the (IMHO important) advantage of not being inconsistent with assignment statements.
Still, it seems weird to have two different ways of binding names in the language where one would be sufficient (i.e. the old one would remain only for backwards compatibility). From the point of view of someone who's new to the language that's two things to learn instead of just one.
[Christoph Groth christoph@grothesque.org]
Tim, thanks for this clear analysis. Here's the best use case of more general assignment expressions that I can come up with (from real code I'm currently working on):
class Basis:
    def __init__(self, parent, periods=()):
        self._parent = parent
        if len(self._periods := np.asarray(periods, int)):
            ...
        else:
            # In absence of periods, treat them as an (0, n)-shaped array.
            # This avoids a special code path below.
            self._periods = np.empty((0, len(parent.periods)), int)
But since this is a weak counterexample, it actually serves to strengthen your point that
identifier ":=" expression
is all that is needed.
That's a decent example. In truth, I have no real objection to binding an attribute - but am willing to throw out a bit of soap with the bathwater if doing so can avoid throwing the baby out too ;-)
Such minimal assignment expressions have the (IMHO important) advantage of not being inconsistent with assignment statements.
Still, it seems weird to have two different ways of binding names in the language where one would be sufficient (i.e. the old one would remain only for backwards compatibility). From the point of view of someone who's new to the language that's two things to learn instead of just one.
But they're very different in a key respect. The value of an assignment expression is the value assigned. Asking "what's the value of a statement?" doesn't even make sense in Python (whether an assignment statement or any other kind of statement).
For that reason, _if_ a PEP is reworked to suggest a "binding expression" (I'd prefer the name change to nudge people away from conflating it with the far more general assignment statement), the usage pragmatics are clear: use a binding expression if the context requires using the value bound, else use a simple assignment statement.
":=" doesn't _just_ mean "bind the simple name on the left" in that world, but also "and return the value of the expression on the right".
For that reason, e.g.,
i = 1
would be strongly preferred to
i := 1
as a standalone line, except perhaps when typing at an interactive shell (where you may _want_ to see the value being bound - but usually don't).
Tim Peters wrote:
[Christoph Groth christoph@grothesque.org]
Still, it seems weird to have two different ways of binding names in the language where one would be sufficient (i.e. the old one would remain only for backwards compatibility). From the point of view of someone who's new to the language that's two things to learn instead of just one.
But they're very different in a key respect. The value of an assignment expression is the value assigned. Asking "what's the value of a statement?" doesn't even make sense in Python (whether an assignment statement or any other kind of statement).
There are also no function call statements in Python. People are happily using function call expressions as statements when not interested in their value.
I hope to have shown [1] that the same could be done for assignments. A consistent value can be defined for any assignment statement. So, all assignment statements could be redefined as expressions and the language would continue to work and even be (perfectly?) backwards-compatible.
Syntax-wise, if replacing = by := everywhere is unthinkable, as it seems, there's still the possibility (not completely ruled out by Guido ;-) to use = for assignment expressions but require extra parens for safety.
Thus, it seems to me that redefining assignments as expressions everywhere is a feasible, if radical, idea. Compared to a dedicated syntax for "binding expressions" it would be conceptually simpler, but would provide more possibilities to shoot oneself in the foot.
[1] https://mail.python.org/pipermail/python-dev/2018-April/152780.html
[Christoph Groth christoph@grothesque.org]
Still, it seems weird to have two different ways of binding names in the language where one would be sufficient (i.e. the old one would remain only for backwards compatibility). From the point of view of someone who's new to the language that's two things to learn instead of just one.
[Tim]
But they're very different in a key respect. The value of an assignment expression is the value assigned. Asking "what's the value of a statement?" doesn't even make sense in Python (whether an assignment statement or any other kind of statement).
[Christoph]
There are also no function call statements in Python. People are happily using function call expressions as statements when not interested in their value.
Sure.
I hope to have shown [1] that the same could be done for assignments. A consistent value can be defined for any assignment statement. So, all assignment statements could be redefined as expressions and the language would continue to work and even be (perfectly?) backwards-compatible.
Except for shells. When I type, e.g.,
xs = sorted(iterator_returning_a_billion_strings)
I really don't want to wait for hours before I can type again ;-) In the same way now, when someone calls a function at a shell but doesn't want to see its result, they do something like
xxx = function(a, b, c)
knowing that an assignment statement never displays any output on its own. If an assignment statement did return a result, almost all shells would display it. Shells typically don't care at all what you typed at them, they just care whether or not executing the compiled code returns None:
result = execute_code()
if result is not None:
    display(repr(result))
There's also that you're not considering the other half: that every existing assignment statement could be viewed as being as expression does not imply that every existing assignment statement could be used everywhere an expression can be used. Syntax matters, and function call argument lists in particular already bristle with their own meanings for commas, equal signs, and asterisks. The language was designed with "and the twain shall never meet" in mind ;-) For example, what would
f(a=b)
mean?
The worst possible ;-) answer is "well, since a=b is fine as an assignment statement, it must mean that we bind the value of b to name a and then pass b's value to f() as its first positional argument". That reading would break countless lines of code using keyword arguments. If you're willing to concede that's too much breakage to bear, then you have to identify and spell out "the rules" for every case in which something that "looks like an assignment expression really isn't, depending on context".
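For clarity, the established keyword-argument reading that Tim says must be preserved (a small editorial example):

def f(a=0):
    return a

b = 42
f(a=b)       # today: passes 42 as the keyword argument a; binds nothing in the caller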
But since I have no interest in pursuing this, I'll stop there :-)
Syntax-wise, if replacing = by := everywhere is unthinkable, as it seems, there's still the possibility (not completely ruled out by Guido ;-) to use = for assignment expressions but require extra parens for safety.
That would be received less well than the current PEP. The people it would hurt the most are newcomers from other languages who habitually put _every_ "if" and "while" test in parentheses, because that's what they're used to doing (e.g., in C). Many of us still remember our initial relief when we realized we'd never piss away hours debugging an assert(n=1) or if (x=0.0) typo/thinko again. Reintroducing that possibility would get an instant -1 from me, because I don't want to debug that same mistake for other people on Stackoverflow either - my time there is wholly consumed by explaining why .1 + .2 doesn't display exactly "0.3" ;-)
Thus, it seems to me that redefining assignments as expressions everywhere is a feasible, if radical, idea. Compared to a dedicated syntax for "binding expressions" it would be conceptually simpler, but would provide more possibilities to shoot oneself in the foot.
As above, it wouldn't remain so simple after hammering out the detailed rules for deciding when and where something that "looks like an assignment expression" really is one.
For an example of a fine language that makes no distinction between "statements" and "expressions" at all, Icon is top on my list. That _can_ work out fine - but Icon was designed that way from the start. And, of course, like every sane language that has wholly general assignment expressions, Icon uses ":=" as the assignment operator, and "=" for numeric equality testing ;-)
Tim Peters wrote:
[Christoph Groth christoph@grothesque.org]
I hope to have shown [1] that the same could be done for assignments. A consistent value can be defined for any assignment statement. So, all assignment statements could be redefined as expressions and the language would continue to work and even be (perfectly?) backwards-compatible.
Except for shells. When I type, e.g.,
xs = sorted(iterator_returning_a_billion_strings)
I really don't want to wait for hours before I can type again ;-) In the same way now, when someone calls a function at a shell but doesn't want to see its result, they do something like
xxx = function(a, b, c)
Yes, that's a serious problem with making all assignments expressions. Assignments are so common in interactive use that displaying their values could quickly become annoying.
There are several possible solutions. For example, the IPython shell interprets a trailing semicolon as "do not show the result of the expression".
A better solution seems to be to only treat assignments that are surrounded by the mandatory parens as expressions and keep the old-style assignments as statements, e.g.
>>> a = 3
>>> (a = 3)    # currently a SyntaxError
3
So, strictly speaking, there would be distinct assignment statements and expressions, but it would still be easy conceptually because one could say:
Any valid assignment statement can be turned into an expression by surrounding it with parentheses. There is no difference in semantics.
There's also that you're not considering the other half: that every existing assignment statement could be viewed as being an expression does not imply that every existing assignment statement could be used everywhere an expression can be used. Syntax matters, and function call argument lists in particular already bristle with their own meanings for commas, equal signs, and asterisks. The language was designed with "and the twain shall never meet" in mind ;-) For example, what would
f(a=b)
mean?
It would, of course, mean the same as it does now. (Otherwise backwards compatibility would be broken.) However,
f((a=b))
which currently is a syntax error, would mean: bind the value of 'b' to the name 'a' and call 'f' with the value of that expression. The extra parens would be required around any assignment expression. I believe that this also solves all the other problems that you raise with regard to commas etc.
So, you see, promoting assignments to expressions is indeed feasible. The advantages are the conceptual simplicity, and the familiar syntax.
The syntax is also a disadvantage, because it is somewhat ugly:
while (item = get()): process(item)
There's also potential for misuse, but firstly, that is not unheard of in Python, and secondly, assignment expressions could be (at least initially) limited to only a subset of the forms that are allowed for assignment statements.
If I had to choose between the above and ":= binding expressions", I guess I would tend to prefer the latter because they are sufficient, nicer looking and offer less potential for trouble. But I think that it is worth fully discussing the above idea as well. IMHO it should be at least briefly mentioned in the "rejected ideas" of PEP 572, because it is arguably the most self-evident way to introduce name-binding expressions into the language.
On Sun, Apr 22, 2018 at 7:29 PM, Christoph Groth christoph@grothesque.org wrote:
If I had to choose between the above and ":= binding expressions", I guess I would tend to prefer the latter because they are sufficient, nicer looking and offer less potential for trouble. But I think that it is worth fully discussing the above idea as well. IMHO it should be at least briefly mentioned in the "rejected ideas" of PEP 572, because it is arguably the most self-evident way to introduce name-binding expressions into the language.
It's in the FAQ.
ChrisA
2018-04-21 4:44 GMT+03:00 Tim Peters tim.peters@gmail.com:
[Chris Angelico rosuav@gmail.com]
I don't see much value in restricting the assignment target to names only, but if that's what it takes, it can be restricted, at least initially.
I believe this point was made most clearly before by Terry Reedy, but it bears repeating :-) This is from the PEP's motivation:
""" Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. """
As "head arguments" go, that's a good one! But restricting assignment expressions to
identifier ":=" expression
satisfies it. If what's of value is to name the result of an expression, that single case handles that and _only_ that. In a sense, it's "the simplest thing that could possibly work", and that's generally a good thing to aim for.
Python assignment _statements_ are way more complex than that. Besides just giving names to expression results, they can also implicitly invoke arbitrarily complex __setitem__ and __setattr__ methods on targets, rely on all sorts of side effects across chained assignments, and support funky syntax for magically iterating over an expression's iterable result.
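(Illustration - a small runnable sketch of the machinery a plain assignment statement can already trigger; the Box class is invented for the example:)

class Box:
    def __setattr__(self, name, value):
        print("__setattr__ called:", name, value)
        super().__setattr__(name, value)

    def __setitem__(self, key, value):
        print("__setitem__ called:", key, value)

box = Box()
# One statement: chained targets, attribute and item assignment,
# and iterable unpacking, each with its own side effects.
box.attr = box["key"] = head, *tail = [1, 2, 3]
print(head, tail)   # 1 [2, 3]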
While that can all be useful _in_ an assignment statement, the PEP's motivation doesn't say a word about why any of _that_ would also be useful buried inside an assignment expression. There doesn't appear to be a good "head argument" for why, besides "why not?". That's not enough.
I agree with you. During the discussion on python-ideas it was not explicitly suggested to limit the assignment target to a name only, but that was often implicitly implied. So explicit is better than implicit :) The main reason for this criticism was that almost all of the examples in the PEP use the name := expression form. It was also noted that 99% of the use-cases where this feature will be _nice_ to have are while and if statements (including the ternary form). Although one of the prerequisites for writing this PEP was the use of the assignment expression in the lists, it will rarely be used in them, and even more rarely will that usage be justified. In addition, given a general assignment expression and the chosen := operator, which solves the problem of distinctness from ==, I see no reason - or more precisely, no way to explain - why the other forms +=, *= should not become expressions as well. And then we are faced with all the beauty of side effects, sequence points, ... And while in Python it would be much easier to resolve this - Python would no longer be Python. I'm glad that this is not happening.
Since the discussion is moving towards a simplified form - a binding expression, where the assignment target can be a name only - will you be _happy_ with the choice of the := operator? It is perceived as =, but with very limited capabilities. Therefore, as I see it, with this _limited power_ it is one of the design goals to make the syntactic forms of the assignment statement and the assignment expression distinct, and := does not help with this. This does not mean that this new syntax form should not be convenient, but it should be different from the usual = form. Otherwise, the question about ".1 + .2" will have competitors :-)
I think it's no coincidence that every example of an _intended_ use is of the simple identifier ":=" expression form. There are no examples of fancier targets in the PEP, and - more importantly - also none I saw in the hundreds of mailing-list messages since this started. Except for a few of mine, where I tried to demonstrate why _trying_ fancier targets in examples derived from real code made the original "loop and a half" code _worse_. And where other people were illustrating how incomprehensibly code _could_ be written (which isn't a real interest of mine).
With kind regards, -gdg
2018-04-22 14:10 GMT+03:00 Kirill Balunov kirillbalunov@gmail.com:
Although one of the prerequisites for writing this PEP was the use of the assignment expression in the lists
Sorry, typo: in comprehensions/generators.
it will rarely be used in them, and even more rarely it will be a justified usage of.
With kind regards, -gdg
[Guido, about g(items[idx] := idx := f()) ]
Does the PEP currently propose to allow that horrible example? I thought Tim Peters successfully pleaded to only allow a single "NAME := <expr>".
I was "successful" only in that the two of us agreed that would be far less disruptive, and quite possibly an actual improvement ;-) But I only argued for limiting assignment expressions to the form
identifier ":=" expression
I expected that, given that expressions "naturally nest", chained targets could still be specified:
a := b := c := 5
but since they're all plain names there's no way to tell whether the bindings occur "left to right" or "right to left" short of staring at the generated code. I have no use case for chaining plain-name targets in assignment expressions, but didn't see a good reason to torture the implementation to forbid it. I expected chaining would just be an unused-in-practice possibility. Much like, e.g.,
a in b in c in d
is an unused-in-practice possibility.
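(For reference, such chained comparisons are already legal today, if rarely seen; each link is evaluated pairwise, e.g.:)

a, b, c = 1, [1, 2], [[1, 2], [3]]
# "a in b in c" means "(a in b) and (b in c)", with b evaluated only once.
print(a in b in c)             # True
print((a in b) and (b in c))   # True - the longhand spelling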
And I'll take this opportunity to repeat the key point for me: I tried hard, but never found a single case based on staring at real code where allowing _fancier_ (than "plain name") targets would be a real improvement. In every case I thought it _might_ help, it turned out that it really didn't unless Python _also_ grew an analog to C's "comma operator" (take only the last result from a sequence of expressions). I'll also note that I asked if anyone else had a real-life example, and got no responses.
There were lots of "real life" cases where plain-name targets allowed for code improvement, though.
You don't have to implement this restriction -- we know it's possible to implement, and if specifying this alone were to pull enough people from -1 to +0 there's a lot of hope!
Given my experience with _trying_ to find use cases for fancier targets, and getting burned every time, I'm on the minus side of the current PEP, because - best I can tell - all the extra complexity would create an "attractive nuisance" :-(
On 2018-04-20 22:33, Tim Peters wrote: [snip]
And I'll take this opportunity to repeat the key point for me: I tried hard, but never found a single case based on staring at real code where allowing _fancier_ (than "plain name") targets would be a real improvement. In every case I thought it _might_ help, it turned out that it really didn't unless Python _also_ grew an analog to C's "comma operator" (take only the last result from a sequence of expressions). I'll also note that I asked if anyone else had a real-life example, and got no responses.
Could a semicolon in a parenthesised expression be an equivalent to C's "comma operator"?
[snip]
[Tim]
And I'll take this opportunity to repeat the key point for me: I tried hard, but never found a single case based on staring at real code where allowing _fancier_ (than "plain name") targets would be a real improvement. In every case I thought it _might_ help, it turned out that it really didn't unless Python _also_ grew an analog to C's "comma operator" (take only the last result from a sequence of expressions). I'll also note that I asked if anyone else had a real-life example, and got no responses.
[MRAB python@mrabarnett.plus.com]
Could a semicolon in a parenthesised expression be an equivalent to C's "comma operator"?
I expect it could, but it's been many years since I tried hacking Python's grammar, and I wouldn't want a comma operator anyway ;-)
To recycle a recently-posted example, instead of one of these 3:
if ((a, b, c) := func_returning_triple()) and b > 0:
    process(a+b, b+c, a+c)

if ((a, b, c) := func_returning_triple())[1] > 0:
    ...

if [((a, b, c) := func_returning_triple()), b > 0][-1]:
    ...

it would allow this instead:

if ((a, b, c) := func_returning_triple(); b > 0):
    ...
That's better than any of the first three, but I'm not sure it's better than the original
a, b, c = func_returning_triple()
if b > 0:
...
It _may_ be more readable in other complex-target examples, though.
It's also what's wanted in one of the running plain-name target examples, _not_ involving a conditional context:
r1, r2 = (D := sqrt(b**2 - 4*a*c); a2 := 2*a; ((-b+D)/a2, (-b-D)/a2))
And if I saw enough code like that, I'd write a PEP suggesting that Python introduce separate assignment statements where name bindings persisted across statement boundaries ;-)
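(For comparison, the statement form that hypothetical one-liner compresses is runnable today; a, b, c below are invented example coefficients with a non-negative discriminant:)

from math import sqrt

a, b, c = 1, -3, 2
D = sqrt(b**2 - 4*a*c)
a2 = 2*a
r1, r2 = (-b + D)/a2, (-b - D)/a2
print(r1, r2)   # 2.0 1.0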
On 2018-04-21 03:15, Tim Peters wrote:
[Tim]
And I'll take this opportunity to repeat the key point for me: I tried hard, but never found a single case based on staring at real code where allowing _fancier_ (than "plain name") targets would be a real improvement. In every case I thought it _might_ help, it turned out that it really didn't unless Python _also_ grew an analog to C's "comma operator" (take only the last result from a sequence of expressions). I'll also note that I asked if anyone else had a real-life example, and got no responses.
[MRAB python@mrabarnett.plus.com]
Could a semicolon in a parenthesised expression be an equivalent to C's "comma operator"?
I expect it could, but it's been many years since I tried hacking Python's grammar, and I wouldn't want a comma operator anyway ;-) [snip] Just reading this:
https://www.bfilipek.com/2018/04/refactoring-with-c17-stdoptional.html
about C++17, and what did I see? An example with a semicolon in parentheses!
Just reading this:
https://www.bfilipek.com/2018/04/refactoring-with-c17-stdoptional.html
about C++17, and what did I see? An example with a semicolon in parentheses!
Isn't that kind of the C++ motto? "Leave no syntactic stone unturned." Whereas in Python, the syntax motto might better be stated, "We will ship no syntax before its time." (With apologies to Ernest and Julio Gallo.)
Skip
On Apr 24, 2018, at 2:10 PM, MRAB python@mrabarnett.plus.com wrote:
[snip]
A similar pattern shows up in Go's if statement syntax. It is interesting to note that it is part of the grammar specifically for the if statement and not general expression syntax.
IfStmt = "if" [ SimpleStmt ";" ] Expression Block [ "else" ( IfStmt | Block ) ]
.
Bindings that occur inside of SimpleStmt are only available within the Expression and blocks that make up the if statement.
https://golang.org/ref/spec#If_statements
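(For comparison, a rough sketch of the same intent under this PEP's proposed spelling - so it only runs once the PEP lands; compute and handle are invented stand-ins, and the scoping differs:)

def compute():
    # stand-in for whatever produces an optional result
    return 42

def handle(value):
    print("got", value)

# Go spells this roughly as:  if v, err := compute(); err == nil { handle(v) }
if (v := compute()) is not None:
    handle(v)
# Unlike Go, where the binding is scoped to the if statement, v here stays
# bound in the enclosing scope after the if block.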
This isn't a good reason to parrot the syntax in Python though. IMO, I consider the pattern to be one of the distinguishing features of golang and would be happy leaving it there.
I have often wondered if adding the venerable for loop syntax from C (and many other languages) would solve some of the needs here though. The for loop syntax in golang is interesting in that it serves as both a standard multipart for statement as well as a while statement.
Changing something like this is more of a Python 4 feature and I think that I would be -0 on the concept. I did want to mention the similarities for the posterity though.
ChrisA - we might want to add explicit mentions of golang's if statement and for loop as "considered" syntaxes since they are in a sibling programming language (e.g., similar to async/await in PEP 492).
"Syntactic sugar causes cancer of the semicolon" - Alan Perlis
On 21 April 2018 at 07:33, Tim Peters tim.peters@gmail.com wrote:
I expected that, given that expressions "naturally nest", chained targets could still be specified:
a := b := c := 5
but since they're all plain names there's no way to tell whether the bindings occur "left to right" or "right to left" short of staring at the generated code.
The fact class namespaces are ordered by default now allow us to demonstrate the order of multiple target assignments and tuple unpacking without staring at generated code:
>>> class AssignmentOrder:
...     a = b = c = 0
...     d, e, f = range(3)
...
>>> class ReversedAssignmentOrder:
...     c = b = a = 0
...     f, e, d = range(3)
...
>>> [attr for attr in AssignmentOrder.__dict__ if not attr.startswith("_")]
['a', 'b', 'c', 'd', 'e', 'f']
>>> [attr for attr in ReversedAssignmentOrder.__dict__ if not attr.startswith("_")]
['c', 'b', 'a', 'f', 'e', 'd']
So that's a situation where "name = alias = value" could end up matching "alias := name := value"
(Even in earlier versions, you can illustrate the same assignment ordering behaviour with the enum module, and there it makes even more of a difference, as it affects which name binding is considered the canonical name, and which are considered aliases).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
[Tim]
I expected that, given that expressions "naturally nest", chained targets could still be specified:
a := b := c := 5
but since they're all plain names there's no way to tell whether the bindings occur "left to right" or "right to left" short of staring at the generated code.
[Nick Coghlan ncoghlan@gmail.com]
The fact class namespaces are ordered by default now allow us to demonstrate the order of multiple target assignments and tuple unpacking without staring at generated code:
>>> class AssignmentOrder:
...     a = b = c = 0
...     d, e, f = range(3)
...
>>> class ReversedAssignmentOrder:
...     c = b = a = 0
...     f, e, d = range(3)
...
>>> [attr for attr in AssignmentOrder.__dict__ if not attr.startswith("_")]
['a', 'b', 'c', 'd', 'e', 'f']
>>> [attr for attr in ReversedAssignmentOrder.__dict__ if not attr.startswith("_")]
['c', 'b', 'a', 'f', 'e', 'd']
So that's a situation where "name = alias = value" could end up matching "alias := name := value"
Cool! So this is really a killer-strong argument for getting rid of classes - way overdue, too ;-)
(Even in earlier versions, you can illustrate the same assignment ordering behaviour with the enum module, and there it makes even more of a difference, as it affects which name binding is considered the canonical name, and which are considered aliases).
So if binding expressions can be chained, they'll need to ape "left-to-right" binding order.
Or they can't be allowed to chain to begin with.
Either way would be fine by me.
On 20 April 2018 at 21:25, Christoph Groth christoph@grothesque.org wrote:
Steven Turnbull wrote: >
Christoph Groth writes:
Wouldn't it be a pity not to liberate assignments from their boring statement existence?
Maybe not. While it would be nice to solve the loop-and-a-half "problem" and the loop variable initialization "problem" (not everyone agrees these are even problems, especially now that we have comprehensions and generator expressions), as a matter of taste I like the fact that this particular class of side effects is given weighty statement syntax rather than more lightweight expression syntax.
I think that this is the crucial point. If it is indeed a design principle of Python that expressions should not have the side-effect of assigning names, then the whole discussion of PEP 572 could have been stopped early on. I didn't have this impression since core devs participated constructively in the discussion.
There were a couple of factors at play there. Firstly, python-ideas and python-dev play different roles in the process, with python-ideas focused on "Help PEP authors put forward the most compelling proposal possible", and then python-dev providing the far more stringent filter of "Do we sincerely believe the long term improvements in language learnability and code maintainability arising from this change will outweigh the inevitable near term costs?" (python-ideas does consider the latter question as well, but we're more willing to spend time on ideas that only reach the level "Maybe? Depending on your point of view?").
Secondly, the original PEP proposed sublocal scopes precisely to help preserve that property by limiting the impact of any name binding side effects to a single statement (albeit an arbitrarily long nested suite in the case of compound statements).
My own enthusiasm for the idea largely waned after folks successfully campaigned for "we'd prefer side effects to introducing a new kind of scope" (and while I'm definitely sympathetic to the "Python's name lookup scoping rules are already excessively complicated" point of view, I also think that if "=" and ":=" both target the same kind of scope, there isn't enough new expressiveness introduced by the latter to justify the syntactic complexity of adding it).
Cheers, Nick.
P.S. While I'm not planning to work on it myself anytime soon, I think the sublocal scoping semantics proposed in earlier versions of PEP 572 would provide a much better developer experience for PEP 3150's "given" clause (which is currently deferred indefinitely, as even I don't particularly like the current incarnation of that proposal).
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan wrote:
I also think that if "=" and ":=" both target the same kind of scope, there isn't enough new expressiveness introduced by the latter to justify the syntactic complexity of adding it.
OK, but then how about introducing assignment expressions with the "=" operator but requiring extra parens (similar to how modern C compilers warn about assignment expressions without parens), e.g.
while (obj = get()):
    process(obj)
The semantics of assignment expressions could be exactly what I proposed for ":=", i.e. completely consistent with assignment statements.
Assignment statements could be either left as they are or could be treated as expressions. That second choice would have consequences for interactive sessions:
>>> a = 3
3
The above would bring the benefits of assignment expressions in a minimally invasive but safe way. Moreover, it would not feel like Pascal! The only downside is that "=" stands out less than ":=" so that the presence of side-effects would be somewhat less visible.
Christoph
On 04/20/2018 11:15 AM, Christoph Groth wrote:
Nick Coghlan wrote:
I also think that if "=" and ":=" both target the same kind of scope, there isn't enough new expressiveness introduced by the latter to justify the syntactic complexity of adding it.
OK, but then how about introducing assignment expressions with the "=" operator but requiring extra parens (similar to how modern C compilers warn about assignment expressions without parens), e.g.
Using a single "=" for assignment expressions isn't going to happen. Period.
-- ~Ethan~
Ethan Furman wrote:
On 04/20/2018 11:15 AM, Christoph Groth wrote:
Nick Coghlan wrote:
I also think that if "=" and ":=" both target the same kind of scope, there isn't enough new expressiveness introduced by the latter to justify the syntactic complexity of adding it.
OK, but then how about introducing assignment expressions with the "=" operator but requiring extra parens (similar to how modern C compilers warn about assignment expressions without parens), e.g.
Using a single "=" for assignment expressions isn't going to happen. Period.
Huh, I didn't want to irritate anyone!
Guido wrote [1] on python-ideas:
I also think it's fair to at least reconsider adding inline assignment, with the "traditional" semantics (possibly with mandatory parentheses). This would be easier to learn and understand for people who are familiar with it from other languages (C++, Java, JavaScript).
I interpreted this in the way that he at least doesn't rule out "= with parens" completely. Perhaps he meant ":= with parens", but that would seem redundant.
[1] https://mail.python.org/pipermail/python-ideas/2018-March/049409.html
On 04/20/2018 11:31 AM, Christoph Groth wrote:
Ethan Furman wrote:
On 04/20/2018 11:15 AM, Christoph Groth wrote:
OK, but then how about introducing assignment expressions with the "=" operator but requiring extra parens (similar to how modern C compilers warn about assignment expressions without parens), e.g.
Using a single "=" for assignment expressions isn't going to happen. Period.
Huh, I didn't want to irritate anyone!
No worries. It's just not going to happen. ;)
Guido wrote [1] on python-ideas:
I also think it's fair to at least reconsider adding inline assignment, with the "traditional" semantics (possibly with mandatory parentheses). This would be easier to learn and understand for people who are familiar with it from other languages (C++, Java, JavaScript).
I interpreted this in the way that he at least doesn't rule out "= with parens" completely. Perhaps he meant ":= with parens", but that would seem redundant.
Ah. I believe he was referring to not having a statement-local binding, but a normal binding instead (so either local or global depending on where the expression occurred).
-- ~Ethan~
Christoph's interpretation is correct. I don't rule that out. I also separately proposed := as something that more people could get behind, though perhaps it's all moot, and perhaps the PEP would gain clarity if it went back to proposing "=". (Mostly kidding.)
On Fri, Apr 20, 2018 at 11:47 AM, Ethan Furman ethan@stoneleaf.us wrote:
[snip]
-- --Guido van Rossum (python.org/~guido)
On Fri, 20 Apr 2018 13:25:02 +0200 Christoph Groth christoph@grothesque.org wrote:
I think that this is the crucial point. If it is indeed a design principle of Python that expressions should not have the side-effect of assigning names, then the whole discussion of PEP 572 could have been stopped early on. I didn't have this impression since core devs participated constructively in the discussion.
python-dev and python-ideas are two different mailing-lists with different participants, so there's a selection bias here. People who are willing to follow lengthy python-ideas discussions may be more tolerant of adventurous language changes. Many core developers (including myself) don't read python-ideas routinely.
Regards
Antoine.
On Apr 19, 2018, at 4:27 PM, Christoph Groth christoph@grothesque.org wrote:
def sync_runner(learner, f, static_hint):
    while True:
        points = learner.get(static_hint)
        if not points:
            break
        learner.feed(f(points))
With assignment expressions the body of the above function could be simplified to
while points := learner.get(static_hint):
    learner.feed(f(points))
making it crucially simpler.
Kinda supports my assertion that what we really want is a different while loop.
Would it be ridiculous if := only worked in a while statement?
-CHB
If you restrict the idea to 'if' and 'while', Why not render this using the existing 'as' form for binding names, already used with 'except' and 'with':
while learner.get(static_hint) as points:
learner.feed(f(points))
The equivalent for 'if' helps with the regex matching case:
if re.match(r"...") as m:
print(m.group(1))
I considered proposing these two forms in a PEP a few years ago, but never got around to it. To my eye, they fit syntactically into the language as-is, without introducing new symbols, operators or keywords, are consistent with existing usage, and address two relatively common causes of displeasing Python code.
Robert
On Fri, Apr 20, 2018 at 4:32 PM, Robert Smallshire rob@sixty-north.com wrote:
If you restrict the idea to 'if' and 'while', Why not render this using the existing 'as' form for binding names, already used with 'except' and 'with':
while learner.get(static_hint) as points:
learner.feed(f(points))
The equivalent for 'if' helps with the regex matching case:
if re.match(r"...") as m:
print(m.group(1))
I considered proposing these two forms in a PEP a few years ago, but never got around to it. To my eye, they fit syntactically into the language as-is, without introducing new symbols, operators or keywords, are consistent with existing usage, and address two relatively common causes of displeasing Python code.
And are limited to conditions that check the truthiness/falsiness of the value you care about. So that works for re.match, but not for anything that might return -1 (a lot of C APIs do that, so if you're working with a thin wrapper, that might be all you get), and it'll encourage people to use this form when "is not None" would be more appropriate (setting up for a failure if ever the API returned a falsey value), etc. It's similar if you use iter(func, None) - it's actually doing an equality check, not an identity check, even though a longhand form would be better written with "is not None".
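(As a reminder of the iter() behaviour referenced here: the two-argument form stops on equality with the sentinel, not identity; read_next_item below is an invented stand-in:)

def read_next_item(_items=iter([1, 2, -1, 3])):
    # stand-in source; a thin C-API wrapper might return -1 at end-of-data
    return next(_items)

# iter(callable, sentinel) calls read_next_item() repeatedly and stops as
# soon as the returned value == -1 (an equality test, not an "is" test).
for value in iter(read_next_item, -1):
    print(value)   # prints 1 then 2; stops at -1 and never reaches 3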
Also, are you assuming that this is binding to a name, or can it assign to other targets?
if re.match(...) as m[3]: ...
HINT: Saying "it should do whatever except and with do" won't answer the question. Give it a try if you don't believe me. :)
ChrisA
On 2018-04-19 23:52, Chris Angelico wrote:
And are limited to conditions that check the truthiness/falsiness of the value you care about. So that works for re.match, but not for anything that might return -1 (a lot of C APIs do that, so if you're working with a thin wrapper, that might be all you get), and it'll encourage people to use this form when "is not None" would be more appropriate (setting up for a failure if ever the API returned a
From the previously discussed code, it might look like this:
while (file.get_next_token() as token) != -1:
doc += token
Shouldn't be needed often, but I find it readable enough.
More generally, I've been -0 on this idea because I've come to appreciate Python's less-clever, i.e. "dumb", loop syntax, and ":=" combined with assignment-expressions doesn't feel like Python at all but rather like Pascal and C had a love-child, haha.
I could mildly support the "as" syntax however, since it is so darn readable and has analogues in other places.
That leaves what to do with "with". Guess I missed the part in the discussion where we couldn't fit the syntax into it. Would requiring parens here not work?
with (expr() as name) as conman:
pass
This should rarely be necessary or useful, correct? Perhaps disallow for now.
On assignment to names/subscripts, just names sounds simpler for the first round.
Also the current "while" itself could be a bit simpler by making the expression optional and slightly less verbose:
while:
points = learner.get(static_hint)
if not points:
break
Thanks for the hard work, -Mike
On Sat, Apr 21, 2018 at 2:50 AM, Mike Miller python-dev@mgmiller.net wrote: >
On 2018-04-19 23:52, Chris Angelico wrote: >
And are limited to conditions that check the truthiness/falsiness of the value you care about. So that works for re.match, but not for anything that might return -1 (a lot of C APIs do that, so if you're working with a thin wrapper, that might be all you get), and it'll encourage people to use this form when "is not None" would be more appropriate (setting up for a failure if ever the API returned a
From the previously discussed code, it might look like this:
while (file.get_next_token() as token) != -1:
doc += token
Except that that's now a feature of expressions, NOT of the loop construct. And then you're left with: why not permit this everywhere?
That leaves what to do with "with". Guess I missed the part in the discussion where we couldn't fit the syntax into it. Would requiring parens here not work?
with (expr() as name) as conman:
pass
This should rarely be necessary or useful, correct? Perhaps disallow for now.
What would these mean?
from contextlib import closing

with open(fn) as f:
with (open(fn) as f):
with closing(urlopen(url)) as dl:
with closing(urlopen(url) as dl):
with (closing(urlopen(url)) as dl):
One of these is not like the others...
Also the current "while" itself could be a bit simpler by making the expression optional and slightly less verbose:
while:
points = learner.get(static_hint)
if not points:
break
As an alias for "while True"? Not a lot of benefit. I'd rather do something like this:
while "get more learners":
which at least tells the next reader that there is a condition, even if not a coded one.
ChrisA
On 2018-04-20 12:43, Chris Angelico wrote:
Except that that's now a feature of expressions, NOT of the loop construct. And then you're left with: why not permit this everywhere?
Sorry, I didn't understand. Didn't mean to imply it couldn't be used everywhere.
What would these mean?
My expectations:
with open(fn) as f: # current behavior
with (open(fn) as f): # syntax error, missing clause
with closing(urlopen(url)) as dl: # current behavior
with closing(urlopen(url) as dl): # syntax error, missing clause
with (closing(urlopen(url)) as dl): # syntax error, missing clause
In other words, the with statement would continue to require an as clause outside of the parentheses. A double name binding doesn't seem very useful however.
-Mike
2018-04-20 14:54 GMT-07:00 Mike Miller python-dev@mgmiller.net:
In other words, the with statement would continue to require an as clause outside of the parentheses. A double name binding doesn't seem very useful however.
The with statement does not require an as clause.
On 2018-04-20 14:59, Jelle Zijlstra wrote:
In other words, the with statement would
continue to require an as clause
outside of the parentheses. A double name binding doesn't seem very useful
however.
The with statement does not require an as clause.
Sorry, more precisely: a context-manager object needs to be returned. So perhaps this "with" issue may not be one at all.
On Sat, Apr 21, 2018 at 8:07 AM, Mike Miller python-dev@mgmiller.net wrote:
On 2018-04-20 14:59, Jelle Zijlstra wrote:
In other words, the with statement would
continue to require an as
clause outside of the parentheses. A double name binding doesn't seem very useful however.
The with statement does not require an as clause.
Sorry, more precisely: a context-manager object needs to be returned. So perhaps this "with" issue may not be one at all.
That's completely different, and isn't a syntactic point. They may bomb with AttributeError at run time, but they also may not.
My expectations:
with open(fn) as f: # current behavior
with (open(fn) as f): # syntax error, missing clause
with closing(urlopen(url)) as dl: # current behavior
with closing(urlopen(url) as dl): # syntax error, missing clause
with (closing(urlopen(url)) as dl): # syntax error, missing clause
The second and fifth could be special cased as either the same as first and third, or as SyntaxErrors. (But which?) The fourth one is very tricky. If 'expr as name' is allowed inside arbitrary expressions, why shouldn't it be allowed there? The disconnect between viable syntax and useful statements is problematic here.
ChrisA
I am entirely new to this list, but if I can I would like share my comments :
I do think this proposal <target> := <value> has merit in my opinion; it does make some code more readable.
I think readability is only improved if :
chaining is not allowed - I think the construct :
while (line := input.read_row()) is not None:
    process_line(line)
Is readable, but :
while (current_line := line := input.read_row()) is not None:
    line = process_line(line)
is not obvious - and certainly isn't any more obvious than :
while (line := input.read_row()) is not None:
    current_line = line
    line = process_line(line)
The current expectations of how comprehensions work should also be honored; I don't claim to have fully followed all of the discussions around this, but it seems to me that comprehensions work in a particular way because of a concerted effort (especially in Python 3) to make them that way. They are self-contained and don't leak values into their containing scope. Similarly I think that setting variables within a comprehension is just for the benefit of readable code within the comprehension - i.e.:
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
can become :
stuff = [[y := f(x), x/y] for x in range(5)]
So - overall from me a conditional +1 - conditions as above; if they are not possible then -1 from me.
-- Anthony Flury email : Anthony.flury@btinternet.com Twitter : @TonyFlury https://twitter.com/TonyFlury/
On Sat, Apr 21, 2018 at 11:38 AM, Anthony Flury via Python-Dev
python-dev@python.org wrote:
I am entirely new to this list, but if I can I would like share my comments :
I do think this proposal <target> := <value> has merit in my opinion; it does make some code more readable.
I think readability is only improved if :
There's that word "readability" again. Sometimes I wish the Zen of Python didn't use it, because everyone seems to think that "readable" means "code I like".
The current expectations of how comprehensions work should also be honored; I don't claim to have fully followed all of the discussions around this, but it seems to me that comprehensions work in a particular way because of a concerted effort (especially in Python 3) to make them that way. They are self-contained and don't leak values into their containing scope. Similarly I think that setting variables within a comprehension is just for the benefit of readable code within the comprehension - i.e.:
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
can become :
stuff = [[y := f(x), x/y] for x in range(5)]
So - overall from me a conditional +1 - conditions as above; if they are not possible then -1 from me.
Perfectly self-contained. They do everything in their own scope. Except ... except that they don't.
$ python3.7
Python 3.7.0a4+ (heads/master:95e4d58913, Jan 27 2018, 06:21:05)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class X:
...     x = ["spam", "ham"]
...     print(["1: "+x for x in x])
...     print(["2: "+x for _ in range(1) for x in x])
...
['1: spam', '1: ham']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in X
  File "<stdin>", line 4, in <listcomp>
UnboundLocalError: local variable 'x' referenced before assignment
>>> class Y:
...     prefix = "3: "
...     print([prefix + x for x in ["spam", "ham"]])
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in Y
  File "<stdin>", line 3, in <listcomp>
NameError: name 'prefix' is not defined
Introducing expression assignments will make these oddities even more obvious. You'd be able to demonstrate things like this at function scope, not just with a class. But the oddities are there, and they are inherent to the current definition of a comprehension. That's why the changes are being recommended. They will simplify the peculiarities of comprehensions, make them far closer to a naive transformation into longhand, and extremely close to a non-naive-but-still-simple transformation into longhand.
ChrisA
On Sat, Apr 21, 2018 at 12:30:36PM +1000, Chris Angelico wrote:
There's that word "readability" again. Sometimes I wish the Zen of Python didn't use it, because everyone seems to think that "readable" means "code I like".
In fairness, if one can't read code, then one can hardly be expected to like it.
But there's plenty of code I can read that I don't like.
However your point still stands: since we don't have an objective and empirical measure of readability, we cannot help but be subjective about it. And with such subjective judgements, it is very, very hard to divorce personal opinions about what we like from subjective estimates of how readable something is.
[...]
Perfectly self-contained. They do everything in their own scope. Except ... except that they don't. [...]
They being comprehensions inside class scopes.
Class scopes are already a bit weird, and don't quite work the same as non-class scopes even without introducing comprehensions:
py> class X:
...     a = 99
...     b = lambda: a+1
...     c = b()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in X
  File "<stdin>", line 3, in <lambda>
NameError: name 'a' is not defined
I get bitten by this all the time. No, I tell a lie: I hardly ever get bitten by this. Judging by the number of questions about it on StackOverflow and the Python-List and Tutor mailing lists, I'd say that I'm not unusual here.
Introducing expression assignments will make these oddities even more obvious. You'd be able to demonstrate things like this at function scope, not just with a class.
In what way?
And are you absolutely sure they will be oddities? To give an analogy, I don't think this is an oddity:
py> def func(a=1, b=a+1):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
If anyone expected that the default value for b could make use of the default value for a, the answer is: no, Python function declarations don't work that way. Maybe they could, if we wanted them to, but we don't, so they don't.
So can you explain specifically what odd function-scope behaviour you are referring to? Give an example please?
-- Steve
On Sat, Apr 21, 2018 at 5:11 PM, Steven D'Aprano steve@pearwood.info wrote:
On Sat, Apr 21, 2018 at 12:30:36PM +1000, Chris Angelico wrote:
Introducing expression assignments will make these oddities even more obvious. You'd be able to demonstrate things like this at function scope, not just with a class.
In what way?
And are you absolutely sure they will be oddities? To give an analogy, I don't think this is an oddity:
py> def func(a=1, b=a+1):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
If anyone expected that the default value for b could make use of the default value for a, the answer is: no, Python function declarations don't work that way. Maybe they could, if we wanted them to, but we don't, so they don't.
So can you explain specifically what odd function-scope behaviour you are referring to? Give an example please?
doubled_items = [x for x in (items := get_items()) if x * 2 in items]
This will leak 'items' into the surrounding scope (but not 'x').
[x for x in x if x]        # This works
[x for y in x if x := y]   # UnboundLocalError

(x for x in 5)                # TypeError
(x for _ in [1] for x in 5)   # Works
I'm sure you can come up with more examples. The outermost iterable is special and magical.
ChrisA
On 21/04/18 08:46, Chris Angelico wrote:
doubled_items = [x for x in (items := get_items()) if x * 2 in items]
This will leak 'items' into the surrounding scope (but not 'x').

At the risk of stating the obvious - wasn't there work in Python 3 to prevent leakage from comprehensions?

[x for x in x if x]        # This works
[x for y in x if x := y]   # UnboundLocalError
The standard library example given earlier notwithstanding, I can see no benefit in using the same name for the iterable and the loop target. To be honest I have trouble parsing that first version, and keeping track of which x is which (especially which x is being used in the conditional clause): surely this would be better:
[x_item for x_item in x if x_item]
Your 2nd example makes no sense to me as to the intention of the code - the re-use of the name x is confusing at best.
-- Anthony Flury email : Anthony.flury@btinternet.com Twitter : @TonyFlury https://twitter.com/TonyFlury/
On Sat, Apr 21, 2018 at 6:38 PM, Anthony Flury via Python-Dev
python-dev@python.org wrote:
On 21/04/18 08:46, Chris Angelico wrote: >
doubled_items = [x for x in (items := get_items()) if x * 2 in items]
This will leak 'items' into the surrounding scope (but not 'x').
At the risk of stating the obvious - wasn't there work in Python 3 to prevent leakage from comprehensions ? >
[x for x in x if x]        # This works
[x for y in x if x := y]   # UnboundLocalError
The standard library example given earlier notwithstanding, I can see no benefit in using the same name for the iterable and the loop target. To be honest I have trouble parsing that first version, and keeping track of which x is which (especially which x is being used in the conditional clause): surely this would be better:
[x_item for x_item in x if x_item]
Your 2nd example makes no sense to me as to the intention of the code - the re-use of the name x is confusing at best.
I agree. The change in behaviour caused by PEP 572 is basically only going to be visible if you reuse a name, or in a very few other cases like yield expressions:
def gen():
    yield [x for x in (yield 1)]

g = gen()
next(g)
g.send(range(5))
Once again, the outermost iterable is bizarre in this way.
ChrisA
On Sat, Apr 21, 2018 at 05:46:44PM +1000, Chris Angelico wrote:
On Sat, Apr 21, 2018 at 5:11 PM, Steven D'Aprano steve@pearwood.info wrote:
So can you explain specifically what odd function-scope behaviour you are referring to? Give an example please?
doubled_items = [x for x in (items := get_items()) if x * 2 in items]
This will leak 'items' into the surrounding scope (but not 'x').
The "not x" part is odd, I agree, but it's a popular feature to have comprehensions run in a separate scope, so that's working as designed.
The "leak items" part is the behaviour I desire, so that's not odd, it's sensible wink
The reason I want items to "leak" into the surrounding scope is mostly so that the initial value for it can be set with a simple assignment outside the comprehension:
items = (1, 2, 3)
[ ... items := items*2 ... ]
and the least magical way to do that is to just make items an ordinary local variable.
[x for x in x if x] # This works
The oddity is that this does work, and there's no assignment expression in sight.
Given that x is a local variable of the comprehension (for x in ...), it ought to raise UnboundLocalError, as the expanded equivalent does:

def demo():
    result = []
    for x in x:  # ought to raise UnboundLocalError
        if x:
            result.append(x)
    return result
That the comprehension version runs (rather than raising) is surprising but I wouldn't call it a bug. Nor would I say it was a language guarantee that we have to emulate in similar expressions.
In the absence of either explicit documentation of this behaviour, or Guido or one of the senior core developers explicitly stating that it is intentional behaviour that should be considered a language promise, I'd call it an accident of implementation.
In which case, the fact that your next example:
[x for y in x if x := y] # UnboundLocalError
"correctly" raises, as does the expanded version:
def demo():
    result = []
    for y in x:  # ought to raise UnboundLocalError
        x = y    # since x is a local
        if x:
            result.append(x)
    return result
shouldn't be seen as a problem. The code is different, so why should it behave the same?
(x for x in 5)                # TypeError
(x for _ in [1] for x in 5)   # Works
Now that last one is more than just odd, it is downright bizarre. Or at least it would, if it did work:
py> list((x for _ in [1] for x in 5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
TypeError: 'int' object is not iterable
Are you sure about this example?
In any case, since this has no assignment expression in it, I don't see why it is relevant.
-- Steve
On Sat, Apr 21, 2018 at 10:26 PM, Steven D'Aprano steve@pearwood.info wrote:
On Sat, Apr 21, 2018 at 05:46:44PM +1000, Chris Angelico wrote:
On Sat, Apr 21, 2018 at 5:11 PM, Steven D'Aprano steve@pearwood.info wrote:
So can you explain specifically what odd function-scope behaviour you are referring to? Give an example please?
doubled_items = [x for x in (items := get_items()) if x * 2 in items]
This will leak 'items' into the surrounding scope (but not 'x').
The "not x" part is odd, I agree, but it's a popular feature to have comprehensions run in a separate scope, so that's working as designed.
The "leak items" part is the behaviour I desire, so that's not odd, it's sensible wink
The reason I want items to "leak" into the surrounding scope is mostly so that the initial value for it can be set with a simple assignment outside the comprehension:
items = (1, 2, 3)
[ ... items := items*2 ... ]
and the least magical way to do that is to just make items an ordinary local variable.
You can't have your cake and eat it too. Iteration variables and names bound by assignment expressions are both set inside the comprehension. Either they both are local, or they both leak - or else we have a weird rule like "the outermost iterable is magical and special".
[x for x in x if x] # This works
The oddity is that this does work, and there's no assignment expression in sight.
Given that x is a local variable of the comprehension (for x in ...), it ought to raise UnboundLocalError, as the expanded equivalent does:

def demo():
    result = []
    for x in x:  # ought to raise UnboundLocalError
        if x:
            result.append(x)
    return result
That the comprehension version runs (rather than raising) is surprising but I wouldn't call it a bug. Nor would I say it was a language guarantee that we have to emulate in similar expressions.
See, that's the problem. That is NOT how the comprehension expands. It actually expands to this:
def demo(it):
    result = []
    for x in it:
        if x:
            result.append(x)
    return result

demo(iter(x))
PEP 572 corrects this by making it behave the way that you, and many other people, expect. Current behaviour is surprising because the outermost iterable is special and magical.
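(For contrast, a rough sketch - not the exact generated code - of the longhand this proposal's comprehension change aims for, with the whole body, outermost iterable included, inside the implicit function:)

def demo():
    result = []
    for x in x:  # x is local to the implicit function, so this now
        if x:    # raises UnboundLocalError, as expected above
            result.append(x)
    return result

demo()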
(x for x in 5)                # TypeError
(x for _ in [1] for x in 5)   # Works
Now that last one is more than just odd, it is downright bizarre. Or at least it would, if it did work:
py> list((x for _ in [1] for x in 5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
TypeError: 'int' object is not iterable
Are you sure about this example?
Yes, I'm sure. You may notice that I didn't iterate over the genexps in my example. The first one will bomb out, even without iteration; the second one gives a valid generator object which, if iterated over (or even stepped once), will bomb. This is because, again, the outermost iterable is special and magical.
In any case, since this has no assignment expression in it, I don't see why it is relevant.
Because an assignment expression in the outermost iterable would, if the semantics are preserved, bind in the surrounding scope. It would be FAR more logical to have it bind in the inner scope. Consider these two completely different results:
def f(*prefix):
    print([p + name for p in prefix for name in locals()])
    print([p + name for name in locals() for p in prefix])

f(" ", "$ ")
[' .0', ' p', '$ .0', '$ p', '$ name']
[' prefix', '$ prefix']
The locals() as seen by the outermost iterable are f's locals, and any assignment expression there would be part of f's locals. The locals() as seen by any other iterable, by a condition, or by the primary expression, are the list comp's locals, and any assignment expression there would be part of the list comp's locals.
ChrisA
It feels very strange that the PEP tries to do two almost entirely unrelated things. Assignment expressions are one thing, with merits and demerits discussed at length.
But "fixing" comprehension scoping is pretty much completely orthogonal. Sure, it might be a good idea. And yes there are interactions between the behaviors. However, trying to shoehorn the one issue into a PEP on a different topic makes all of it harder to accept.
The "broken" scoping in some slightly strange edge cases can and has been shown in lots of examples that don't use assignment expressions. Whether or not that should be changed needn't be linked to the real purpose of this PEP.
On Sat, Apr 21, 2018, 10:46 AM Chris Angelico rosuav@gmail.com wrote:
[snip]
On 22 April 2018 at 01:44, David Mertz mertz@gnosis.cx wrote:
It feels very strange that the PEP tries to do two almost entirely unrelated things. Assignment expressions are one thing, with merits and demerits discussed at length.
But "fixing" comprehension scoping is pretty much completely orthogonal. Sure, it might be a good idea. And yes there are interactions between the behaviors. However, trying to shoehorn the one issue into a PEP on a different topic makes all of it harder to accept.
The "broken" scoping in some slightly strange edge cases can and has been shown in lots of examples that don't use assignment expressions. Whether or not that should be changed needn't be linked to the real purpose of this PEP.
The reason it's covered in the PEP is because the PEP doesn't want to lock in the current "binds the name in the surrounding scope" semantics when assignment expressions are used in the outermost iterable in a comprehension.
However, resolving that question could be postponed more simply by making that a SyntaxError, rather than trying to move the expression evaluation inside the implicitly nested scope.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
It could also be postponed simply by saying assignment expressions follow the same semantics as other bindings in comprehensions... which are subject to change pending PEP XXXX (i.e. some different number).
On the other hand, I am one who doesn't really care about assignment expressions in comprehensions and only see the real benefit for 'if' and 'while' statements. I'm sure if it's added, I'll wind up using them in comprehensions, but I've been perfectly happy with this for years:
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
There's nothing quite analogous in current Python for:
while (command := input("> ")) != "quit":
    print("You entered:", command)
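For comparison, the closest current spellings of that loop are probably the loop-and-a-half or the two-argument iter() form (a rough sketch, reusing the prompt and sentinel from the example above):

while True:
    command = input("> ")
    if command == "quit":
        break
    print("You entered:", command)

# or, with the two-argument form of iter():
for command in iter(lambda: input("> "), "quit"):
    print("You entered:", command)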
On Sat, Apr 21, 2018 at 03:44:48PM +0000, David Mertz wrote:
It feels very strange that the PEP tries to do two almost entirely unrelated things. Assignment expressions are one thing, with merits and demerits discussed at length.
But "fixing" comprehension scoping is pretty much completely orthogonal.
This.
Sure, it might be a good idea. And yes there are interactions between the behaviors. However, trying to shoehorn the one issue into a PEP on a different topic makes all of it harder to accept.
Indeed.
And harder to understand.
As I see it, we ought to just decide on the semantics of assignment- expressions as they relate to comprehensions: do they bind to the comprehension scope, or the local scope? I prefer the second, for the reasons I stated earlier.
A third (slightly more complex) choice would be that they remain bound to the comprehension (like the loop variable) but they are initialised from any surrounding scope. I'd be happy with that as a "best of both worlds" compromise:
# don't leak from comprehensions
x = 1
[(x := y+1) for y in items if x%2 == 0]
assert x == 1
# but still support running totals and similar use-cases
total = 0
[(total := total + y) for y in items]
# and you can still get UnboundLocalError
del total
[(total := total + y) for y in items] # total has no initial value
This is not entirely unprecedented in Python: it is analogous (although not identical) to binding default values to parameters:
def running_total(items, total=total):
    # Here total is local to the function, but the default
    # is taken from the surrounding scope.
    ...
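A rough sketch of how that compromise could be modelled with today's implicit comprehension function, using a default argument to capture the surrounding value (the helper name _comp is purely illustrative and not part of any proposal):

total = 0
items = [1, 2, 3]

def _comp(items, total=total):    # default captures the surrounding value of total
    result = []
    for y in items:
        total = total + y         # rebinding stays local to _comp
        result.append(total)
    return result

assert _comp(items) == [1, 3, 6]
assert total == 0                 # nothing leaked back into the surrounding scope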
Cleaning up the odd interactions involved in comprehensions could be done separately, or later, or not at all. After all, this PEP isn't introducing those oddities. As Chris' earlier examples show, they already exist.
-- Steve
On 22 April 2018 at 02:31, Steven D'Aprano steve@pearwood.info wrote:
This is not entirely unprecedented in Python: it is analogous (although not identical) to binding default values to parameters:
def running_total(items, total=total):
    # Here total is local to the function, but the default
    # is taken from the surrounding scope.
    ...
The stronger precedent for "look up elsewhere until first use" is class scopes:
>>> x = "global"
>>> class C:
... print(x)
... x = "class attribute to be"
... print(x)
...
global
class attribute to be
However, that has its own quirks, in that it ignores function scopes entirely:
>>> def f():
... x = "function local"
... class C:
... print(x)
... x = "class attribute to be"
... print(x)
...
>>> f()
global
class attribute to be
Whereas if you don't rebind the name in the class body, the class scope can see the function local as you'd expect:
>>> def f2():
... x = "function local"
... class C:
... print(x)
...
>>> f2()
function local
While I haven't explicitly researched the full history, my assumption is that references from class scopes prior to a local name rebinding are an edge case that https://www.python.org/dev/peps/pep-0227/ didn't fully account for, so they retain their original pre-PEP-227 behaviour.
Cheers, Nick.
P.S. It may be becoming clearer why the earlier iterations of PEP 572 proposed sublocal scoping semantics for the new name binding expression: it not only gives greater differentiation from traditional assignments and limits the potential for obviously unwanted side effects like accidentally clobbering a name that's already in use, it also sidesteps a lot of these quirky name resolution issues that arise when you use full local name bindings.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Apr 22, 2018 at 12:45:36AM +1000, Chris Angelico wrote:
The reason I want items to "leak" into the surrounding scope is mostly so that the initial value for it can be set with a simple assignment outside the comprehension:
items = (1, 2, 3)
[ ... items := items*2 ... ]
and the least magical way to do that is to just make items an ordinary local variable.
You can't have your cake and eat it too. Iteration variables and names bound by assignment expressions are both set inside the comprehension.
You say that as if it were a law of physics, rather than an implementation choice.
Either they both are local, or they both leak - or else we have a weird rule like "the outermost iterable is magical and special".
We already have the rule that the outermost iterable is special, except it isn't a rule precisely, since (as far as I know) it isn't documented anywhere, nor was it ever planned as a feature. It's just an accidental(?) consequence of the implementation choices made.
py> spam = [(1,2), (3, 4)]
py> [spam for x in spam for spam in x]  # first loop is magic
[1, 2, 3, 4]
but:
py> spam = [(1,2), (3, 4)]
py> [spam for _ in [1] for x in spam for spam in x]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
UnboundLocalError: local variable 'spam' referenced before assignment
However the second example worked fine in Python 2. Changing the implementation of comprehensions to be like generator expressions and avoid leaking the loop variables seems to have had the (accidental?) side-effect of making the first loop magical.
[...]
PEP 572 corrects this by making it behave the way that you, and many other people, expect. Current behaviour is surprising because the outermost iterable is special and magical.
This shouldn't be PEP 572's job.
It's unfair on you to be shouldered with shepherding through what is effectively two complex PEPs ("assignment-expressions" plus "fix comprehension scoping") in one. Especially if you had no idea at the start that this is what is involved.
And it's even more unfair on those who may not care two hoots about assignment-expressions, but would be really interested in comprehension scoping if only they knew we were talking about that.
And it makes this PEP harder work for readers who don't care about comprehension scoping.
I think that we should be able to make any of the possible choices (or no choice at all) regarding comprehensions, regardless of what is decided about PEP 572.
[...]
Are you sure about this example?
Yes, I'm sure. You may notice that I didn't iterate over the genexps in my example.
No, I didn't notice that was your intent. I thought it was just short-hand.
The first one will bomb out, even without iteration;
And that's yet another oddity, one I didn't think of. It's downright bizarre that these two genexps behave differently:
spam = [1, 2]
eggs = 12
(x+y for x in spam for y in eggs)  # okay
(x+y for y in eggs for x in spam)  # TypeError
and I'd be surprised to learn that this behaviour was planned in advance. ("Early binding and ahead-of-time type-testing for the first loop, late binding and just-in-time type-testing for the second loop. All in favour?")
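A rough sketch of the expansion that produces this asymmetry (the helper name _genexpr is illustrative; the real lowering uses an anonymous nested scope rather than a named function):

spam = [1, 2]
eggs = 12

def _genexpr(outer):
    for x in outer:
        for y in eggs:            # eggs is only touched once iteration reaches here
            yield x + y

g = _genexpr(iter(spam))          # iter() of the outermost iterable runs immediately
# The reversed version calls iter(eggs) up front instead,
# so it raises TypeError at creation time rather than at the first next(g).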
But it is what it is, and who knows, maybe we decide we want this behaviour, bizarre as it is. It isn't clear to me that:
1. it's necessarily "broken" and needs fixing;
2. if it does need fixing, it needs to be fixed right now;
3. that acceptance or rejection of PEP 572 needs to hinge on the decision about comprehensions;
4. and especially that a change to comprehensions ought to be smuggled in via an unrelated PEP.
(I know that 4 is not your intention, but that's the way it may appear.)
-- Steve
On 22 April 2018 at 03:41, Steven D'Aprano steve@pearwood.info wrote:
We already have the rule that the outermost iterable is special, except it isn't a rule precisely, since (as far as I know) it isn't documented anywhere, nor was it ever planned as a feature.
It's a deliberate feature of generator expressions: https://www.python.org/dev/peps/pep-0289/#the-details
Covered in the language reference here: https://docs.python.org/3/reference/expressions.html#generator-expressions
It's then inherited by comprehensions through the intentional semantic equivalences between:
[x for x in iterable] <-> list(x for x in iterable)
{x for x in iterable} <-> set(x for x in iterable)
{k(x): v(x) for x in iterable} <-> dict((k(x), v(x)) for x in iterable)
The consequences of those equivalences weren't historically spelled out in the language reference, but we fixed that omission when deprecating the use of "yield" & "yield from" in comprehensions for 3.7: https://github.com/python/cpython/commit/73a7e9b10b2ec9636e3c6396cf7b3695f8e... (we had to in order to explain why you can still use "yield" and "yield from" in the outermost iterable - it's because the restriction only applies to the implicitly nested scope).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2018-04-21, 07:46 GMT, Chris Angelico wrote:
doubled_items = [x for x in (items := get_items()) if x * 2 in items]
Aside from other concerns expressed elsewhere by other people, do you really like this? I know and agree that “readability” is a subjective term, but it is my firm persuasion that whenever I need to think about what a particular list comprehension means, that is the moment I should write a separate function or a normal loop. I think we should encourage writing simple (I didn’t use the r* word here, you see!) code rather than something which has the potential to slide us towards Perl.
Best,
https://matej.ceplovi.cz/blog/, Jabber: mcepl@ceplovi.cz GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8
A day without sunshine is like night.
On 21 April 2018 at 17:11, Steven D'Aprano steve@pearwood.info wrote:
So can you explain specifically what odd function-scope behaviour you are referring to? Give an example please?
Once we allow name binding as an expression, there are three main cases to consider in comprehensions: bindings in the result expression, bindings in the filter conditions, and bindings in the iterable expressions.
The first two cases are fine (they happen in the implicit nested scope, and hence don't affect the scope containing the comprehension), but the behaviour in the third case bothered people, because it broke down into two distinct subcases:
3a. For the outermost iterable, the binding always happens in the surrounding scope, and hence will not be accessible from the rest of the comprehension when used at class scope.
3b. For any nested iterables, the binding happens in the implicit nested scope, as for other comprehension subexpressions.
In the original version of PEP 572 (the one with sublocal scopes), the consequences of 3a were just that you couldn't meaningfully use assignment expressions in the outermost iterable expression of a comprehension, since neither the implicitly nested scope nor the surrounding scope would be able to see them. Too bad, so sad, don't do that then (since it's pointless).
In the revised version of PEP 572 that just used regular local assignment, the side effects of 3a were more concerning, since they meant that we'd be bringing back the comprehension variable leakage problem, albeit in a far more esoteric form. Even less defensibly, the construct would just work at function scope, but would define an unexpected module attribute at module scope, and simply not work at all at class scope. Hence the changes to the PEP to move even the evaluation of the outermost iterable inside the implicitly nested scope, rather than leaving it outside the way it is now.
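A rough sketch of the two lowerings being contrasted here (the helper names and get_items() are illustrative only):

def get_items():                          # stand-in data source
    return [1, 2, 3]

# Python 3.7-style lowering: the outermost iterable is evaluated in the
# enclosing scope, so a binding there lands in that scope (case 3a).
def _listcomp_37(outer):
    result = []
    for x in outer:
        result.append(x)
    return result

_listcomp_37(iter(items := get_items()))  # 'items' binds out here

# Lowering under the revised proposal: the whole body, outermost iterable
# included, moves inside the implicit function.
def _listcomp_572():
    result = []
    for x in (inner := get_items()):      # 'inner' stays local to the implicit function
        result.append(x)
    return result

_listcomp_572()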
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 21 April 2018 at 03:30, Chris Angelico rosuav@gmail.com wrote:
There's that word "readability" again. Sometimes I wish the Zen of Python didn't use it, because everyone seems to think that "readable" means "code I like".
Readability is subjective, yes. But it's not the same as "liking". If a significant number of people say that they find a piece of code hard to read/understand, then that's a problem. It's all too easy to say "you don't have to write code like that", but as someone who has been a maintenance programmer for his whole career, I can tell you that people don't always have that luxury. And supporting code that's written in a language that prioritises "readability" (whatever that may mean) is a much easier task than supporting code written in a language that doesn't. There's a reason far fewer people write systems in Perl these days, and it's not because you can't write clear and maintainable code in Perl...
I think that enough people have flagged up readability concerns, that the PEP should take that seriously. One improvement would be to limit the proposal to assignment of simple names (there have been far fewer complaints about readability for such examples). Another would be to simply address the concern more seriously than the current "This can be used to create ugly code" section. How about a heading like "Code that uses this construct could be difficult to maintain", with a discussion that acknowledges that maintenance programmers usually didn't write the code, and sometimes don't have the freedom to rewrite working code. It could mention other easy-to-misuse constructs like over-complicated comprehensions, and point out that in practice things haven't turned out as bad with those as was feared.
Paul.
On Sat, Apr 21, 2018 at 7:12 PM, Paul Moore p.f.moore@gmail.com wrote:
On 21 April 2018 at 03:30, Chris Angelico rosuav@gmail.com wrote:
There's that word "readability" again. Sometimes I wish the Zen of Python didn't use it, because everyone seems to think that "readable" means "code I like".
Readability is subjective, yes. But it's not the same as "liking". If a significant number of people say that they find a piece of code hard to read/understand, then that's a problem. It's all too easy to say "you don't have to write code like that", but as someone who has been a maintenance programmer for his whole career, I can tell you that people don't always have that luxury. And supporting code that's written in a language that prioritises "readability" (whatever that may mean) is a much easier task than supporting code written in a language that doesn't. There's a reason far fewer people write systems in Perl these days, and it's not because you can't write clear and maintainable code in Perl...
But you haven't answered anything about what "readable" means. Does it mean "if I look at this code, I can predict what dis.dis() would output"? Or does it mean "this code clearly expresses an algorithm and the programmer's intent"? Frequently I hear people complain that something is unreadable because it fails the former check. I'm much more interested in the latter check. For instance, this line of code expresses the concept "generate the squares of odd numbers":
[x*x for x in range(100) if x % 2]
But it doesn't clearly express the disassembly. Is that a problem? Are list comprehensions a bad feature for that reason? I don't think so.
ChrisA
On 21/04/18 11:18, Chris Angelico wrote:
But you haven't answered anything about what "readable" means. Does it mean "if I look at this code, I can predict what dis.dis() would output"? Or does it mean "this code clearly expresses an algorithm and the programmer's intent"? Frequently I hear people complain that something is unreadable because it fails the former check. I'm much more interested in the latter check. For instance, this line of code expresses the concept "generate the squares of odd numbers":
[x*x for x in range(100) if x % 2]
But it doesn't clearly express the disassembly. Is that a problem? Are list comprehensions a bad feature for that reason? I don't think so.
ChrisA
For what it worth - readability for me is all about understanding the intent. I don't care (most of the time) about how the particular code construct is actually implemented. When I am maintaining code (or trying to) I need to understand what the developer intended (or in the case of a bug, the gap between the outcome and the intention).
One of the challenges about readability is it partially depends on skill level - for a beginner the comprehension may well be baffling where as someone with more skills would understand it - almost intuitively; as an example: I have been using Python for 7 years - and comprehensions with more than one for loop still are not intuitive for me, I can't read them without an amount of deep thought about how the loops work together.
Anthony Flury email : Anthony.flury@btinternet.com Twitter : @TonyFlury https://twitter.com/TonyFlury/
Chris Angelico writes:
There's that word "readability" again. Sometimes I wish the Zen of Python didn't use it, because everyone seems to think that "readable" means "code I like".
Hey, man, that hurts. Some of us not only have precise statements of the aspects of readability we invoked, but we provided them in our posts.
I sympathize that you get tired of the repetitions of various claims about readability, as well as the proliferation of purely subjective claims about it, but that doesn't mean they deserve to be dismissed this way.
That said, subjectivity is a real problem, and it's not a PEP protagonist's responsibility to deal with it. I would like to recommend that posters who want to make claims about "readability" remember that it's one aspect of Pythonic language design out of many, and that it is a complex concept itself. A claim about readability is a claim about an aspect of the value of a construct, but the word "readability" alone is too subjective to explain why it has (or lacks) that value.
If you can't describe, let alone define, what you mean by "readable", but still feel strongly enough to post, think whether you really mean that word. If so, apologize for that lack, because PEP protagonists have no obligation to figure out what you can't. If not, find another word to express the feeling. Or postpone posting until you have a description or better word.
Language design has a lot of these words involving complexity and subjectivity: readability, expressiveness, power. Remember that your "vote" counts even when subjective: you don't have to justify your likes. But if it's personal, you can express that succinctly with +1/-1. If you want to claim that the subjective feeling is common to many users of Python, you need to communicate it. Try to define the aspect you're using. Do so explicitly, unless the definition you're using was given already in a recent post and is current in the thread.
Steve
Round 2 (Changed order, see below):
1. with open(fn) as f: # current behavior
2. with (open(fn) as f): # same
3. with closing(urlopen(url)) as dl: # current behavior
5. with (closing(urlopen(url)) as dl): # same
4. with closing(urlopen(url) as dl): # urlopener named early
On 2018-04-20 17:15, Chris Angelico wrote:
The second and fifth could be special cased as either the same as first and third, or as SyntaxErrors. (But which?)
If they are expressions, they should be the same once evaluated, no?
(I had a brief episode where I wrote that "as" was required with "with", instead of the CM object, sorry. :)
The fourth one is very tricky. If 'expr as name' is allowed inside arbitrary expressions, why shouldn't it be allowed there?
Yes, they should be allowed there.
The disconnect between viable syntax and useful statements is problematic here.
Number 4 appears to name the urlopener early. Since closing() returns it as well, might it work anyway?
Might be missing something else, but #4 looks like a mistake with the layout of the parentheses, which can happen anywhere. I don't get the sense it will happen often.
Cheers, -Mike
On Sun, Apr 22, 2018 at 12:04 PM, Mike Miller python-dev@mgmiller.net wrote:
Round 2 (Changed order, see below):
1. with open(fn) as f: # current behavior
2. with (open(fn) as f): # same
3. with closing(urlopen(url)) as dl: # current behavior
5. with (closing(urlopen(url)) as dl): # same
4. with closing(urlopen(url) as dl): # urlopener named early
On 2018-04-20 17:15, Chris Angelico wrote: >
The second and fifth could be special cased as either the same as first and third, or as SyntaxErrors. (But which?)
If they are expressions, they should be the same once evaluated, no?
(I had a brief episode where I wrote that "as" was required with "with", instead of the CM object, sorry. :)
The fourth one is very tricky. If 'expr as name' is allowed inside arbitrary expressions, why shouldn't it be allowed there?
Yes, they should be allowed there.
The disconnect between viable syntax and useful statements is problematic here.
Number 4 appears to name the urlopener early. Since closing() returns it as well, might it work anyway?
Might be missing something else, but #4 looks like a mistake with the layout of the parentheses, which can happen anywhere. I don't get the sense it will happen often.
It's actually semantically identical to option 3, but not semantically identical to option 5, unless there is a magical special case that says that a 'with' statement is permitted to have parentheses for no reason. The 'closing' context manager returns the inner CM, not the closing CM itself. If we rewrite these into approximate equivalents without the 'with' statement, what we have is this:
1. with open(fn) as f:  # current behavior

   file = open(fn)
   f = file.__enter__()
   assert file is f  # passes for file objects
2. with (open(fn) as f):  # same

   f = open(fn)
   f.__enter__()
   # The return value from __enter__ is discarded
3. with closing(urlopen(url)) as dl:  # current behavior

   downloader = urlopen(url)
   closer = closing(downloader)
   dl = closer.__enter__()
   assert dl is downloader  # passes for closing objects
5. with (closing(urlopen(url)) as dl):  # same

   downloader = urlopen(url)
   dl = closing(downloader)
   dl.__enter__()
   # Return value from __enter__ is discarded
4. with closing(urlopen(url) as dl):  # urlopener named early

   dl = urlopen(url)
   closer = closing(dl)
   closer.__enter__()
   # Return value is discarded again
Notice how there are five distinctly different cases here. When people say there's a single obvious way to solve the "with (expr as name):" case, they generally haven't thought about all the different possibilities. (And I haven't mentioned the possibility that __enter__ returns something that you can't easily reference from inside the expression, though it's not materially different from closing().)
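The closing() behaviour that drives the difference between these cases can be checked directly (a small sketch; the Resource class is just a stand-in for any object with a close() method):

from contextlib import closing

class Resource:
    def close(self):
        print("closed")

r = Resource()
cm = closing(r)
assert cm.__enter__() is r        # closing() hands back the wrapped object...
assert cm is not r                # ...which is not the context manager itself
cm.__exit__(None, None, None)     # calls r.close()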
There are a few ways to handle it. One is to create a special case in the grammar for 'with' statement parentheses:
with_stmt: 'with' with_item (',' with_item)* ':' suite
with_item: (test ['as' expr]) | ('(' test ['as' expr] ')')
which will mean that these two do the same thing:
with spam as ham:
with (spam as ham):
but this won't:
with ((spam as ham)):
And even with that special case, the use of 'as' inside a 'with' statement is subtly different from its behaviour anywhere else, so it would be confusing. So a better way is to straight-up disallow 'as' expressions inside 'with' headers (meaning you get a SyntaxError if the behaviour would be different from the unparenthesized form). Still confusing ("why can't I do this?"). And another way is to just not use 'as' at all, and pick a different syntax. That's why the PEP now recommends ':='.
ChrisA
On 2018-04-21 19:57, Chris Angelico wrote:
Thanks for being patient.
Looks like the crux of the issue is that "with … as" binds the result of the enter function rather than the context-manager object, as it might first appear. Therefore it's not compatible with how "as" is used for direct name bindings after "import" statements or this sub-proposal. Additionally, "except Class as instance" names the instance rather than the class.
So, the "as" keyword is already operating at an intuitive level rather than idealistic perfection. Three different semantics for import/with/except, correct? This sub-proposal lines up with the import use, I believe.
Given that there are no use cases for using assignment-expressions in the import/with/except statements, and it could be documented that if one insists an extra set of parens could make it work:
with (callable() as cm_obj) as enter_result_obj:
pass
It doesn't feel like this issue should be a blocker.
TL;DR - Been feebly trying to make the argument that everyday "intuitive consistency" (where the expression will be used) is more important than avoiding theoretical problems. I've never seen complex with/except statements in the wild and don't expect this feature to significantly alter that.
-Mike
On 2018-04-22 12:37, Chris Angelico wrote:
Kinda, except that that's not quite a match either. But mainly, the comparison with 'with' and 'except' is dangerously incompatible.
Hmm, looks very close conceptually, though mechanics are different.
Dangerous feels like an exaggeration however. I've made the argument that occurrences would be very rare, but if I'm wrong, the code should blow up on its first run. Perhaps a sanity check could be put in?
There is a section of your PEP that argues against the "bad code could potentially be written" argument, and I think it applies here.
Maybe not, but why not just use ':=' to avoid that?
Don't hate it but feels like Pascal and C and not "Pythonic." Too many colons, avoiding the questions about the difference between "=" and ":=". Expression first is another win. People know how to use "as".
Intuitive consistency isn't enough to handle complex cases. Programming languages that favour intuitive consistency end up with a million special cases.
Ok, but I think we have all the tools we need here, there's just an extra place to stub your toe out in the weeds.
To turn the question around, are we really worried that this awkward code (or some variant) is about to be written?
with (cm_obj := callable()) as enter_result_obj:
cm_obj.write() # AttributeError
If not, I argue it is a theoretical problem that, if hit, blows up immediately.
-Mike
On Mon, Apr 23, 2018 at 6:22 AM, Mike Miller python-dev@mgmiller.net wrote:
On 2018-04-22 12:37, Chris Angelico wrote:
Kinda, except that that's not quite a match either. But mainly, the comparison with 'with' and 'except' is dangerously incompatible.
Hmm, looks very close conceptually, though mechanics are different.
Dangerous feels like an exaggeration however. I've made the argument that occurrences would be very rare, but if I'm wrong, the code should blow up on its first run. Perhaps a sanity check could be put in?
with open(fn) as f:
with (open(fn) as f):
These two do the same thing, but only because a file object's __enter__ returns self. So it's dangerous, because it WILL work... and people will get into the habit of parenthesizing to permit a 'with' statement to go across line breaks. And then they'll use a different context manager, like closing(), or a PsycoPG2 database connection (I think), where it returns something else. And it'll work, until they go over multiple lines, and then suddenly the semantics change. It's as bad as writing JavaScript code like this:
function f(x) {
    return x
        + 1;
}
and then transforming it to this:
function f(x) {
    return
        x + 1;
}
and having it change in behaviour. (Yes, it happens. Welcome to JavaScript, where implicit semicolons are a thing.)
Intuitive consistency isn't enough to handle complex cases. Programming languages that favour intuitive consistency end up with a million special cases.
Ok, but I think we have all the tools we need here, there's just an extra place to stub your toe out in the weeds.
To turn the question around, are we really worried that this awkward code (or some variant) is about to be written?
with (cm_obj := callable()) as enter_result_obj:
cm_obj.write() # AttributeError
If not, I argue it is a theoretical problem that, if hit, blows up immediately.
Were it to blow up immediately, I wouldn't be too bothered.
ChrisA
On 2018-04-22 14:33, Chris Angelico wrote:
with open(fn) as f:
with (open(fn) as f):
These two do the same thing, but only because a file object's __enter__ returns self. So it's dangerous, because it WILL work... and people will get into the habit of parenthesizing to permit a 'with' statement to go across line breaks. And then they'll use a different context manager, like closing(), or a PsycoPG2 database connection (I think), where it returns something else. And it'll work, until they go over multiple lines, and then suddenly the semantics change.
Why do you think folks will be rushing to parenthesize with statements when it has always been a syntax error, there is years of code and docs that show otherwise, no use cases, and will take years for 3.8 to trickle out?
Seems remote, and there are mitigations that could be done.
Again it's back to "people could write bad code," but they already can with only +/* and ().
-Mike
On Mon, Apr 23, 2018 at 8:20 AM, Mike Miller python-dev@mgmiller.net wrote: >
On 2018-04-22 14:33, Chris Angelico wrote: >
with open(fn) as f:
with (open(fn) as f):
These two do the same thing, but only because a file object's __enter__ returns self. So it's dangerous, because it WILL work... and people will get into the habit of parenthesizing to permit a 'with' statement to go across line breaks. And then they'll use a different context manager, like closing(), or a PsycoPG2 database connection (I think), where it returns something else. And it'll work, until they go over multiple lines, and then suddenly the semantics change.
Why do you think folks will be rushing to parenthesize with statements when it has always been a syntax error, there is years of code and docs that show otherwise, no use cases, and will take years for 3.8 to trickle out?
Because it's been requested a number of times as a way to allow a 'with' statement to go across lines without backslashes.
If it becomes possible, it will be used.
ChrisA
Please stop debating <expr> as <name>. Nobody is being swayed by anything in this subthread. Let's move on.
-- --Guido van Rossum (python.org/~guido)
On 2018-04-17 08:46, Chris Angelico wrote:
Having survived four rounds in the boxing ring at python-ideas, PEP 572 is now ready to enter the arena of python-dev.
I would like to suggest one more motivating example for "Capturing condition values": multiple regex matches with 'elif'.
if match := re.search(pat1, text): print("Found one:", match.group(0)) elif match := re.search(pat2, text): print("Found two:", match.group(0)) elif match := re.search(pat3, text): print("Found three:", match.group(0))
Without assignment expressions, you have an annoying choice between a cascade of 'else's with an ever-increasing indent and evaluating all the matches up front (so doing unnecessary work).
-M-
[Matthew Woodcraft matthew@woodcraft.me.uk]
I would like to suggest one more motivating example for "Capturing condition values": multiple regex matches with 'elif'.
if match := re.search(pat1, text): print("Found one:", match.group(0)) elif match := re.search(pat2, text): print("Found two:", match.group(0)) elif match := re.search(pat3, text): print("Found three:", match.group(0))
Without assignment expressions, you have an annoying choice between a cascade of 'else's with an ever-increasing indent and evaluating all the matches up front (so doing unnecessary work).
That's a reasonable use, but would more likely be written like so today:
for tag, pat in (("one", pat1), ("two", pat2), ("three", pat3). ("four", pat4), ...): match = re.search(pat, text) if match: print("Found", tag + ":", match.group(0)) break
Which would still read a bit nicer if the first two loop body lines could be collapsed to
if match := re.search(pat, text):
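i.e. something along these lines (a sketch; pat1..pat3 and text are assumed to be defined as in the earlier examples):

import re

for tag, pat in (("one", pat1), ("two", pat2), ("three", pat3)):
    if match := re.search(pat, text):
        print("Found", tag + ":", match.group(0))
        break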
On 2018-04-21 19:02, Tim Peters wrote:
[Matthew Woodcraft matthew@woodcraft.me.uk]
I would like to suggest one more motivating example for "Capturing condition values": multiple regex matches with 'elif'.
if match := re.search(pat1, text): print("Found one:", match.group(0)) elif match := re.search(pat2, text): print("Found two:", match.group(0)) elif match := re.search(pat3, text): print("Found three:", match.group(0))
Without assignment expressions, you have an annoying choice between a cascade of 'else's with an ever-increasing indent and evaluating all the matches up front (so doing unnecessary work).
That's a reasonable use, but would more likely be written like so today:
for tag, pat in (("one", pat1), ("two", pat2), ("three", pat3). ("four", pat4), ...): match = re.search(pat, text) if match: print("Found", tag + ":", match.group(0)) break
Well, that's a reason to make the example a bit more realistic, then.
Say:
if match := re.search(pat1, text):
    do_something_with(match.group(0))
elif match := re.search(pat2, text):
    do_something_else_with(match.group(0), match.group(1))
elif match := re.search(pat3, text):
    do_some_other_things_with(match.group(0))
    and_also_with(match.group(1), match.group(2))
-M-
On Sat, Apr 21, 2018 at 08:35:51PM +0100, Matthew Woodcraft wrote:
Well, that's a reason to make the example a bit more realistic, then.
Say:
if match := re.search(pat1, text):
    do_something_with(match.group(0))
elif match := re.search(pat2, text):
    do_something_else_with(match.group(0), match.group(1))
elif match := re.search(pat3, text):
    do_some_other_things_with(match.group(0))
    and_also_with(match.group(1), match.group(2))
I don't think that a bunch of generic "do_something_with" functions is precisely "realistic".
If I saw something like that, I'd try very hard to find a way to refactor it into code like this:
for handler in handlers:
    if handler.match(text):
        handler.process()
        break
else:
    # handle no-match case here
where the knowledge of what to search for, where to search for it, how to search for it, and what to do when found, was encapsulated in the handler objects. Your tastes may vary.
But your point is well-taken that the version with binding assignment (thanks Tim!) is nicer to read than the current procedural version:
match = re.search(pat1, text)
if match:
    do_something_with(match.group(0))
else:
    match = re.search(pat2, text)
    if match:
        do_something_else_with(match.group(0), match.group(1))
    else:
        match = re.search(pat3, text)
        if match:
            do_some_other_things_with(match.group(0))
            and_also_with(match.group(1), match.group(2))
I just don't think it counts as a motivating use-case distinct from the single match case.
-- Steve
On Sat, Apr 21, 2018 at 6:13 PM, Steven D'Aprano steve@pearwood.info wrote:
On Sat, Apr 21, 2018 at 08:35:51PM +0100, Matthew Woodcraft wrote:
Well, that's a reason to make the example a bit more realistic, then.
Say:
if match := re.search(pat1, text):
    do_something_with(match.group(0))
elif match := re.search(pat2, text):
    do_something_else_with(match.group(0), match.group(1))
elif match := re.search(pat3, text):
    do_some_other_things_with(match.group(0))
    and_also_with(match.group(1), match.group(2))
I don't think that a bunch of generic "do_something_with" functions is precisely "realistic".
If I saw something like that, I'd try very hard to find a way to refactor it into code like this:
for handler in handlers:
    if handler.match(text):
        handler.process()
        break
else:
    # handle no-match case here
where the knowledge of what to search for, where to search for it, how to search for it, and what to do when found, was encapsulated in the handler objects. Your tastes may vary.
But your point is well-taken that the version with binding assignment (thanks Tim!) is nicer to read than the current procedural version:
match = re.search(pat1, text)
if match:
    do_something_with(match.group(0))
else:
    match = re.search(pat2, text)
    if match:
        do_something_else_with(match.group(0), match.group(1))
    else:
        match = re.search(pat3, text)
        if match:
            do_some_other_things_with(match.group(0))
            and_also_with(match.group(1), match.group(2))
I just don't think it counts as a motivating use-case distinct from the single match case.
The version of this code found in reality is not as regular as the example quoted, and the rebuttal "but I would rewrite it with a loop" shoots a straw man. To me the if-elif-elif portion of the example is very much a separate motivation, since being able to put the assignment in the elif clause avoids runaway indentation. I've regretted not being able to use elif in this kind of situation many times, whereas in the single match case I don't find it a burden to assign the variable in a separate statement preceding the if-clause. (I guess this is a case of "flat is better than nested" -- thanks Tim! :-)
-- --Guido van Rossum (python.org/~guido)
[Matthew Woodcraft]
Well, that's a reason to make the example a bit more realistic, then.
Say:
if match := re.search(pat1, text):
    do_something_with(match.group(0))
elif match := re.search(pat2, text):
    do_something_else_with(match.group(0), match.group(1))
elif match := re.search(pat3, text):
    do_some_other_things_with(match.group(0))
    and_also_with(match.group(1), match.group(2))
[Steven D'Aprano steve@pearwood.info]
I don't think that a bunch of generic "do_something_with" functions is precisely "realistic".
If I saw something like that, I'd try very hard to find a way to refactor it into code like this:
for handler in handlers:
    if handler.match(text):
        handler.process()
        break
else:
    # handle no-match case here
where the knowledge of what to search for, where to search for it, how to search for it, and what to do when found, was encapsulated in the handler objects. Your tastes may vary.
But your point is well-taken that the version with binding assignment (thanks Tim!) is nicer to read than the current procedural version:
match = re.search(pat1, text)
if match:
    do_something_with(match.group(0))
else:
    match = re.search(pat2, text)
    if match:
        do_something_else_with(match.group(0), match.group(1))
    else:
        match = re.search(pat3, text)
        if match:
            do_some_other_things_with(match.group(0))
            and_also_with(match.group(1), match.group(2))
I just don't think it counts as a motivating use-case distinct from the single match case.
[Guido]
The version of this code found in reality is not as regular as the example quoted, and the rebuttal "but I would rewrite it with a loop" shoots a straw man. To me the if-elif-elif portion of the example is very much a separate motivation, since being able to put the assignment in the elif clause avoids runaway indentation. I've regretted not being able to use elif in this kind of situation many times, whereas in the single match case I don't find it a burden to assign the variable in a separate statement preceding the if-clause. (I guess this is a case of "flat is better than nested" -- thanks Tim! :-)
Au contraire - thank you for forcing me to channel you succinctly lo those many years ago ;-)
And for pointing out this real use case, which I'm not sure has been stressed before. The PEP could clearly use more motivating examples, and this is a fine class of them. Few things are more maddening than runaway cascading indentation :-(
And noting again that a simple "binding expression" (my neologism for identifier ":=" expression, to break the reflexive horror at imagining the full complexity of assignment statements being allowed everywhere expressions are allowed) is sufficient to address it.
This example makes me want “if expr as name:” (same semantics as ‘with’, and the name is always bound to the expression result regardless of truthiness), but doesn’t move me on assignment expressions.
Cheers, Steve
Top-posted from my Windows phone
Hi,
On 17/04/18 08:46, Chris Angelico wrote:
Having survived four rounds in the boxing ring at python-ideas, PEP 572 is now ready to enter the arena of python-dev.
I'm very strongly opposed to this PEP.
Would Python be better with two subtly different assignment operators? The answer of "no" seems self evident to me.
Do we need an assignment expression at all (regardless of the chosen operator)? I think we do not.

Assignment is clear at the moment largely because of the context; it can only occur at the statement level. Consequently, assignment and keyword arguments are never confused, despite having the same form: name = expr
The PEP uses the term "simplifying" when it really means "shortening". One example is

stuff = [[y := f(x), x/y] for x in range(5)]

as a simplification of

stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
IMO, the "simplest" form of the above is the named helper function.
def meaningful_name(x):
    t = f(x)
    return t, x/t
[meaningful_name(i) for i in range(5)]
Is longer, but much simpler to understand.
I am also concerned that the ability to put assignments anywhere allows weirdnesses like these:
try:
    ...
except (x := Exception) as x:
    ...

with (x := open(...)) as x:
    ...

def do_things(fire_missiles=False, plant_flowers=False): ...
do_things(plant_flowers:=True)  # whoops!
It is easy to say "don't do that", but why allow it in the first place?
Cheers, Mark.
On Tue, May 1, 2018 at 12:30 AM, Mark Shannon mark@hotpy.org wrote:
The PEP uses the term "simplifying" when it really means "shortening". One example is

stuff = [[y := f(x), x/y] for x in range(5)]

as a simplification of

stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
Now try to craft the equivalent that captures the condition in an if:
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
Do that one with a lambda function.
IMO, the "simplest" form of the above is the named helper function.
def meaningful_name(x):
    t = f(x)
    return t, x/t
[meaningful_name(i) for i in range(5)]
Is longer, but much simpler to understand.
Okay, but what if there is no meaningful name? It's easy to say "pick a meaningful name". It's much harder to come up with an actual name that is sufficiently meaningful that a reader need not go look at the definition of the function.
I am also concerned that the ability to put assignments anywhere allows weirdnesses like these:
try:
    ...
except (x := Exception) as x:
    ...

with (x := open(...)) as x:
    ...
We've been over this argument plenty, and I'm not going to rehash it.
def do_things(fire_missiles=False, plant_flowers=False): ...
do_things(plant_flowers:=True)  # whoops!
If you want your API to be keyword-only, make it keyword-only. If you want a linter that recognizes unused variables, get a linter that recognizes unused variables. Neither of these is the fault of the proposed syntax; you could just as easily write this:
do_things(plant_flowers==True)
but we don't see myriad reports of people typing too many characters and blaming the language.
ChrisA
Le 30/04/2018 à 17:30, Chris Angelico a écrit :
def do_things(fire_missiles=False, plant_flowers=False): ...
do_things(plant_flowers:=True)  # whoops!
If you want your API to be keyword-only, make it keyword-only. If you want a linter that recognizes unused variables, get a linter that recognizes unused variables. Neither of these is the fault of the proposed syntax; you could just as easily write this:
do_things(plant_flowers==True)
Unless you have a plant_flowers variable already defined, this will raise a NameError, not plant a silent bug.
Regards
Antoine.
On Mon, Apr 30, 2018 at 11:32 AM Chris Angelico rosuav@gmail.com wrote:
On Tue, May 1, 2018 at 12:30 AM, Mark Shannon mark@hotpy.org wrote:
The PEP uses the term "simplifying" when it really means "shortening". One example is

stuff = [[y := f(x), x/y] for x in range(5)]

as a simplification of

stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
Now try to craft the equivalent that captures the condition in an if:
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
Easy:
results = []
for x in input_data:
    y = f(x)
    if y > 0:
        results.append((x, y, x/y))
Longer, but way more readable and debuggable if you're into that. This has worked for us many years and only a handful of people complained about this.
OTOH, I see plenty of people complaining that nested list comprehensions are hard to read. In my own code reviews I ask people to avoid using complex comprehensions all the time.
Do that one with a lambda function.
Why would I? Is using lambda functions mandatory?
IMO, the "simplest" form of the above is the named helper function.
def meaningful_name(x):
    t = f(x)
    return t, x/t
[meaningful_name(i) for i in range(5)]
Is longer, but much simpler to understand.
Okay, but what if there is no meaningful name? It's easy to say "pick a meaningful name". It's much harder to come up with an actual name that is sufficiently meaningful that a reader need not go look at the definition of the function.
That's a weird argument, Chris :-)
If f(x) has no meaningful name, then what is the result of the comprehension? Perhaps some meaningless data? ;)
I am also concerned that the ability to put assignments anywhere allows weirdnesses like these:
try:
    ...
except (x := Exception) as x:
    ...

with (x := open(...)) as x:
    ...
We've been over this argument plenty, and I'm not going to rehash it.
Hand-waving the question the way you do simply alienates more core devs to the PEP. And PEP 572 hand-waves a lot of questions and concerns. Asking people to dig for answers in 700+ emails about the PEP is a bit too much, don't you agree?
I think it's PEP's author responsibility to address questions right in their PEP.
def do_things(fire_missiles=False, plant_flowers=False): ...
do_things(plant_flowers:=True)  # whoops!
If you want your API to be keyword-only, make it keyword-only. If you
Another hand-waving. Should we deprecate passing arguments by name if their corresponding parameters are not keyword-only?
Mark shows another potential confusion between '=' and ':=' that people will have, and it's an interesting one.
want a linter that recognizes unused variables, get a linter that recognizes unused variables.
Many want Python to be readable and writeable without linters.
Neither of these is the fault of the proposed syntax; you could just as easily write this:
do_things(plant_flowers==True)
but we don't see myriad reports of people typing too many characters and blaming the language.
Strange. I see people who struggle to format their code properly or use the language properly every day ;)
Yury
On Tue, May 1, 2018 at 2:53 AM, Yury Selivanov yselivanov.ml@gmail.com wrote:
On Mon, Apr 30, 2018 at 11:32 AM Chris Angelico rosuav@gmail.com wrote:
On Tue, May 1, 2018 at 12:30 AM, Mark Shannon mark@hotpy.org wrote:
The PEP uses the term "simplifying" when it really means "shortening". One example is

stuff = [[y := f(x), x/y] for x in range(5)]

as a simplification of

stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
Now try to craft the equivalent that captures the condition in an if:
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
Easy:
results = []
for x in input_data:
    y = f(x)
    if y > 0:
        results.append((x, y, x/y))
Longer, but way more readable and debuggable if you're into that. This has worked for us many years and only a handful of people complained about this.
OTOH, I see plenty of people complaining that nested list comprehensions are hard to read. In my own code reviews I ask people to avoid using complex comprehensions all the time.
Do that one with a lambda function.
Why would I? Is using lambda functions mandatory?
The claim was that assignment expressions were nothing more than a shorthand for lambda functions. You can't rewrite my example with a lambda function, ergo assignment expressions are not a shorthand for lambda functions.
Do you agree?
IMO, the "simplest" form of the above is the named helper function.
def meaningful_name(x):
    t = f(x)
    return t, x/t
[meaningful_name(i) for i in range(5)]
Is longer, but much simpler to understand.
Okay, but what if there is no meaningful name? It's easy to say "pick a meaningful name". It's much harder to come up with an actual name that is sufficiently meaningful that a reader need not go look at the definition of the function.
That's a weird argument, Chris :-)
If f(x) has no meaningful name, then what is the result of the comprehension? Perhaps some meaningless data? ;)
f(x) might have side effects. Can you give a meaningful name to the trivial helper function? Not every trivial helper can actually have a name that saves people from having to read the body of the function.
I am also concerned that the ability to put assignments anywhere allows weirdnesses like these:
try:
    ...
except (x := Exception) as x:
    ...

with (x := open(...)) as x:
    ...
We've been over this argument plenty, and I'm not going to rehash it.
Hand-waving the question the way you do simply alienates more core devs to the PEP. And PEP 572 hand-waves a lot of questions and concerns. Asking people to dig for answers in 700+ emails about the PEP is a bit too much, don't you agree?
I think it's PEP's author responsibility to address questions right in their PEP.
If I answer every question, I make that number into 800+, then 900+, then 1000+. If I don't, I'm alienating everyone by being dismissive. If every question is answered in the PEP, the document itself becomes so long that nobody reads it. Damned if I do, damned if I don't. Got any alternative suggestions?
def do_things(fire_missiles=False, plant_flowers=False): ...
do_things(plant_flowers:=True)  # whoops!
If you want your API to be keyword-only, make it keyword-only. If you
Another hand-waving. Should we deprecate passing arguments by name if their corresponding parameters are not keyword-only?
Mark shows another potential confusion between '=' and ':=' that people will have, and it's an interesting one.
A very rare one compared to the confusions that we already have with '=' and '=='. And this is another argument that we've been over, multiple times.
want a linter that recognizes unused variables, get a linter that recognizes unused variables.
Many want Python to be readable and writeable without linters.
And it will be. But there are going to be certain types of bug that you won't catch as quickly. You can't use language syntax to catch every bug. That's provably impossible.
Neither of these is the fault of the proposed syntax; you could just as easily write this:
do_things(plant_flowers==True)
but we don't see myriad reports of people typing too many characters and blaming the language.
Strange. I see people who struggle to format their code properly or use the language properly every day ;)
And do they blame the language for having a comparison operator that is so easy to type? Or do they fix their bugs and move on? Again, language syntax is not the solution to bugs.
ChrisA
On Mon, Apr 30, 2018 at 1:03 PM Chris Angelico rosuav@gmail.com wrote:
That's a weird argument, Chris :-)
If f(x) has no meaningful name, then what is the result of the comprehension? Perhaps some meaningless data? ;)
f(x) might have side effects. Can you give a meaningful name to the trivial helper function?
I don't understand your question. How is f(x) having side effects or not having them relevant to the discussion? Does ':=' work only with pure functions?
Not every trivial helper can actually have a name that saves people from having to read the body of the function.
I don't understand this argument either, sorry.
We've been over this argument plenty, and I'm not going to rehash it.
Hand-waving the question the way you do simply alienates more core devs to the PEP. And PEP 572 hand-waves a lot of questions and concerns. Asking people to dig for answers in 700+ emails about the PEP is a bit too much, don't you agree?
I think it's the PEP author's responsibility to address questions right in the PEP.
If I answer every question, I make that number into 800+, then 900+, then 1000+. If I don't, I'm alienating everyone by being dismissive. If every question is answered in the PEP, the document itself becomes so long that nobody reads it. Damned if I do, damned if I don't. Got any alternative suggestions?
IMO, a big part of why we have 100s of emails is that people are very concerned with readability. The PEP just hand-waves the question entirely, instead of listing good, realistic examples of code alongside bad ones, so that, you know, people could compare them and understand both the pros and cons.
Instead we have a few very questionable examples in the PEP that most people don't like at all. Moreover, half of the PEP is devoted to fixing comprehension scoping, which is almost orthogonal to adding new syntax.
So my suggestion remains to continue working on the PEP, improving it and making it more comprehensive. You're free to ignore this advice, but don't be surprised when you see new emails about what ':=' does to code readability (with the same arguments). PEP 572 proponents answering every email with the same dismissive template doesn't help either.
def do_things(fire_missiles=False, plant_flowers=False): ...

do_things(plant_flowers:=True)  # whoops!
If you want your API to be keyword-only, make it keyword-only.
Another hand-waving. Should we deprecate passing arguments by name if their corresponding parameters are not keyword-only?
Mark shows another potential confusion between '=' and ':=' that people will have, and it's an interesting one.
A very rare one compared to the confusions that we already have with '=' and '=='. And this is another argument that we've been over, multiple times.
How do you know if it's rare or not? '=' is used to assign, ':=' is used to assign, '==' is used to compare. I can easily imagine people being confused about why '=' works for setting an argument and ':=' doesn't. Let's agree to disagree on this one :)
Strange. I see people who struggle to format their code properly or use the language properly every day ;)
And do they blame the language for having a comparison operator that is so easy to type? Or do they fix their bugs and move on? Again, language syntax is not the solution to bugs.
I'm not sure how to correlate what I was saying with your reply, sorry.
Anyways, Chris, I think the PEP hand-waves a lot of questions and doesn't have a comprehensive analysis of how it will affect syntax and readability. It's up to you whether you take my advice or not. I'll try to (again) restrain myself from posting about this topic.
Y
On 04/30/2018 07:30 AM, Mark Shannon wrote:
Would Python be better with two subtly different assignment operators? The answer of "no" seems self evident to me.
Maybe this has been covered in the thread earlier--if so, I missed it, sorry. But ISTM that Python already has multiple ways to perform an assignment.
All these statements assign to x:
x = y
for x in y:
with y as x:
except Exception as x:
And, if you want to get super pedantic:
import x
def x(): ...
class x: ...
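For anyone who wants to check, here is a quick runnable sketch of those seven binding forms with throwaway bodies; io.StringIO and math are just convenient stand-ins:

import io

x = 1                        # assignment statement
for x in [2]:                # for-loop target
    pass
with io.StringIO() as x:     # with-statement target
    pass
try:
    raise ValueError
except ValueError as x:      # except target (unbound again when the block exits)
    pass
import math as x             # import binding
def x():                     # def binds the function name
    pass
class x:                     # class binds the class name
    pass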
I remain -1 on 572, but I'm not sure it can genuinely be said that Python only has one way to assign a value to a variable.
//arry/
On Mon, Apr 30, 2018 at 05:27:08PM -0700, Larry Hastings wrote:
On 04/30/2018 07:30 AM, Mark Shannon wrote:
Would Python be better with two subtly different assignment operators? The answer of "no" seems self evident to me.
Maybe this has been covered in the thread earlier--if so, I missed it, sorry. But ISTM that Python already has multiple ways to perform an assignment.
All these statements assign to x: [snip SEVEN distinct binding operations]
I remain -1 on 572, but I'm not sure it can genuinely be said that Python only has one way to assign a value to a variable.
"Not sure"? Given that you listed seven ways, how much more evidence do you need to conclude that it is simply wrong to say that Python has a single way to assign a value to a name?
:-)
-- Steve
Steven D'Aprano wrote:
"Not sure"? Given that you listed seven ways, how much more evidence do you need to conclude that it is simply wrong to say that Python has a single way to assign a value to a name?
Yes, but six of those are very specialised, and there's rarely any doubt about when it's appropriate to use them.
The proposed :=, on the other hand, would overlap a lot with the functionality of =. It's not a case of "Python already has seven ways of assigning, so one more can't hurt."
-- Greg