Proposal: Complex comprehensions containing statements
This is a proposal for a new syntax where a comprehension is written as the appropriate brackets containing a loop which can contain arbitrary statements.

Here are some simple examples. Instead of:

    [
        f(x)
        for y in z
        for x in y
        if g(x)
    ]

one may write:

    [
        for y in z:
            for x in y:
                if g(x):
                    f(x)
    ]

Instead of:

    lst = []
    for x in y:
        if cond(x):
            break
        z = f(x)
        lst.append(z * 2)

one may write:

    lst = [
        for x in y:
            if cond(x):
                break
            z = f(x)
            yield z * 2
    ]

Instead of:

    [
        {k: v for k, v in foo}
        for foo in bar
    ]

one may write:

    [
        for foo in bar:
            {for k, v in foo: k: v}
    ]

## Specification

A list/set/dict comprehension or generator expression is written as the appropriate brackets containing a `for` or `while` loop.

In the general case some expressions have `yield` in front and they become the values of the comprehension, like a generator function.

If the comprehension contains exactly one expression statement at any level of nesting, i.e. if there is only one place where a `yield` can be placed at the start of a statement, then `yield` is not required and the expression is implicitly yielded. In particular this means that any existing comprehension translated into the new style doesn't require `yield`.

If the comprehension doesn't contain exactly one expression statement and doesn't contain a `yield`, it's a SyntaxError.

### Dictionary comprehensions

For dictionary comprehensions, a `key: value` pair is allowed as its own pseudo-statement or in a yield. It's not a real expression and cannot appear inside other expressions.

This can potentially be confused with variable type annotations with no assigned value, e.g. `x: int`. But we can essentially apply the same rule as other comprehensions: either use `yield`, or only have one place where a `yield` could be added in front of a statement. So if there is only one pair `x: y` we try to implicitly yield that.
The only way this could be misinterpreted is if a user declared the type of exactly one expression and completely forgot to give their comprehension elements, and the program would almost certainly fail spectacularly.

### Whitespace

If placing the loop on a single line would be valid syntax outside a comprehension (i.e. it just contains a simple statement) then we call this an *inline* comprehension. It can be inserted in the same line(s) as other code and formatted however the writer likes - there are no concerns about whitespace.

For a more complex comprehension, the loop must start and end with a newline, i.e. the lines containing the loop cannot contain any tokens from outside, including the enclosing brackets. For example, this is allowed:

    foo = [
        for x in y:
            if x > 0:
                f(x)
    ]

but this is not:

    foo = [for x in y:
        if x > 0:
            f(x)]

This ensures that code is readable even at a quick glance. The eyes can quickly find where the loop starts and distinguish the embedded statements from the rest of the enclosing expression.

Furthermore, it's easy to copy paste entire lines to move them around, whereas refactoring the invalid example above without specific tools would be annoying and error-prone. It also makes it easy to adjust code outside the comprehension (e.g. rename `foo` to something longer) without messing up indentation and alignment.

Inside the loop, the rules for indentation and such are the same as anywhere else. The syntax of the loop is valid only if it's also valid as a normal loop outside any expression. The body of the loop must be more indented than the for/while keyword that starts the loop.

### Variable scope

Since comprehensions look like normal loops they should maybe behave like them again, including executing in the same scope and 'leaking' the iteration variable(s). Assignments via the walrus operator already affect the outer scope; only the iteration variable currently behaves differently.
My understanding is that this is influenced by the fact that there is little reason to use the value of the iteration variable after a list comprehension completes since it will always be the last value in the iterable. But since the new syntax allows `break`, the value may become useful again.

I don't know what the right approach is here and I imagine it can generate plenty of debate. Given that this whole proposal is already controversial and likely to be rejected this may not be the best place to start discussion. But maybe it is, I don't know.

## Benefits/comparison to current methods

### Uniform syntax

The new comprehensions just look like normal loops in brackets, or generator functions. This should make them easier for beginners to learn than the old comprehensions.

A particular concept that's easier to learn is comprehensions that contain multiple loops. Consider this comprehension over a nested list:

    [
        f(cell)
        for row in matrix
        for cell in row
    ]

For beginners this can easily be confusing, [and sometimes for experienced coders too](https://mail.python.org/archives/list/python-ideas@python.org/message/BX7LWU...). Yes there's a rule that one can learn, but putting it in reverse also seems logical, perhaps even more so:

    [
        f(cell)
        for cell in row
        for row in matrix
    ]

Now the comprehension is 'consistently backwards', it reads more like English, and the usage of `cell` is right next to its definition. But of course that order is wrong...unless we want a nested list comprehension that produces a new nested list:

    [
        [
            f(cell)
            for cell in row
        ]
        for row in matrix
    ]

Again, it's not hard for an experienced coder to understand this, but for a beginner grappling with new concepts this is not great. Now consider how the same two comprehensions would be written in the new syntax:

    [
        for row in matrix:
            for cell in row:
                f(cell)
    ]

    [
        for row in matrix: [
            for cell in row:
                f(cell)
        ]
    ]

### Power and flexibility

Comprehensions are great and I love using them.
I want to be able to use them more often. I know I can solve any problem with a loop, but it's obvious that comprehensions are much nicer or we wouldn't need to have them at all. Compare this code:

    new_matrix = []
    for row in matrix:
        new_row = []
        for cell in row:
            try:
                new_row.append(f(cell))
            except ValueError:
                new_row.append(0)
        new_matrix.append(new_row)

with the solution using the new syntax:

    new_matrix = [
        for row in matrix: [
            for cell in row:
                try:
                    yield f(cell)
                except ValueError:
                    yield 0
        ]
    ]

It's immediately visually obvious that it's building a new nested list, there's much less syntax for me to parse, and the variable `new_row` has gone from appearing 4 times to 0!

There have been many requests to add some special syntax to comprehensions to make them a bit more powerful:

- [Is this PEP-able? "with" statement inside genexps / list comprehensions](https://mail.python.org/archives/list/python-ideas@python.org/thread/BUD46OE...)
- [Allowing breaks in generator expressions by overloading the while keyword](https://mail.python.org/archives/list/python-ideas@python.org/thread/6PEOE5Z...)
- [while conditional in list comprehension ??](https://mail.python.org/archives/list/python-ideas@python.org/thread/RYBBHV3...)

This would solve all such problems neatly.

### No trying to fit things in a single expression

The current syntax can only contain one expression in the body. This restriction makes it difficult to solve certain problems elegantly and creates an uncomfortable grey area where it's hard to decide between squeezing maybe a bit too much into an expression or doing things 'manually'. This can lead to analysis paralysis and disagreements between coders and reviewers. For example, which of the following is the best?

    clean = [
        line.strip()
        for line in lines
        if line.strip()
    ]

    stripped = [line.strip() for line in lines]
    clean = [line for line in stripped if line]

    clean = list(filter(None, map(str.strip, lines)))

    clean = []
    for line in lines:
        line = line.strip()
        if line:
            clean.append(line)

    def clean_lines():
        for line in lines:
            line = line.strip()
            if line:
                yield line

    clean = list(clean_lines())

You probably have a favourite, but it's very subjective and this kind of problem requires judgement depending on the situation. For example, I'd choose the first version in this case, but a different version if I had to worry about duplicating something more complex or expensive than `.strip()`. And again, there's an awkward sweet spot where it's hard to decide whether I care enough about the duplication.

What about assignment expressions? We could do this:

    clean = [
        stripped
        for line in lines
        if (stripped := line.strip())
    ]

Like the nested loops, this is tricky to parse without experience. The execution order can be confusing and the variable is used away from where it's defined. Even if you like it, there are clearly many who don't. I think the fact that assignment expressions were a desired feature despite being so controversial is a symptom of this problem. It's the kind of thing that happens when we're stuck with the limitations of a single expression.

The solution with the new syntax is:

    clean = [
        for line in lines:
            stripped = line.strip()
            if stripped:
                stripped
    ]

or if you'd like to use an assignment expression:

    clean = [
        for line in lines:
            if stripped := line.strip():
                stripped
    ]

I think both of these look great and are easily better than any of the other options. And I think it would be the clear winner in any similar situation - no careful judgement needed. This would become the one (and only one) obvious way to do it. The new syntax has the elegance of list comprehensions and the flexibility of multiple statements.
It's completely scalable and works equally well from the simplest comprehension to big complicated constructions.

### Easy to change

I hate when I've already written a list comprehension but a new requirement forces me to change it to, say, the `.append` version. It's a tedious refactoring involving brackets, colons, indentation, and moving things around. It also leaves me with a very unhelpful `git diff`. With the new syntax I can easily add logic as I please and get a nice simple diff.
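The scoping behaviour mentioned under "Variable scope" can be checked in current Python. The following is a minimal sketch, not part of the proposal: `scope_demo` is a hypothetical helper name, and it only illustrates today's rules (a walrus target inside a comprehension binds in the enclosing scope, the comprehension's iteration variable does not, and an ordinary `for` loop leaks its variable).

```python
def scope_demo():
    """Contrast what leaks from a comprehension with what leaks from a loop."""
    # `y` is bound by the walrus operator, so it lands in scope_demo's scope.
    doubled = [y for x in range(3) if (y := x * 2) >= 0]

    # `x` is the comprehension's iteration variable; it is not visible here.
    try:
        x
        iteration_variable_leaked = True
    except NameError:
        iteration_variable_leaked = False

    # An ordinary loop leaves its variable bound to the last value.
    for i in range(3):
        pass

    return doubled, y, iteration_variable_leaked, i


print(scope_demo())  # ([0, 2, 4], 4, False, 2)
```

Under the proposal's "behave like normal loops" option, the comprehension's iteration variable would presumably act like `i` here rather than like `x`.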
Hi,

I think this syntax is very hard to read because the yielding-expression can be anywhere in the block and there is nothing to identify it. Even in your examples I can't figure out which expression will be used. What if I call a function somewhere in the block?

Can't you just use generators + list/set/dict constructors when you need complex statements?

Regards

-- Antoine Rozo

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5UIXE2...
Code of Conduct: http://python.org/psf/codeofconduct/
Antoine Rozo wrote:

> I think this syntax is very hard to read because the yielding-expression can be anywhere in the block and there is nothing to identify it.

Would you feel better if `yield` was always required?

> Even in your examples I can't figure out which expression will be used.

For almost all the examples I've provided a corresponding equivalent in the current syntax, so are you saying that you're still confused now or that you can't figure it out until you look at the 'answer'?

> What if I call a function somewhere in the block?

If that function is the whole statement and there is no other expression statement in the comprehension, it will be yielded. I can't tell if there's more to your question.

> Can't you just use generators + list/set/dict constructors when you need complex statements?

I can, but I've explained at length why that's not as good.
> Would you feel better if `yield` was always required?

Yes but then it's the same as defining a generator-function.

> For almost all the examples I've provided a corresponding equivalent in the current syntax, so are you saying that you're still confused now or that you can't figure it out until you look at the 'answer'?

I think it's ambiguous, like in this example:

    clean = [
        for line in lines:
            stripped = line.strip()
            if stripped:
                stripped
    ]

what says that it's the last `stripped` that should be yielded?

> If that function is the whole statement and there is no other expression statement in the comprehension, it will be yielded. I can't tell if there's more to your question.

Imagine this one:

    foo = [
        for x in range(5):
            f(x)
            if x % 2:
                x
    ]

what will be the result? `[f(x) for x in range(5)]`? `[x for x in range(5) if x%2]`? `[x if x%2 else f(x) for x in range(5)]`?
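To make the question above concrete, here is a small sketch that evaluates the three candidate readings in current Python. The function `f` is a hypothetical stand-in (the thread never defines it); any function that does not return its argument unchanged would show the same divergence.

```python
def f(x):
    # Hypothetical stand-in for the undefined f in the thread.
    return x * 10


# Reading 1: only the call f(x) is yielded.
reading_call_only = [f(x) for x in range(5)]

# Reading 2: only the x under the `if` is yielded.
reading_if_only = [x for x in range(5) if x % 2]

# Reading 3: one value per iteration, chosen by the condition.
reading_either = [x if x % 2 else f(x) for x in range(5)]

print(reading_call_only)  # [0, 10, 20, 30, 40]
print(reading_if_only)    # [1, 3]
print(reading_either)     # [0, 1, 20, 3, 40]
```

The three results differ, which is why a rule is needed to decide how a body with two expression statements is treated.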
> Yes but then it's the same as defining a generator-function.

List comprehensions are already the same as other things, but they're nice anyway. `lambda` is the same as defining a function, but it's nice too. Syntactic sugar is helpful sometimes. I think this:

    clean = [
        for line in lines:
            stripped = line.strip()
            if stripped:
                yield stripped
    ]

is easily nicer than this:

    def clean_lines():
        for line in lines:
            line = line.strip()
            if line:
                yield line

    clean = list(clean_lines())

And this:

    new_matrix = [
        for row in matrix:
            yield [
                for cell in row:
                    try:
                        yield f(cell)
                    except ValueError:
                        yield 0
            ]
    ]

is nicer than any of these:

    new_matrix = []
    for row in matrix:
        def new_row():
            for cell in row:
                try:
                    yield f(cell)
                except ValueError:
                    yield 0

        new_matrix.append(list(new_row()))

----

    def new_row(row):
        for cell in row:
            try:
                yield f(cell)
            except ValueError:
                yield 0

    new_matrix = [list(new_row(row)) for row in matrix]

----

    def safe_f(cell):
        try:
            return f(cell)
        except ValueError:
            return 0

    new_matrix = [
        [
            safe_f(cell)
            for cell in row
        ]
        for row in matrix
    ]
> I think it's ambiguous, like in this example:
>
>     clean = [
>         for line in lines:
>             stripped = line.strip()
>             if stripped:
>                 stripped
>     ]
>
> what says that it's the last `stripped` that should be yielded?

Because it's the only statement that *can* be yielded. The `yield` is implicit when there's exactly one statement you can put it in front of. You can't `yield stripped = line.strip()`. You can technically have `stripped = yield line.strip()` but we ignore those possibilities.
>> If that function is the whole statement and there is no other expression statement in the comprehension, it will be yielded. I can't tell if there's more to your question.
>
> Imagine this one:
>
>     foo = [
>         for x in range(5):
>             f(x)
>             if x % 2:
>                 x
>     ]
>
> what will be the result?

It will be a SyntaxError, because it's ambiguous.

Here's a new idea: `yield` is only optional in inline comprehensions, i.e. where the loop body consists entirely of a single expression. So for example this is allowed:

    new_row = [for cell in row: f(cell)]

but this is not:

    new_row = [
        for cell in row:
            thing = g(cell)
            f(thing)
    ]

Instead the user must write `yield f(thing)` at the end. This would mean that you only need to add `yield` when the comprehension is already somewhat long so it's less significant, and there's only one very simple special case to learn about.
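The implicit-`yield` rule from the specification (an explicit `yield` anywhere, or else exactly one expression statement, or else a SyntaxError) can be approximated with the standard `ast` module by parsing the loop as an ordinary statement. This is only a sketch under stated assumptions: `classify_loop` is a hypothetical name, it handles only bodies that are already valid Python (so no nested new-style comprehensions), and it is not how a real parser change would be implemented.

```python
import ast


def classify_loop(source: str) -> str:
    """Return "explicit" or "implicit", or raise SyntaxError if ambiguous.

    `source` is the comprehension's loop written as a normal for/while loop.
    """
    tree = ast.parse(source)  # parsing alone accepts `yield` outside a function
    nodes = list(ast.walk(tree))

    # Any explicit yield means nothing is yielded implicitly.
    if any(isinstance(node, (ast.Yield, ast.YieldFrom)) for node in nodes):
        return "explicit"

    # Otherwise there must be exactly one expression statement to yield.
    expression_statements = [n for n in nodes if isinstance(n, ast.Expr)]
    if len(expression_statements) == 1:
        return "implicit"
    raise SyntaxError(
        f"expected exactly one expression statement or an explicit yield, "
        f"found {len(expression_statements)} expression statements"
    )


STRIP_LOOP = (
    "for line in lines:\n"
    "    stripped = line.strip()\n"
    "    if stripped:\n"
    "        stripped\n"
)
AMBIGUOUS_LOOP = (
    "for x in range(5):\n"
    "    f(x)\n"
    "    if x % 2:\n"
    "        x\n"
)

print(classify_loop(STRIP_LOOP))  # implicit
try:
    classify_loop(AMBIGUOUS_LOOP)
except SyntaxError as error:
    print("rejected:", error)
```

The assignment `stripped = line.strip()` is an `ast.Assign` node rather than an `ast.Expr`, which is why the first loop has only one candidate and the second has two.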
I'm not against your proposal at this point, but I think you can already do the first example:

    clean = [line.strip() for line in lines if line.strip()]

You might be able to avoid calling the method twice using the walrus operator.

For any operation more complex than that (ie, more complex than clean_lines, defined below) I'd use the list constructor with a named function anyway, rather than inlining it in a comprehension. I consider that more readable.

Of course, YMMV, which is why I'm not against the proposal at this point, but to me the point of things like lambda and comprehensions is precisely that they're one logical line, and frequently they're one physical line. So no, the comprehension including a multiline compound statement is not "easily" nicer for me.

Alex Hall writes:

>> Yes but then it's the same as defining a generator-function.
>
> List comprehensions are already the same as other things, but they're nice anyway. `lambda` is the same as defining a function, but it's nice too. Syntactic sugar is helpful sometimes. I think this:
>
>     clean = [
>         for line in lines:
>             stripped = line.strip()
>             if stripped:
>                 yield stripped
>     ]
>
> is easily nicer than this:
>
>     def clean_lines():
>         for line in lines:
>             line = line.strip()
>             if line:
>                 yield line
>
>     clean = list(clean_lines())
> You might be able to avoid calling the method twice using the walrus operator.

I specifically discussed the walrus operator solution, but both you and Dominik Vilsmeier seem to have missed that.

> I'd use the list constructor with a named function anyway, rather than inlining it in a comprehension. I consider that more readable.

I'm curious, how do you find this:

    def clean():
        for line in lines:
            line = line.strip()
            if line:
                yield line

    clean_lines = list(clean())

more readable than this?

    clean_lines = [
        for line in lines:
            line = line.strip()
            if line:
                yield line
    ]

It's not that I find my version particularly readable, but I don't see how it's worse.
-100. The weird block structures inside comprehension reads terribly even in the trivial case shown, and looks worse the more structures are inside it.

We have functions. They are great. Let's use those.
> The weird block structures inside comprehension reads terribly even in the trivial case shown

David, it's fine if you have that opinion, but you're replying specifically to my post where I asked someone to elaborate on that kind of opinion, so it bothers me that you just restated it without explanation. I'm curious, in what way does it read terribly? Why are the block structures "weird"?
Comprehensions are very much based on the idea of *declarative* data collections. That's their entire reason for being. In general, one expects a comprehension to be side-effect free and just build a collection according to declared rules. Obviously I know many ways to smuggle in side effects, but doing so goes against their spirit and user expectations.

A comprehension should read as one line. At the very least logically, but usually physically. If I ever find myself writing a comprehension longer than about 120 characters, I know it's time to refactor into a function and block loops. I've cleaned up far too much code that used too-complex nested comprehensions.

Everything about this proposal is antithetical to the intention of comprehensions.
On 22.02.20 09:43, David Mertz wrote:

> Comprehensions are very much based on the idea of *declarative* data collections. That's their entire reason for being. In general, one expects a comprehension to be side-effect free and just build a collection according to declared rules. Obviously I know many ways to smuggle in side effects, but doing so goes against their spirit and user expectations.
>
> A comprehension should read as one line. At the very least logically, but usually physically. If I ever find myself writing a comprehension longer than about 120 characters, I know it's time to refactor into a function and block loops. I've cleaned up far too much code that used too-complex nested comprehensions.
>
> Everything about this proposal is antithetical to the intention of comprehensions.

I concur with David.
On Feb 20, 2020, at 23:56, Alex Hall <alex.mojaki@gmail.com> wrote:
> This is a proposal for a new syntax where a comprehension is written as the appropriate brackets containing a loop which can contain arbitrary statements.
What happens if there’s a yield expression (that isn’t just ignored and used as an expression statement)? Probably not too common inside comprehensions, but there’s no reason you can’t write `[(yield None) for _ in range(3)]` to gather the first three values sent into your generator—and there might be much more common uses that come up once comprehensions can contain arbitrarily complex expressions. Does this now just collect up three None values? Does it still accept sends from the caller of the generator function it’s embedded in? What if the generator function yields from a generator expression with a yield expression in it? For that matter, what does yield from inside a comprehension do?
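For readers unfamiliar with yield expressions, the following sketch shows one way to spell out, as an explicit generator function, the behaviour described above for `[(yield None) for _ in range(3)]`: gathering the first three values sent into a generator. The names `gather_three` and `drive` are hypothetical, and this is only an illustration of the intent, not a statement of what any Python version compiles that one-liner to.

```python
def gather_three():
    """Collect the first three values sent into this generator."""
    received = []
    for _ in range(3):
        # Each `yield None` pauses here; the sent value becomes its result.
        received.append((yield None))
    return received


def drive(values):
    """Send values into gather_three() and return the list it collected."""
    generator = gather_three()
    next(generator)  # advance to the first yield
    try:
        for value in values:
            generator.send(value)
    except StopIteration as stop:
        return stop.value  # the generator's return value
    return None


print(drive(["a", "b", "c"]))  # ['a', 'b', 'c']
```

The open question in the message is whether a `yield` written inside a new-style comprehension would behave like the one in `gather_three`, or would instead supply the comprehension's elements.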
On 22/02/20 11:45 am, Andrew Barnert via Python-ideas wrote:
> there’s no reason you can’t write `[(yield None) for _ in range(3)]` to gather the first three values sent into your generator

Currently this doesn't quite do what you might expect. It doesn't make the enclosing function into a generator, it makes the list comprehension itself a generator:

    >>> def f():
    ...     return [(yield x) for x in range(10)]
    ...
    >>> g = f()
    >>> g
    <generator object f.<locals>.<listcomp> at 0x6b396c>

I don't think this behaviour is deliberate; it seems to be a consequence of deciding to compile the body of the comprehension as a nested function.

-- Greg
> Currently this doesn't quite do what you might expect. It doesn't make the enclosing function into a generator, it makes the list comprehension itself a generator:
>
>     >>> def f():
>     ...     return [(yield x) for x in range(10)]
>     ...
>     >>> g = f()
>     >>> g
>     <generator object f.<locals>.<listcomp> at 0x6b396c>

Am I missing something? This is a syntax error for me in 3.8:

    >>> def f():
    ...     return [(yield x) for x in range(10)]
    ...
      File "<stdin>", line 2
    SyntaxError: 'yield' inside list comprehension
On Sat, Feb 22, 2020 at 10:32 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

> On 22/02/20 11:45 am, Andrew Barnert via Python-ideas wrote:
>
>> there’s no reason you can’t write `[(yield None) for _ in range(3)]` to gather the first three values sent into your generator
>
> Currently this doesn't quite do what you might expect. It doesn't make the enclosing function into a generator, it makes the list comprehension itself a generator:
>
>     >>> def f():
>     ...     return [(yield x) for x in range(10)]
>     ...
>     >>> g = f()
>     >>> g
>     <generator object f.<locals>.<listcomp> at 0x6b396c>
>
> I don't think this behaviour is deliberate; it seems to be a consequence of deciding to compile the body of the comprehension as a nested function.
Depends what you mean by "currently".

    Python 3.7.0a4+ (heads/master:95e4d58913, Jan 27 2018, 06:21:05)
    [GCC 6.3.0 20170516] on linux
    >>> def f():
    ...     return [(yield x) for x in range(10)]
    ...

    Python 3.8.2rc2+ (heads/3.8:a207512121, Feb 21 2020, 21:49:46)
    [GCC 6.3.0 20170516] on linux
    >>> def f():
    ...     return [(yield x) for x in range(10)]
    ...
      File "<stdin>", line 2
    SyntaxError: 'yield' inside list comprehension

https://docs.python.org/3/whatsnew/3.8.html#changes-in-python-behavior

ChrisA
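The version difference shown above can be checked without an interactive session by handing the same source to the built-in `compile()`. This is a small sketch; `SOURCE` and `yield_in_listcomp_is_rejected` are names chosen here for illustration.

```python
import sys

# The same function body used in the interactive sessions above.
SOURCE = "def f():\n    return [(yield x) for x in range(10)]\n"


def yield_in_listcomp_is_rejected() -> bool:
    """Return True if this interpreter refuses `yield` in a list comprehension."""
    try:
        compile(SOURCE, "<example>", "exec")
    except SyntaxError:
        return True
    return False


print(sys.version_info[:2], yield_in_listcomp_is_rejected())
```

On Python 3.8 and later this reports `True`, matching the `SyntaxError: 'yield' inside list comprehension` message in the second session.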
On Feb 21, 2020, at 15:34, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
> On 22/02/20 11:45 am, Andrew Barnert via Python-ideas wrote:
>
>> there’s no reason you can’t write `[(yield None) for _ in range(3)]` to gather the first three values sent into your generator
>
> Currently this doesn't quite do what you might expect. It doesn't make the enclosing function into a generator, it makes the list comprehension itself a generator:

Right; that’s why I asked what if the enclosing generator function yields from the comprehension.

The fact that the semantics would probably be confusing doesn’t mean we shouldn’t ask what the semantics would be. After all, the code has to compile to something, and as written the proposal doesn’t tell us what that something is.

Plus: Today there are no good cases where you’d want to yield from inside a comprehension; the proposal to allow arbitrary statements in a comprehension might well change that, so we need to figure out what you should expect in such cases and if it’s doable.

I didn’t realize that 3.8 has made this illegal instead of confusing, which changes things. Maybe the right answer is that yield expression statements are allowed in new-style comprehensions (and have the new semantics) but yield expressions inside other expressions inside new-style comprehensions (or is that last rule just “directly inside”?) are not? But even if so, the proposal needs to specify that and argue for it, not just leave open what it could mean.
Andrew, you raise a good point that I hadn't considered. I think you're right that yield inside an expression shouldn't be allowed, to avoid confusion. If we found a good reason to allow it we could change it later.
participants (9)

- Alex Hall
- Andrew Barnert
- Antoine Rozo
- Chris Angelico
- David Mertz
- Greg Ewing
- Ricky Teachey
- Serhiy Storchaka
- Stephen J. Turnbull