[Python-ideas] Is this PEP-able? for X in ListY while conditionZ:
Andrew Barnert
abarnert at yahoo.com
Thu Jun 27 09:28:24 CEST 2013
Let me try to gather together all of the possibilities that have been discussed in this and the two previous threads, plus a couple of obvious ones nobody's mentioned.
Unless I'm missing a good idea, or someone can explain why one of these isn't as bad as it seems, I don't like any of them. Some of them are ugly right off the bat. The rest are deceptively appealing until you think them through, but then they're even worse than the obviously bad ones.
I'll try to put them in order from least bad to most horrid. (I'm +0.25 on #1, but not until Python 4; -0 on #2; -1 on #3, and it's all downhill from there.)
1. Redefine comprehensions on top of generator expressions instead of defining them in terms of nested blocks.
def stop(): raise StopIteration
x = [value if pred(value) else stop() for value in iterable]
This would make the implementation of Python simpler.
It also makes the language conceptually simpler. The subtle differences between [x for x in foo] and list(x for x in foo) are gone.
And it's actually a pretty small change to the official semantics. Just replace the last two paragraphs of 6.2.4 with "The comprehension consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the expression and clauses are interpreted as if they were a generator expression, and the elements of the new container are those yielded by that generator expression consumed to completion." (It also makes it easier to fix the messy definition of dict comprehensions, if anyone cares.)
Unlike #5, 6, 7, 8, and 10, but like #2, 3, 4, and 9, this only allows you to break out of one for clause, not any. But that's exactly the same as break only being able to break out of one for loop. Nobody complains that Python doesn't have "break 2" or "break :label", right?
The real downside is that this is a very radical change. People may sometimes rely on the differences between listcomps and genexps, maybe even without realizing they're doing so. Code that's worked unchanged from 2.0 to 3.3 might break.
Also, the obvious implementation would make listcomps about 25-50% slower, and trying to tweak the existing optimized comprehension code to work as-if wrapping a generator (and writing tests for all of the edge cases) sounds like a lot of work. And at best it would still be at least a _little_ slower (handling StopIteration, handling errors in the outer for clause differently from other clauses, etc. aren't free).
Still, I could get behind this for Python 4.0.
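A minimal sketch of the "as if it were a generator expression consumed to completion" reading, runnable on current Python. One caveat: since Python 3.7 (PEP 479), a StopIteration raised inside a generator expression surfaces as RuntimeError, so the stop() trick above no longer works verbatim. The sketch therefore swaps in a hypothetical StopComprehension exception and a hypothetical build_list helper; neither name comes from the proposal.

```python
class StopComprehension(Exception):
    """Hypothetical signal standing in for the StopIteration trick."""


def stop():
    raise StopComprehension


def build_list(genexp):
    """Consume genexp; end early, keeping earlier items, if stop() fires."""
    result = []
    try:
        for item in genexp:
            result.append(item)
    except StopComprehension:
        pass
    return result


def pred(value):
    return value < 3


iterable = [0, 1, 2, 5, 1]
x = build_list(value if pred(value) else stop() for value in iterable)
print(x)  # [0, 1, 2]
```

The trailing 1 is never reached, because the exception ends the whole generator expression rather than skipping a single value.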
2. Just require comprehensions to handle StopIteration.
The main cost and benefit are the same as #1.
However, it makes the language and implementation more complex, rather than simpler.
Also, the effects of this less radical change (you can no longer pass StopIteration through a comprehension) seem like they might be harder to explain to people than the more radical one.
And, worse, the less radical effects would probably cause more subtle bugs. Which implies a few versions where passing StopIteration through a comprehension triggers a warning.
3. Turn break from a statement into an expression.
x = [value if pred(value) else break for value in iterable]
The break expression, when evaluated, would have the exact same effect as executing the statement today (including being an error if it's not nested syntactically under a loop or inside a comprehension).
This would be trivial to implement, document, and teach.
But I don't think anyone wants an expression that inherently has no value. Python doesn't have any such expressions today. (Yes, you can write a function call that never returns normally—like sys.exit()—but that still has a return, it just never gets there.)
Also, today, Python gets a lot of mileage out of separating statements and expressions. You can see the flow at a glance, and there's exactly one side-effect per line. If we're going to lose that, I'd rather get something a lot cooler in exchange.
4. Add a break expression that's only allowed inside a comprehension expression.
Basically the same as #3 with less wide-reaching effects… but harder to describe. (How would you write the grammar? Duplicate every expression node: comp_conditional_expression, comp_lambda_form, …? Allow break as an expression syntactically, but make it an error semantically except directly in an expression_stmt or comprehension?)
And of course it makes the language less consistent.
I'd rather have #3, and just say "don't use break as an expression except in comprehensions" in PEP 8 and in linters.
5. Add a new until statement, and a corresponding until clause to comprehensions.
x = [value for value in iterable until pred(value)]
It's intuitively obvious what it means. And it's easy to define:
comp_iter ::= comp_for | comp_if | comp_until
comp_until ::= "until" expression [comp_iter]
Then just s/for or if/for, if, or until/g in 6.2.4.
Of course you have to define an until statement:
until expression: suite
… equivalent to:
if expression: break
else: suite
… with syntax:
until_stmt ::= "until" expression ":" suite
… and semantics:
> until may only occur syntactically nested in a for or while loop, but not nested in a function or class definition within that loop. If the expression is found to be true, it terminates the nearest enclosing loop; otherwise, the suite is executed.
… and nobody will ever use that statement.
Which is a hell of an argument against adding it to the language.
Obviously you could use a different new keyword, with the same sense or the opposite. But I don't think anything is any better. A couple of people suggested when, but that's even worse—besides sounding like it means "if" rather than "while" here, it actually _does_ mean "if" in most languages that have it—Clojure, CoffeeScript, Racket, etc.
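As a rough illustration of what the until clause would compute, here is the nested-block equivalent written as an ordinary function. The name until_comprehension and the sample data are assumptions made for this sketch, not part of the proposal; itertools.takewhile with a negated predicate gives the same result today.

```python
from itertools import takewhile


def until_comprehension(iterable, pred):
    """Hypothetical equivalent of [value for value in iterable until pred(value)]."""
    result = []
    for value in iterable:
        if pred(value):
            break  # the "until" condition became true: leave the loop
        result.append(value)
    return result


def is_big(value):
    return value >= 10


values = [1, 2, 3, 10, 4]
x = until_comprehension(values, is_big)
same = list(takewhile(lambda v: not is_big(v), values))
print(x, same)  # [1, 2, 3] [1, 2, 3]
```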
6. Add the until clause without the statement.
Basically the same as #5, but interpreted as-if an until statement existed, which it won't.
To me, this is much worse than #5. It adds the same complexity to the language, and makes it inconsistent to boot. To interpret a comprehension, you'll have to map it to a nested block that contains statements that don't exist—you have no experience with, and no way to gain experience with them, and you can't even run the resulting code.
7. Add a "magic" while clause that's basically until with the opposite sense.
x = [value for value in iterable while pred(value)]
This reads pretty nicely (at least in trivial comprehensions), it parallels takewhile and friends, and it matches a bunch of other languages (in most of the languages where "when" means "if", "while" means this).
But it has a completely different meaning from while statements, and in fact a barely-related one.
In particular, it's obviously not this:
x = []
for value in iterable:
    while pred(value):
        x.append(value)
What it actually means is:
x = []
for value in iterable:
    if pred(value):
        x.append(value)
    else:
        break
Imagine trying to teach that to a novice.
Or trying to write the formal definition in 6.2.4.
This makes the language much more inconsistent than #6.
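To make the contrast concrete, here is a small sketch comparing the intended meaning of the while clause with an ordinary if clause. The helper name while_clause and the sample data are assumptions for illustration only. The if clause filters and keeps going; the while reading stops at the first failure, which is what itertools.takewhile does.

```python
from itertools import takewhile


def while_clause(iterable, pred):
    """Hypothetical equivalent of [value for value in iterable while pred(value)]."""
    result = []
    for value in iterable:
        if pred(value):
            result.append(value)
        else:
            break  # first failing value ends the whole loop
    return result


def is_small(value):
    return value < 10


values = [1, 2, 3, 10, 4]
stopped = while_clause(values, is_small)
filtered = [value for value in values if is_small(value)]
print(stopped)   # [1, 2, 3]
print(filtered)  # [1, 2, 3, 4]
print(list(takewhile(is_small, values)) == stopped)  # True
```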
8. Allow else break in comp_if clauses.
x = [value for value in iterable if pred(value) else break]
This one is pretty easy to define rigorously, since it maps to exactly what the while attempt maps to with a slight change to the existing rules.
But to me, it makes the code a confusing mess. I'm immediately reading "iterable if pred(value) else break", and that's wrong.
Also, while it's pretty easy to convert this to a nested block in your head, it's not as easy to just read out in line, because the rest of the comprehension and the main expression no longer just nest under the end of the clause; instead, they nest under the middle of it.
Also, an else clause that can't be used for anything but break is very weird. But it doesn't make sense to put anything else there.
9. Add a special comp_break clause.
x = [value for value in iterable if pred(value) break]
The syntax is intuitively simple (break clauses nest just like if and for clauses, and the fact that you can't nest anything underneath it is obvious), and also trivial to define (see my last email).
But this reverses the sense of the controlling if. In fact, each time I read it, the meaning seems to flip back and forth until I finally get it.
And there's no way to make the semantics reasonable if they're defined in terms of nested blocks (as they are today) without lots of special-case language to deal with break. (See my last email.)
10. Allow break in comp_if clauses.
x = [value for value in iterable if pred(value) break]
The syntax is almost as simple as #9, and it gets you more flexibility (because there's nothing stopping you from putting a break under any if statement, not just the last).
However, intuitively this is exactly the same as #9. An if statement with a break means something completely different, and nearly opposite, to one without a break.
And trying to define the meaning is even more complicated.
----- Original Message -----
> From: Andrew Barnert <abarnert at yahoo.com>
> To: "ron3200 at gmail.com" <ron3200 at gmail.com>; "python-ideas at python.org" <python-ideas at python.org>
> Cc:
> Sent: Wednesday, June 26, 2013 10:13 PM
> Subject: Re: [Python-ideas] Is this PEP-able? for X in ListY while conditionZ:
>
> From: Ron Adam <ron3200 at gmail.com>
>
> Sent: Wednesday, June 26, 2013 7:44 PM
>
>
>> The 'if' after the iterator handles the selection of values. Given
>> that, it makes sense to allow flow control statements after the last if,
>> but not expressions effecting the values.
>
> First, why restrict break to the last if clause? Saying it can't go in a for
> clause or in the controlled expression I understand, but… currently, there's
> nothing special about the last clause (or any other clause, except for the
> first, in genexps only), and it seems odd to change that.
>
>
> But meanwhile, your description made me realize exactly what feels weird about
> the whole idea.
>
> A comprehension doesn't have flow control statements. It has _clauses_.
> Obviously they're related things; there's a mapping that's
> rigorously definable and intuitively clear. But they don't have a colon and
> a suite. (They don't even take the same set of expressions, but that's
> not important here.)
>
> Intuitively, that's because the whole point of a comprehension is that the
> rest of the comprehension, especially including the core expression, is the
> suite. Adding a suite breaks that.
>
> Forget that break means the flow control is no longer always downward-local. It
> also completely turns the meaning of the controlling if statement around.
> Instead of "if this: the rest of the expression" it's "if
> this: the suite, else: the rest of the expression".
>
> On top of that, the clauses map to nested statements. You can't nest break
> statements.
>
> All of this becomes more obvious if you try to describe things rigorously.
> Let's rewrite
> http://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
> to explain the semantics of break.
>
> In order to make the issues more obvious, I'm going to ignore continue, use
> your "only at the end" restriction, and not add a rule saying that the
> break has to be controlled by an if. (If you want to write [x for subiter in
> iter for x in subiter break], fine, you get back []—this actually won't be
> interpretable for a genexp with only one clause, but let that be an error
> because it translates to an error.) I don't think changing any of those
> would solve any of the problems, they'd just make things more complicated
> and harder to see.
>
> First, add the new syntax. There are two obvious ways to do it:
>
> comp_if ::= "if" comp_if_body expression_nocond
> ["break" | comp_iter]
>
> Or:
>
> comp_iter ::= comp_for | comp_if | comp_break
> comp_break ::= "break"
>
> If you keep the existing semantics, this:
>
>
> x = [value for value in iter if pred(value) break]
>
> … maps to:
>
> x = []
> for value in iter:
>     if pred(value):
>         break
>             x.append(value)
>
> Or maybe the append is a sibling of the break rather than a child? Either way,
> it makes no sense. It has to map to something like this:
>
> x = []
> for value in iter:
>     if pred(value):
>         break
>     x.append(value)
>
> Which means the clauses are no longer nested, so the simple explanation no
> longer works. Instead, you need something more convoluted, like:
>
> The comprehension consists of a single expression followed by at least one for
> clause, zero or more for or if clauses, and zero or one break clause. In this
> case, the elements of the new container are those that would be produced by
> considering each of the for or if clauses a block, nesting from left to
> right, and the break clause, if present, a simple statement, and evaluating the
> expression to produce an element each time the innermost block that does not
> directly contain a break is entered and a block directly containing a break, if
> any, is not entered.
>
> I think you can make it a little less convoluted by using the concept of exiting
> a block (which is well-defined, thanks to with statements), but I think that
> just makes it even less intuitive:
>
> The comprehension consists of a single expression followed by at least one for
> clause, zero or more for or if clauses, and zero or one break clause. In this
> case, the elements of the new container are those that would be produced by
> considering each of the for or if clauses a block, nesting from left to
> right, and the break clause, if present, a simple statement, and evaluating the
> expression to produce an element each time the innermost block is exited without
> executing a break.
>
> There's no way to explain this simply, because it's no longer simple.
>
>
> As a side note:
>
>
>> [x for x in range(n) if x<limit pass else continue]
>
>
> This is already hard to distinguish from a ternary expression, even without the
> other stuff you bring in later. Compare:
>
>> [x for x in range(n) if n<10 else range(10)]
>
> I don't think that's valid (although I'm not sure without
> testing). And I'm not sure if it would be as hard for the parser to deal
> with as it is for a human. But really, if a human can't parse it, it's
> meaningless.
>
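The validity question in the side note can be checked without running the comprehension, by asking the compiler. This is a small sketch; the helper name is_valid_expression is an assumption for illustration. In current Python the unparenthesized form is rejected, because the "if" after the iterable is parsed as a comprehension filter and the "else" then has nowhere to go; parenthesizing the conditional expression makes it valid.

```python
def is_valid_expression(source):
    """Return True if source compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


unparenthesized = "[x for x in range(n) if n<10 else range(10)]"
parenthesized = "[x for x in (range(n) if n<10 else range(10))]"

print(is_valid_expression(unparenthesized))  # False
print(is_valid_expression(parenthesized))    # True
```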