2014-11-21 10:06 GMT-08:00 Andrew Barnert <abarnert@yahoo.com>:
On Nov 21, 2014, at 9:41, Antony Lee <antony.lee@berkeley.edu> wrote:
I would like to believe that "break"-as-an-expression solves most of these issues.
This one has been raised every time in the past. Since we don't have the PEP, I'll try to summarize the problems. But first:
It is reasonable, though, to drop the "return"-as-an-expression part of the proposal, because you need to know that the comprehension is run in a separate function to understand it (it's a bit ironic that I am dropping the original proposal to defend my own, now...).
There's a more significant difference between the two proposals. A break expression allows you to break out of a single loop, but gives you no way to break out of the whole thing (as your example shows). A return expression allows you to break out of the whole thing, but gives you no way to break out of a single loop. That's why they're not interchangeable in explicit loop code. (Also, notice that every so often, someone proposes a numbered or labeled break, and the answer is always "Why would you need that? If your code has enough loops that you need to break out of some but not all, refactor those some into a separate function and then use return.")
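To make the break/return difference concrete, here is a minimal sketch in ordinary statement form. The function names and the sample data are made up for illustration; nothing here uses the proposed expression syntax.

```python
def collect_with_break(rows):
    """`break` leaves only the inner loop; the outer loop keeps going."""
    out = []
    for row in rows:
        for item in row:
            if item < 0:
                break  # abandons the rest of this row only
            out.append(item)
    return out


def collect_with_return(rows):
    """`return` leaves the whole function, so every enclosing loop ends."""
    out = []
    for row in rows:
        for item in row:
            if item < 0:
                return out  # abandons all remaining rows as well
            out.append(item)
    return out


rows = [[1, 2, -1, 3], [4, 5]]
print(collect_with_break(rows))   # [1, 2, 4, 5]
print(collect_with_return(rows))  # [1, 2]
```

The same negative item stops only the first row in one version and the entire traversal in the other, which is why neither keyword can stand in for the other.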
This also raises the question of why one of break/continue/return should be an expression but not the others. If your answer is "because you only need continue when there's code not controlled by the if, which is impossible in a comprehension" then you're pretty much admitting that the flow control expression thing is not really a general purpose thing, but a hack made for comprehensions.
I guess you're going to make me flip-flop and go back to defend having all three of "break", "return" and "raise" as expressions (I can't see a single use of "continue" as an expression but would be happy to be proven wrong). The issue of "return" as an expression is obviously that it exposes the internals of generators (i.e. they are in their own function) but I can live with that. ("raise" as an expression is a separate issue, the main application being to stuff it into lambdas.)
Consider
((x, y) for x in l1 if f1(x) or break for y in l2 if f2(y) or break)
This maps directly to
    for x in l1:
        if f1(x) or break:
            for y in l2:
                if f2(y) or break:
                    yield x, y
When you write it this way, it's pretty clear that you're abusing or as an else, and that it comes in the wrong place, and that you're cramming side effects into an expression with no value. Would you put anything else with side effects here, even a call to a logging function?
You're also introducing unnecessary indentation, and making the code more verbose by comparison with the obvious alternative:
    for x in l1:
        if not f1(x):
            break
        for y in l2:
            if not f2(y):
                break
            yield x, y
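For anyone who wants to run the explicit-loop version, a self-contained sketch follows. The wrapper name pairs, the sample lists, and the two lambda predicates are assumptions added for illustration; only the loop body comes from the example above.

```python
def pairs(l1, l2, f1, f2):
    """Yield (x, y) pairs, stopping each loop at the first failing element."""
    for x in l1:
        if not f1(x):
            break          # ends the outer loop, so the generator finishes
        for y in l2:
            if not f2(y):
                break      # ends only this inner pass; the next x still runs
            yield x, y


result = list(pairs([1, 2, 9, 3], [10, 20, 99, 30],
                    lambda x: x < 5, lambda y: y < 50))
print(result)  # [(1, 10), (1, 20), (2, 10), (2, 20)]
```

Note that 3 and 30 never appear: each loop stops at the first element that fails its predicate, rather than skipping it the way a plain "if" filter would.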
Yes, that alternative can't be written as a comprehension. But that doesn't mean we should come up with a less obvious, more verbose, and harder to reason about alternative just because it can be written as a comprehension, even though we'd never write it as an explicit loop. If you want to go that route, we might as well just enshrine the "or stop()" hack instead of trying to break it.
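For readers who have not seen the "or stop()" hack, here is a rough sketch of it. The helper stop() and the wrapper run_hack are illustrative names, not part of any library. The hack relies on a StopIteration raised inside a generator expression silently ending it; under PEP 479 (the default from Python 3.7) the same exception is turned into a RuntimeError instead, so the sketch catches that case rather than assuming one outcome.

```python
def stop():
    # Illustrative helper: raising StopIteration is the whole trick.
    raise StopIteration


def run_hack(limit):
    """Try the hack and report what this interpreter does with it."""
    gen = (x for x in range(10) if x < limit or stop())
    try:
        return list(gen)
    except RuntimeError:
        # PEP 479 semantics: StopIteration escaping a generator body
        # is converted to RuntimeError instead of ending the generator.
        return "RuntimeError"


print(run_hack(3))
```

On interpreters without PEP 479 semantics this gives the truncated list; on current ones it reports the RuntimeError, which is the breakage referred to above.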
The introduction of unnecessary indentation is honestly not there to make the non-generator example look bad, but simply to show how the proposal addresses the issue of translating nested loops. For the non-nested loop case, a simpler and arguably more readable way to write it would be

    (x if f(x) else break for x in X)

Perhaps giving the nested case first was poor PR; yes, the nested case is slightly hackish but I am fairly sure anybody can figure out the meaning of the above.
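Assuming the non-nested form is meant to stop at the first x for which f(x) is false, it can already be spelled today with an explicit generator or with itertools.takewhile. The predicate f and the sample data X below are placeholders chosen for illustration.

```python
from itertools import takewhile


def f(x):
    # Placeholder predicate for the sketch.
    return x < 3


X = [0, 1, 2, 5, 1]


def until_first_failure(items):
    """Explicit-loop reading of the proposed single-loop form."""
    for x in items:
        if not f(x):
            break
        yield x


print(list(until_first_failure(X)))  # [0, 1, 2]
print(list(takewhile(f, X)))         # [0, 1, 2]
```

The trailing 1 is dropped in both cases even though it satisfies f, which is what separates "break" from an ordinary filtering "if" clause.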
Also, can you imagine using break as an expression in any other context besides as an or operand in an if statement directly under a for statement?