On 01/07/2013 23:44, Oscar Benjamin wrote:
On 1 July 2013 21:29, David Mertz <mertz@gnosis.cx> wrote:
However, I see the point made by a number of people that the 'while' clause has no straightforward translation into an unrolled loop, and is probably ruled out on that basis.
My thought (in keeping with the title of the thread) is that the comprehension
data = [x for y in stuff while z]
would unroll as the loop
    for y in stuff while z:
        data.append(x)
which would also be valid syntax and have the obvious meaning. This is similar to Nick's suggestion that 'break if' be usable in the body of the loop so that
data = [x for y in stuff; break if not z]
would unroll as
    for y in stuff:
        break if not z
        data.append(x)
Having a while clause on for loops is good not just because it saves a couple of lines but because it clearly separates the flow control from the body of the loop (another reason I dislike 'break if'). In other words I find the flow of the loop
    for p in primes() while p < 100:
        print(p)
easier to understand (immediately) than
    for p in primes():
        if p >= 100:
            break
        print(p)
These are just trivially small examples. As the body of the loop grows in complexity the readability benefit of moving 'if not z: break' into the top line becomes more significant.
You can get the same separation of concerns using takewhile at the expense of a different kind of readability
    for p in takewhile(lambda p: p < 100, primes()):
        print(p)
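For anyone who wants to run the comparison in current Python, here is a minimal sketch. The thread does not define primes(), so the trial-division generator below is only an assumption made for illustration; the two loops collect values instead of printing them so the results can be compared.

```python
from itertools import takewhile


def primes():
    """Yield primes indefinitely by trial division (illustrative, not efficient)."""
    found = []
    candidate = 2
    while True:
        # candidate is prime if no previously found prime divides it
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


# Flow control inside the loop body: explicit test and break.
with_break = []
for p in primes():
    if p >= 100:
        break
    with_break.append(p)

# Flow control in the loop header: takewhile stops at the first p >= 100.
with_takewhile = []
for p in takewhile(lambda p: p < 100, primes()):
    with_takewhile.append(p)
```

Both lists end up holding the same primes below 100; the only difference is where the stopping condition is written.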
However there is another problem with using takewhile in for loops which is that it discards an item from the iterable. Imagine parsing a file such as:
    csvfile = '''# data.csv
    # This file begins with an unspecified number of header lines.
    # Each header line begins with '#'.
    # I want to keep these lines but need to parse them separately.
    # The first non-comment line contains the column headers
    x y z
    1 2 3
    4 5 6
    7 8 9'''.splitlines()
You can do
    # (body of a generator function, hence the yield)
    csvfile = iter(csvfile)
    headers = []
    for line in csvfile:
        if not line.startswith('#'):
            break
        headers.append(line[1:].strip())
    fieldnames = line.split()
    for line in csvfile:
        yield {name: int(val) for name, val in zip(fieldnames, line.split())}
However if you use takewhile like
    for line in takewhile(lambda line: line.startswith('#'), csvfile):
        headers.append(line[1:].strip())
then after the loop 'line' holds the last comment line. The discarded column header line is gone and cannot be recovered; takewhile is normally only used when the entire remainder of the iterator is to be discarded.
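The lost item can be observed directly. The following is a small sketch using a shortened, made-up version of the file above: once the takewhile loop finishes, the next item left on the underlying iterator is the first data row, not the column header line.

```python
from itertools import takewhile

# Shortened stand-in for the example file (contents are illustrative only).
lines = iter([
    "# data.csv",
    "# Each header line begins with '#'.",
    "x y z",
    "1 2 3",
    "4 5 6",
])

headers = []
for line in takewhile(lambda line: line.startswith('#'), lines):
    headers.append(line[1:].strip())

# takewhile had to pull "x y z" off the iterator to test it, then dropped it,
# so the next item available is already the first data row.
next_line = next(lines)
```

Here `line` still refers to the second comment line and `next_line` is "1 2 3"; the "x y z" line is not reachable from either name.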
I would propose that
    for line in csvfile while line.startswith('#'):
        headers.append(line)
would result in 'line' referencing the item that failed the while predicate.
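The proposed behaviour can be approximated in current Python with a small wrapper. This is only a sketch: TakeWhileKeeping and its `failed` attribute are hypothetical names invented here, not anything in itertools, and the sample lines are made up.

```python
class TakeWhileKeeping:
    """Iterate while predicate holds; remember the first item that fails it.

    Hypothetical helper for illustration only.
    """

    def __init__(self, predicate, iterable):
        self._predicate = predicate
        self._iterator = iter(iterable)
        self.failed = None  # stays None if the predicate never fails

    def __iter__(self):
        for item in self._iterator:
            if not self._predicate(item):
                self.failed = item  # keep the item takewhile would discard
                return
            yield item


csvfile = iter(["# data.csv", "# notes", "x y z", "1 2 3", "4 5 6"])

comment_lines = TakeWhileKeeping(lambda line: line.startswith('#'), csvfile)
headers = [line[1:].strip() for line in comment_lines]

# The first non-comment line is still available after the loop.
fieldnames = comment_lines.failed.split()
rows = [dict(zip(fieldnames, map(int, line.split()))) for line in csvfile]
```

Unlike the proposed syntax, the failing item lives on the wrapper object and not in the loop variable, which is the part a 'while' clause on the for statement would tidy up.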
So:

    for item in generator while is_true(item):
        ...

is equivalent to:

    for item in generator:
        if not is_true(item):
            break
        ...

By similar reasoning(?):

    for item in generator if is_true(item):
        ...

is equivalent to:

    for item in generator:
        if not is_true(item):
            continue
        ...

If we have one, shouldn't we also have the other? If only comprehensions have the 'if' form (IIRC, it has already been rejected for multi-line 'for' loops), then shouldn't only comprehensions have the 'while' form?
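To make the break/continue contrast concrete, here is a small runnable sketch in current Python. The is_true predicate and the sample data are assumptions chosen for illustration; the break form is compared against itertools.takewhile and the continue form against a comprehension with an 'if' clause.

```python
from itertools import takewhile


def is_true(item):
    # Hypothetical predicate, chosen only for illustration.
    return item < 3


items = [1, 2, 5, 1, 0]

# 'while'-style: stop at the first item that fails the predicate.
stop_at_first_failure = []
for item in items:
    if not is_true(item):
        break
    stop_at_first_failure.append(item)

# 'if'-style: skip failing items but keep iterating.
skip_failures = []
for item in items:
    if not is_true(item):
        continue
    skip_failures.append(item)

# Spellings that already exist today.
via_takewhile = list(takewhile(is_true, items))
via_comprehension = [item for item in items if is_true(item)]
```

With this data the break form yields [1, 2] while the continue form yields [1, 2, 1, 0], so the two clauses would differ only in whether the loop ends or carries on after a failed test.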