Re: [Python-ideas] Is this PEP-able? for X in ListY while conditionZ:

July 1, 2013

On 01/07/2013 23:44, Oscar Benjamin wrote:
...
On 1 July 2013 21:29, David Mertz <mertz@gnosis.cx> wrote:
...
However, I see the point made by a number of people that the 'while' clause
has no straightforward translation into an unrolled loop, and is probably
ruled out on that basis.
My thought (in keeping with the title of the thread) is that the comprehension

     data = [x for y in stuff while z]

would unroll as the loop

     for y in stuff while z:
         data.append(x)

which would also be valid syntax and have the obvious meaning. This is
similar to Nick's suggestion that 'break if' be usable in the body of
the loop so that

     data = [x for y in stuff; break if not z]

would unroll as

     for y in stuff:
         break if not z
         data.append(x)

Having a while clause on for loops is not just good because it saves a
couple of lines but because it clearly separates the flow control from
the body of the loop (another reason I dislike 'break if'). In other
words I find the flow of the loop

     for p in primes() while p < 100:
         print(p)

easier to understand (immediately) than

     for p in primes():
         if p >= 100:
             break
         print(p)

These are just trivially small examples. As the body of the loop grows
in complexity the readability benefit of moving 'if not z: break' into
the top line becomes more significant.
You can get the same separation of concerns using takewhile at the
expense of a different kind of readability

     from itertools import takewhile

     for p in takewhile(lambda p: p < 100, primes()):
         print(p)

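For comparison, both spellings that work today can be run side by side. This is a minimal sketch, assuming a naive trial-division primes() generator; the thread never defines primes(), so that function is purely illustrative.

```python
from itertools import count, takewhile


def primes():
    """Yield primes forever by trial division (illustrative, not efficient)."""
    found = []
    for n in count(2):
        # n is prime if no previously found prime divides it.
        if all(n % p for p in found):
            found.append(n)
            yield n


# Flow control inside the loop body, using an explicit break.
with_break = []
for p in primes():
    if p >= 100:
        break
    with_break.append(p)

# Flow control moved into the loop header via takewhile.
with_takewhile = list(takewhile(lambda p: p < 100, primes()))
```

Both lists hold the same primes below 100; the difference is only where the stopping condition is written.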
However there is another problem with using takewhile in for loops
which is that it discards an item from the iterable. Imagine parsing a
file such as:
csvfile = '''# data.csv
# This file begins with an unspecified number of header lines.
# Each header line begins with '#'.
# I want to keep these lines but need to parse them separately.
# The first non-comment line contains the column headers
x y z
1 2 3
4 5 6
7 8 9'''.splitlines()
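You can do

     def parse(csvfile):  # wrapper so that 'yield' is valid; the name is illustrative
         csvfile = iter(csvfile)
         headers = []
         for line in csvfile:
             if not line.startswith('#'):
                 break
             headers.append(line[1:].strip())
         # 'line' now holds the first non-comment line: the column headers
         fieldnames = line.split()
         for line in csvfile:
             yield {name: int(val) for name, val in zip(fieldnames, line.split())}
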
However if you use takewhile like

     for line in takewhile(lambda line: line.startswith('#'), csvfile):
         headers.append(line[1:].strip())

then after the loop 'line' holds the last comment line. The discarded
column header line is gone and cannot be recovered; takewhile is
normally only used when the entire remainder of the iterator is to be
discarded.
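The loss is easy to observe in current Python. The sketch below uses a shortened, made-up version of the file above and checks what is left in the iterator once the takewhile loop finishes.

```python
from itertools import takewhile

# Shortened stand-in for the csvfile lines above (illustrative data).
lines = iter(["# data.csv", "# second header", "x y z", "1 2 3", "4 5 6"])

headers = []
for line in takewhile(lambda s: s.startswith("#"), lines):
    headers.append(line[1:].strip())

# The loop variable still refers to the last item that passed the predicate.
last_seen = line

# takewhile pulled "x y z" from the iterator to test it, then dropped it.
remaining = list(lines)
```

After the loop, remaining starts at "1 2 3": the column header line was consumed by takewhile and is not available anywhere.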
I would propose that

     for line in csvfile while line.startswith('#'):
         headers.append(line)

would result in 'line' referencing the item that failed the while predicate.
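Something close to this behaviour can be approximated today. The following is only a sketch: TakeWhileKeep is a hypothetical helper, not part of itertools, and it exposes the failing item as an attribute rather than rebinding the loop variable as the proposed syntax would.

```python
class TakeWhileKeep:
    """Hypothetical helper: like takewhile, but remembers the item that failed."""

    def __init__(self, predicate, iterable):
        self._predicate = predicate
        self._iterator = iter(iterable)
        self.failed = None  # set to the first item that fails the predicate

    def __iter__(self):
        for item in self._iterator:
            if not self._predicate(item):
                self.failed = item  # keep the item instead of discarding it
                return
            yield item


# Shortened stand-in for the csvfile lines (illustrative data).
lines = iter(["# data.csv", "# second header", "x y z", "1 2 3", "4 5 6"])

header_lines = TakeWhileKeep(lambda s: s.startswith("#"), lines)
headers = [line[1:].strip() for line in header_lines]

# The column header line survives because the helper stored it.
fieldnames = header_lines.failed.split()
rows = [dict(zip(fieldnames, map(int, line.split()))) for line in lines]
```

The proposed 'for ... while' syntax would make the helper unnecessary, since the loop variable itself would be left holding the column header line.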
So:

     for item in generator while is_true(item):
         ...

is equivalent to:

     for item in generator:
         if not is_true(item):
             break
         ...

By similar reasoning(?):

     for item in generator if is_true(item):
         ...

is equivalent to:

     for item in generator:
         if not is_true(item):
             continue
         ...

If we have one, shouldn't we also have the other?

If only comprehensions have the 'if' form (IIRC, it has already been
rejected for multi-line 'for' loops), then shouldn't only
comprehensions have the 'while' form?
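
For what it's worth, both equivalences can be checked against what already exists. In this sketch is_true() and the sample data are made up for illustration; the 'while' form is compared with itertools.takewhile and the 'if' form with a filtering comprehension.

```python
from itertools import takewhile


def is_true(item):
    """Stand-in predicate (an assumption for this sketch): accept small numbers."""
    return item < 3


data = [1, 2, 5, 1, 7, 2]

# 'while' form: stop at the first item that fails the test.
stop_with_break = []
for item in data:
    if not is_true(item):
        break
    stop_with_break.append(item)

# 'if' form: skip items that fail the test and carry on.
skip_with_continue = []
for item in data:
    if not is_true(item):
        continue
    skip_with_continue.append(item)

via_takewhile = list(takewhile(is_true, data))
via_filter = [item for item in data if is_true(item)]
```

The break loop matches takewhile and the continue loop matches the comprehension's 'if' clause, which is the symmetry the question above is about.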

MRAB