On 21.11.2014 00:51, Guido van Rossum wrote:
On Thu, Nov 20, 2014 at 2:39 PM, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
[...] Hmm, I'm not convinced by these toy examples, but I did inspect some of my own code for incompatibility with the proposed change. I found that there really is only one recurring pattern I use that I'd have to change, and that is how I've implemented several file parsers. I tend to write them like this:
    def parser(file_object):
        while True:
            title_line = next(file_object) # will terminate after the last record
            try:
                # read and process the rest of the record here
            except StopIteration:
                # this record is incomplete
                raise OSError('Invalid file format')
            yield processed_record
There's probably something important missing from your examples. The above while-loop is equivalent to
    for title_line in io_object:
        ...
My reason for not using a for loop here is that I'm trying to read from a file where several lines form a record, so I'm reading the title line of a record (and if there is no record in the file any more, I want the parser generator to terminate/return). If a title line is read successfully, then I'm reading the record's body lines inside a try/except, i.e. where it says "# read and process the rest of the record here" in my shortened code, I am actually calling next several times again to retrieve the body lines (and while reading these lines an unexpected StopIteration in the IOWrapper is considered a file format error).

I realize that I could also use a for loop and still call next(file_object) inside it, but I find this a potentially confusing pattern that I'm trying to avoid by using the while loop and all explicit next(). Compare:

    for title_line in file_object:
        record_body = next(file_object)
        # in reality record_body is generated using several next calls
        # depending on the content found in the record body while it's read
        yield (title_line, record_body)

vs

    while True:
        title_line = next(file_object)
        body = next(file_object)
        yield (title_line, body)

To me, the for loop version suggests that the content of file_object is read in line by line by the loop (even though the name title_line tries to hint at this not being true). Only when I inspect the loop body do I see that further items are retrieved with next() and, thus, skipped in the for iteration. The while loop, on the other hand, makes the number of iterations very clear by showing all of them in the loop body.

Would you agree that this is justification enough for while instead of for, or is it only me who thinks that a for loop makes the code read awkwardly?
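For illustration, here is a minimal runnable sketch of the while-loop pattern that does not rely on a StopIteration escaping the generator. The name parse_records, the two-line record layout (one title line, one body line) and the io.StringIO sample data are assumptions made for this example, not the real parsers discussed here.

```python
import io


def parse_records(file_object):
    """Yield (title, body) tuples from a file of two-line records.

    Hypothetical layout: each record is one title line plus one body line.
    """
    while True:
        try:
            title_line = next(file_object)
        except StopIteration:
            # No further record in the file: end the generator explicitly.
            return
        try:
            body_line = next(file_object)
        except StopIteration:
            # A title line without its body means the record is incomplete.
            raise OSError('Invalid file format') from None
        yield title_line.rstrip('\n'), body_line.rstrip('\n')


# Sample input, made up for the example.
complete = io.StringIO('title1\nbody1\ntitle2\nbody2\n')
records = list(parse_records(complete))
```

With a truncated input such as io.StringIO('title1\nbody1\ntitle2\n'), the same generator would raise OSError while reading the second record.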
If you're okay with getting RuntimeError instead of OSError for an undesirable StopIteration, you can just drop the except clause altogether.
Right, I could do this if the PEP-described behavior was in effect today.
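A small sketch of what dropping the except clause would look like once the PEP-described behavior is active (the default from Python 3.7 on, or via `from __future__ import generator_stop` in 3.5 and 3.6). The name read_pairs and the list-based sample input are hypothetical, and the title-line read is guarded explicitly here because an unguarded next() at that point would be converted to RuntimeError as well.

```python
def read_pairs(lines):
    """Yield (title, body) pairs; hypothetical two-line record layout."""
    it = iter(lines)
    while True:
        try:
            title_line = next(it)
        except StopIteration:
            # Clean end of input: stop the generator.
            return
        # No except clause here: a StopIteration raised by this call
        # leaves the generator body and is turned into RuntimeError.
        body_line = next(it)
        yield title_line, body_line


# Sample input with an incomplete last record, made up for the example.
try:
    list(read_pairs(['title1', 'body1', 'title2']))
except RuntimeError:
    outcome = 'RuntimeError'
else:
    outcome = 'no error'
```

The trade-off is the one named above: the incomplete record surfaces as a generic RuntimeError rather than as an OSError with a format-specific message.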