[Python-ideas] PEP 479: Change StopIteration handling inside generators

Guido van Rossum guido at python.org
Fri Nov 21 16:39:40 CET 2014


On Fri, Nov 21, 2014 at 2:19 AM, Wolfgang Maier
<wolfgang.maier at biologie.uni-freiburg.de> wrote:

> On 21.11.2014 00:51, Guido van Rossum wrote:
>
>> On Thu, Nov 20, 2014 at 2:39 PM, Wolfgang Maier
>> <wolfgang.maier at biologie.uni-freiburg.de> wrote:
>>
>>     [...]
>>     Hmm, I'm not convinced by these toy examples, but I did inspect some
>>     of my own code for incompatibility with the proposed change. I found
>>     that there really is only one recurring pattern I use that I'd have
>>     to change and that is how I've implemented several file parsers. I
>>     tend to write them like this:
>>
>>     def parser(file_object):
>>         while True:
>>             # will terminate after the last record
>>             title_line = next(file_object)
>>
>>             try:
>>                 # read and process the rest of the record here
>>                 ...
>>             except StopIteration:
>>                 # this record is incomplete
>>                 raise OSError('Invalid file format')
>>             yield processed_record
>>
>> There's probably something important missing from your examples. The
>> above while-loop is equivalent to
>>
>>      for title_line in file_object:
>>          ...
>>
>>
> My reason for not using a for loop here is that I'm trying to read from a
> file where several lines form a record, so I'm reading the title line of a
> record (and if there is no record left in the file, I want the parser
> generator to terminate/return). If a title line is read successfully, then
> I'm reading the record's body lines inside a try/except, i.e. where it says
> "# read and process the rest of the record here" in my shortened code I am
> actually calling next() several more times to retrieve the body lines (and
> while reading these lines an unexpected StopIteration in the IOWrapper is
> considered a file format error).
> I realize that I could also use a for loop and still call
> next(file_object) inside it, but I find this a potentially confusing
> pattern that I'm trying to avoid by using the while loop with explicit
> next() calls throughout. Compare:
>
> for title_line in file_object:
>     record_body = next(file_object)
>     # in reality record_body is generated using several next calls
>     # depending on the content found in the record body while it's read
>     yield (title_line, record_body)
>
> vs
>
> while True:
>     title_line = next(file_object)
>     body = next(file_object)
>     yield (title_line, body)
>
> To me, the for loop version suggests that the content of file_object
> is read in line by line by the loop (even though the name title_line tries
> to hint at this not being true). Only when I inspect the loop body do I see
> that further items are retrieved with next() and, thus, skipped in the for
> iteration. The while loop, on the other hand, makes the number of
> next() calls very clear by showing all of them in the loop body.
>
> Would you agree that this is justification enough for while instead of for,
> or is it only me who thinks that a for loop makes the code read awkwardly?
>

Now that you have explained it I see your point.
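
To make the pattern concrete, here is a minimal sketch of such a multi-line
record parser. The record layout (one title line followed by a fixed number of
body lines, BODY_LINES) is a hypothetical stand-in, not your actual file
format. The title read is wrapped in its own try/except so that the sketch
also terminates cleanly under the semantics proposed in the PEP.

```python
import io

BODY_LINES = 2  # hypothetical layout: one title line, then two body lines


def parser(file_object):
    """Yield (title_line, body_lines) tuples from an iterator of lines."""
    while True:
        try:
            title_line = next(file_object)
        except StopIteration:
            return  # no further record: end the generator cleanly

        try:
            # Every body line is fetched with an explicit next() call.
            body = []
            for _ in range(BODY_LINES):
                body.append(next(file_object))
        except StopIteration:
            # The input ended in the middle of a record.
            raise OSError('Invalid file format')
        yield title_line, body
```

With io.StringIO standing in for a real file, a complete input yields one
tuple per record, and an input that ends inside a record raises OSError.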


>> If you're okay with getting RuntimeError instead of OSError for an
>> undesirable StopIteration, you can just drop the except clause altogether.
>>
>
> Right, I could do this if the PEP-described behavior was in effect today.
>

So shouldn't you be voting *for* the PEP?
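
A rough sketch of what dropping the except clause would look like, assuming
two-line records (a hypothetical format) and an interpreter where the PEP's
behavior is in effect (the default from Python 3.7 on). The function name
parse_records is made up for illustration.

```python
import io


def parse_records(file_object):
    """Yield (title_line, body_line) pairs; no except clause around the body."""
    while True:
        try:
            title_line = next(file_object)
        except StopIteration:
            return  # clean end of input: no more records

        # No try/except here. If the input ends mid-record, the StopIteration
        # raised by next() escapes the generator body, and under PEP 479 it
        # is turned into RuntimeError instead of silently ending the
        # generator.
        body = next(file_object)
        yield title_line, body
```

A truncated input such as io.StringIO("t1\nb1\nt2\n") then surfaces as a
RuntimeError in the caller rather than as a silently shortened result.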

-- 
--Guido van Rossum (python.org/~guido)