[Python-Dev] [Python-checkins] cpython (2.7): Multiple clean-ups to the docs for builtin functions.

Thu Jun 2 02:37:52 CEST 2011

On 6/1/2011 6:50 PM, raymond.hettinger wrote:

> -      with open("mydata.txt") as fp:
> -          for line in iter(fp.readline, "STOP"):
>                process_line(line)

As noted on the tracker, this will always loop forever. Even if "STOP"
is corrected to "STOP\n", it will still loop forever if the file does 
not have the magic value.

> +      with open('mydata.txt') as fp:
> +          for line in iter(fp.readline, ''):
>                process_line(line)

While I have a big problem with infinite loops, I have a (smaller) 
problem with useless code. The new loop line is a roundabout way to 
spell "for line in fp". While this would have been useful before files 
were turned into iterables (if there was such a time after iter() was 
introduced), it is not now.

What might be useful and what the example could show is how to stop 
early if a sentinel is present, while still stopping when the iterable 
runs out.  The following alternate fix to the original does just that.

with open("mydata.txt") as fp:
     for line in iter(fp.__next__, "STOP\n"):
          process_line(line)

I tested a runnable variation with and without the exact sentinel. It 
generalizes to *any* iteration that one might want to stop early.
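
For anyone who wants to try this without a file on disk, here is a 
minimal sketch of such a variation. It assumes io.StringIO objects as 
stand-ins for mydata.txt, and collect_until_stop is a made-up helper 
name used only for illustration.

```python
import io

def collect_until_stop(fp, sentinel="STOP\n"):
    # iter(callable, sentinel) keeps calling fp.__next__ until it returns
    # the sentinel or raises StopIteration, whichever comes first.
    return list(iter(fp.__next__, sentinel))

with_sentinel = io.StringIO("a\nb\nSTOP\nc\n")
without_sentinel = io.StringIO("a\nb\nc\n")

print(collect_until_stop(with_sentinel))     # ['a\n', 'b\n']
print(collect_until_stop(without_sentinel))  # ['a\n', 'b\n', 'c\n']
```

The first call stops early at the sentinel line; the second stops when 
the stream runs out.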

This still has the objection that the loop could be written as
     for line in fp:
         if line == "STOP\n":
             break
         process_line(line)
which is easily generalized to any stopping conditions.

It is hard to think of useful examples of iter(func, sentinel). To be 
sure of someday stopping, func must someday (either certainly or with 
probability 1) either raise StopIteration or produce the sentinel (which 
should therefore be hard to misspell). To be useful, func should not be 
a method of an iterable (or at least not produce the same objects as the 
iterable would when iterated). It also must produce different values, at 
least sometimes, when called, which means either maintaining internal 
state or pulling values from another source.
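
One candidate that seems to meet these criteria is reading a stream in 
fixed-size blocks. This is only a sketch under my own assumptions: 
read_blocks is a hypothetical helper, and io.BytesIO stands in for a 
real binary file.

```python
import io
from functools import partial

def read_blocks(stream, size):
    # stream.read(size) returns b'' once the data is used up, so the
    # sentinel is certain to appear. The blocks are not what iterating
    # the stream itself would produce (that would be lines).
    return list(iter(partial(stream.read, size), b""))

data = io.BytesIO(b"abcdefghij")
print(read_blocks(data, 4))  # [b'abcd', b'efgh', b'ij']
```

Here the state lives in the stream's position, and the sentinel is an 
empty bytes object rather than a magic string someone could misspell.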

Here is a completely different example which meets these criteria. It 
can actually be run (though not doctested ;-). It uses random.randint to 
produce 25 random waiting times for a binary process to hit one of the 
two values.

from random import randint
for i in range(25):
   print(sum(iter(lambda:randint(0,1), 0)), end=',')

The outer loop could be removed, but it hints at how this could be used 
for empirical probability studies.
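
As a rough sketch of such a study, the following estimates the mean 
waiting time over many trials. The seed, the trial count, and the name 
waiting_time are my own arbitrary choices for illustration.

```python
from random import Random

def waiting_time(rng):
    # Count the 1s drawn before the first 0 ends the iteration.
    return sum(iter(lambda: rng.randint(0, 1), 0))

rng = Random(12345)   # arbitrary seed, only so that runs are repeatable
trials = 10000
mean = sum(waiting_time(rng) for _ in range(trials)) / trials
print(mean)  # should land near 1, the expected number of 1s before a 0
```

With a fair binary process, the count of 1s before the first 0 is 
geometric with expected value 1, so the printed mean should be close to 
that.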

---
Terry Jan Reedy