[Python-3000] Iterators for dict keys, values, and items == annoying :)

Fri Mar 24 04:24:12 CET 2006

Guido van Rossum wrote:
> On 3/23/06, Ian Bicking <ianb at colorstudy.com> wrote:
>> Guido van Rossum wrote:
>>> But this is only needed if *all you have* is the iterator. Most of the
>>> time, the code containing the for loop has access to the container,
>>> and the iterator is only instantiated by the __iter__() call implied
>>> by the for loop.
>> I don't think that is the case.  For instance:
>>
>> def non_empty_lines(seq):
>>      for line in seq:
>>          if line.strip() and not line.strip().startswith('#'):
>>              yield line
>>
>> for line in non_empty_lines(open('config.txt')):
>>      ...
>>
>> I think wrapping the iterator in non_empty_lines() shouldn't cause you
>> to have to rewrite your logic too radically.
> 
> Radically compared to what?

The code starts out as:

if os.stat(filename).st_size: # weird, but your suggestion ;)
     with open(filename) as lines:
         for line in lines:
             read_config(line)
else:
     get_default_config()

After adding handling for comments and empty lines, the code becomes:

with open(filename) as lines:
     empty = True
     for line in non_empty_lines(lines):
         empty = False
         read_config(line)
     if empty:
         get_default_config()

To me that feels like a big transformation, where I would prefer to just 
be able to use "non_empty_lines(lines)" in place of "lines" and 
everything would work perfectly.  If I started out with the second 
example instead of the first, it *would* work perfectly.  But I don't do 
so.  If that second example looked just a little nicer I would use that 
form, and then there wouldn't be any problem.  It really doesn't matter 
for this case if the test comes before (using __nonzero__) or after the 
for loop (using a did-that-loop-run flag).
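One way to keep the test-before-the-loop shape when all you have is an iterator is to peek at the first item and then chain it back on. This is only a sketch: peek_nonempty() and load_config() are hypothetical names, read_config and get_default_config are passed in as callables purely for illustration, and next() as a builtin assumes a newer Python than the one under discussion here.

```python
from itertools import chain


def peek_nonempty(iterable):
    """Return (has_items, iterator) without losing the first item.

    Hypothetical helper: it consumes one item to test for emptiness,
    then chains that item back in front of the remaining ones.
    """
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        return False, iter(())
    return True, chain([first], it)


def non_empty_lines(seq):
    """Skip blank lines and comment lines, as in the example above."""
    for line in seq:
        stripped = line.strip()
        if stripped and not stripped.startswith('#'):
            yield line


def load_config(lines, read_config, get_default_config):
    """Test for emptiness first, then loop, even though `lines` is lazy."""
    has_items, items = peek_nonempty(non_empty_lines(lines))
    if not has_items:
        get_default_config()
        return
    for line in items:
        read_config(line)
```

The cost is one item of look-ahead, which is harmless for a file but may matter for iterators where fetching an item has side effects.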

Of course, no syntax comes to mind to improve this.  Repurposing the 
else clause in for loops seems like it just adds to the confusion of an 
already confusing construct.  If you could somehow count how many times 
the loop had run, that'd work great; but I don't see any way to do that 
without new syntax.
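The closest approximation I can think of with existing syntax is to let enumerate() carry the count. This is a rough sketch rather than a real answer: process() is a hypothetical name, the two callables are placeholders, and it assumes a Python version where enumerate() accepts a start argument (2.6 and later).

```python
def process(lines, read_config, get_default_config):
    """Run read_config on each line; fall back to defaults if none ran.

    `count` is pre-set to 0 so it is still defined (and zero) when the
    loop body never executes.
    """
    count = 0
    for count, line in enumerate(lines, 1):
        read_config(line)
    if count == 0:
        get_default_config()
    return count
```

It still needs the extra initialisation line before the loop, so it is really the did-that-loop-run flag in a slightly different form.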

>> More generally, I find
>> myself using list() fairly often lately as generators have become more
>> popular, and it's not just with SQLObject.  Testing for the existence of
>> any items in the iterator (is that a better way of saying it than
>> empty?) is often the reason.
> 
> If creating a copy of all items using list() is not a problem, then
> you shouldn't have been using iterators in the first place. Iterators
> exist so you can efficiently handle cases where list() would overflow
> memory. If you don't have such cases, you should just design your APIs
> to return lists in the first place.

You've just made the argument that dict.keys() should return a list ;) 
Or a view would work just as well, I suppose.  Maybe you've just made 
the argument that it should return an iterable, not an iterator.
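To make the iterable-versus-iterator distinction concrete, here is a small sketch of how it plays out in Python 3 as eventually released, where dict.keys() returns a view. The config dict is made up for illustration.

```python
config = {"host": "localhost", "port": 8080}

# A view is an iterable: it can be tested for emptiness and looped
# over as many times as you like.
keys_view = config.keys()
view_is_nonempty = bool(keys_view)
first_pass = list(keys_view)
second_pass = list(keys_view)

# An iterator is one-shot: after the first pass it yields nothing,
# and its truth value says nothing about whether items remain.
keys_iter = iter(config)
iter_first_pass = list(keys_iter)
iter_second_pass = list(keys_iter)
iter_truthy_when_exhausted = bool(keys_iter)

# An empty dict's view is falsy, so "if d.keys():" works as a test.
empty_view_is_falsy = not {}.keys()
```

With the view, the emptiness test and the loop can both use the same object, which is exactly what the bare iterator cannot offer.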

-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org