[Python-3000] Iterators for dict keys, values, and items == annoying :)
Ian Bicking
ianb at colorstudy.com
Fri Mar 24 02:00:17 CET 2006
Guido van Rossum wrote:
>>In SQLObject it came about due to a desire to lazily load objects out of
>>a query. The lazy behavior had other problems (mostly introducing
>>concurrency where you wouldn't expect). In addition, the query is only
>>run when you start iterating. I'm not sure if that is good or bad
>>design -- that queries are iterable doesn't seem that bad, except that
>>the query is only invoked with iter() and that doesn't give very good
>>access to the actual executed-query object; it's all too implicit.
>
>
> I'm becoming more and more doubtful about the design of SQLobject;
> perhaps it's just not a good example since the issues seem to be
> caused by its specific design more than by the language features it's
> using.
I'm just outlining the specific problems I found looking back on the
design there, where I tried some of these techniques, with different
levels of success or frustration. I haven't argued that those decisions
were all good decisions.
>>I don't know if the same issues exist for .items/.keys; I guess it would
>>only be an issue if you passed one of the iterators to some routine that
>>didn't have access to the original dict.
>
>
> But again that's an API design issue -- if the routine needed to know
> ahead of time whether the underlying collection was empty it should be
> given access to the collection. OTOH if you have an API that knows it
> can be given *any* iterator, then the "empty" flag pattern that I
> mentioned earlier is the only reliable way to differentiate between an
> empty and a non-empty container. (Note that I refuse to say "empty
> iterator"!)
Empty iterator or iterator that produced no items -- from the outside
it's the same use case.
Iterators look a lot like containers. Often I only use a list by
iterating over it; if that's all I do then I can't tell the difference. At
that point it is ambiguous. I'm not even sure if a "sequence" means a
list-like object or an iterable. That's ambiguous too. So I'm only
pointing out an existing ambiguity, and a place where that ambiguity
causes problems.
Right now this is how I would iterate over a container, special-casing
an empty container:
    if container:
        for item in container: ...
    else:
        ...
In this case I am testing if the container is empty, and this generally
works. Then an iterator is introduced, and my code breaks. So, I have
to choose -- do I convert the iterator to a container with list() (and
maybe needlessly copy a container), or do I switch to only using the
iterable aspect of the container, like:
    empty = True
    for item in container:
        empty = False
        ...
    if empty:
        ...
If using the iterable interface in this case felt as natural as using
the container interface, then I'd probably have used the iterable form
from the beginning and I wouldn't have a problem. But it doesn't feel
as natural, so I don't.
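
To make the breakage concrete, here is a minimal sketch of the two forms side by side. The names handle_truthy, handle_flag and no_items are hypothetical, made up only for this illustration. It shows a list behaving the same under both checks, while an empty generator slips past the truth test because a generator object is always true:

```python
def handle_truthy(container):
    """Container-style check: relies on the object's truth value."""
    if container:
        return [item for item in container]
    else:
        return "empty"


def handle_flag(iterable):
    """Iterable-style check: relies only on iteration plus a flag."""
    empty = True
    seen = []
    for item in iterable:
        empty = False
        seen.append(item)
    if empty:
        return "empty"
    return seen


def no_items():
    """A generator that produces no items (hypothetical example)."""
    return
    yield  # unreachable; its presence makes this function a generator


# An empty list is handled correctly by both forms.
print(handle_truthy([]))
print(handle_flag([]))

# A generator object is always true, so the truth test takes the
# "non-empty" branch and returns an empty list instead of "empty".
print(handle_truthy(no_items()))
print(handle_flag(no_items()))
```

The flag form works for both kinds of argument, but it needs the extra variable and the trailing check, which is the part that doesn't feel natural.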
I can't say *everyone* makes the same choice as me, so I am using the
first person in this argument. But I think most people do the same as I
do, and so because the language does not make the iterable form very
pretty it causes people to use the container interface (i.e.,
__nonzero__) even though they don't really need to.
>>The identical problem does exist for all generators. Using ad hoc flags
>>in for loops isn't a great solution. It's all somewhat similar to the
>>repr() problem as well.
>
>
> Not all generators. A fair number of generators are methods on
> collections that implement various iterators.
>
> OTOH generators are one of the reasons that the iterator protocol is
> as restricted as it is.
I'm not arguing for adding __nonzero__ to iterators, only for addressing
this use case where currently I make use of __nonzero__. Or,
alternately, having whatever d.keys() returns implement __nonzero__, or
otherwise be an iterable and not an iterator.
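
As a rough sketch of that last alternative, assume a hypothetical KeysView wrapper. The class name, its methods and the describe helper are illustrative only, not an existing or proposed API. The object is an iterable rather than an iterator: it hands out a fresh iterator on each call and also answers the truth test, so the container-style code keeps working:

```python
class KeysView:
    """Hypothetical re-iterable view over a dict's keys (illustration only)."""

    def __init__(self, d):
        self._d = d

    def __iter__(self):
        # Each call returns a fresh iterator, so the view is never consumed.
        return iter(self._d)

    def __bool__(self):
        # The truth value reflects whether the underlying dict has any keys.
        return len(self._d) > 0

    __nonzero__ = __bool__  # the same hook under its Python 2 name


def describe(keys):
    """Container-style code that special-cases the empty case."""
    if keys:
        return sorted(keys)
    else:
        return "empty"


print(describe(KeysView({})))
print(describe(KeysView({"b": 1, "a": 2})))
```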
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org