[Python-3000] Iterators for dict keys, values, and items == annoying :)
Nick Coghlan
ncoghlan at gmail.com
Wed Mar 29 13:15:35 CEST 2006
Paul Moore wrote:
> On 3/29/06, Brett Cannon <brett at python.org> wrote:
>> Without a direct reason in terms of the language needing a
>> standardization of an interface, perhaps we just don't need views. If
>> people want their iterator to have a __len__ method, then fine, they
>> can add it without breaking anything, just realize it isn't part of
>> the iterator protocol and thus may limit what objects a function can
>> accept, but that is there choice.
>
> Good point. I think we need to start from strong use cases. With
> these, I agree that the view concept is a good implementation
> technique to consider. But let's not implement views just for the sake
> of having them - I'm pretty sure that was never Guido's intention.
There are three big use cases:
dict.keys
dict.values
dict.items
Currently these all return lists, which may be expensive in terms of copying.
They all have iter* variants which while memory efficient, are far less
convenient to work with.
For Py3k, the intent is to have only one version which produces a view with
the memory efficiency of an iterator, but the convenience of a list.
To give these views the benefits of having a real list, the following is all
that's really needed:
1. implement __len__ (allows bool() and len() to work)
- all delegate to dict.__len__
2. implement __contains__ (allows containment tests to work)
- delegate to dict.__contains__ for dict.keys()
- use (or fallback to) linear search for dict.values()
- check "dict[item[0]] == item[1]" for dict.items()
3. implement __iter__ (allows iteration to work)
- make iter(dict.keys()) equivalent to current dict.iterkeys()
- make iter(dict.values()) equivalent to current dict.itervalues()
- make iter(dict.items()) equivalent to current dict.iteritems()
For an immutable view, that's all you need. IOW, take the iterable protocol
(an __iter__ that returns a new iterator when invoked) and add __len__ and
__contains__ to get a "container" protocol. Given that containment falls back
on __iter__ anyway, __len__ is the only essential addition to turn an iterable
into a container.
Note that adding __len__ to an *iterator* does NOT give you something that
would satisfy such a container protocol - invoking __iter__ again does not
give you a fresh iterator, so you can't easily iterate repeatedly.
With reiterability as a defining characteristic, other niceties become
possible (potentially available as a mixin):
1. a generic container __str__ (not __repr__!) implementation:
def __str__(self):
# keep default __repr__ since eval(repr(x)) won't round trip
name = self.__name__
guts = ", ".join(repr(x) for x in self)
return "%s([%s])" % guts
2. generic container value based equality testing:
def __eq__(self, other):
if len(self) != len(other):
return False
for this, that in izip(self, other):
if this != that:
return False
return True
Further refinement of such a container protocol to the minimal requirements
for a sequence protocol is already defined by such things as the requirements
of the reversed() builtin:
for i, x in enumerate(seq):
assert seq[i] == x
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org
More information about the Python-3000
mailing list