Help with sets
robert.kern at gmail.com
Mon Oct 11 17:09:19 CEST 2010
On 10/11/10 6:11 AM, Lawrence D'Oliveiro wrote:
> In message<8h9ob9FkurU1 at mid.individual.net>, Gregory Ewing wrote:
>> Lawrence D'Oliveiro wrote:
>>> Did you know that applying the “set” or “frozenset” functions to a dict
>>> return a set of its keys?
>>> Seems a bit dodgy, somehow.
>> That's just a consequence of the fact that dicts produce their
>> keys when iterated over, and the set constructor iterates over
>> whatever you give it.
> Hmm. It seems that “iter(<dict>)” iterating over the keys has been around a
> long time. But a dict has both keys and values: why are language constructs
> treating them so specially as to grab the keys and throw away the values?
Language constructs are not treating anything specially, much less "throwing
away" anything. The language construct in question does exactly the same thing
with every object nowadays: call the .__iter__() method to get the iterator and
call .next() on that iterator until it raises StopIteration. It is the
responsibility of the dict object itself to decide how it wants to be iterated over.
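
A minimal sketch of that protocol, written for Python 3, where the iterator method is spelled __next__() and is reached through the built-in next() rather than .next(). The names manual_iterate and Pairs are made up for illustration; they are not part of any library.

```python
def manual_iterate(obj):
    """Collect items roughly the way a for loop does: iter(), then next() until StopIteration."""
    it = iter(obj)                 # calls obj.__iter__()
    out = []
    while True:
        try:
            out.append(next(it))   # calls it.__next__()
        except StopIteration:
            break
    return out


class Pairs:
    """Hypothetical mapping-like object that chooses to iterate over (key, value) pairs."""

    def __init__(self, data):
        self._data = dict(data)

    def __iter__(self):
        return iter(self._data.items())


d = {"a": 1, "b": 2}
keys_seen = manual_iterate(d)   # the dict hands out its keys
as_set = set(d)                 # set() consumes the same iterator, so it sees keys
pairs_set = set(Pairs(d))       # this object decided to hand out pairs instead
```

The set constructor does nothing dict-specific in either case; the difference comes entirely from what each object's __iter__() returns.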
The reasoning for this decision is spelled out in the PEP introducing the
iterator protocol:
- There has been a long discussion about whether
    for x in dict: ...
should assign x the successive keys, values, or items of the
dictionary. The symmetry between "if x in y" and "for x in y"
suggests that it should iterate over keys. This symmetry has been
observed by many independently and has even been used to "explain"
one using the other. This is because for sequences, "if x in y"
iterates over y comparing the iterated values to x. If we adopt
both of the above proposals, this will also hold for dictionaries.
The argument against making "for x in dict" iterate over the keys
comes mostly from a practicality point of view: scans of the
standard library show that there are about as many uses of "for x
in dict.items()" as there are of "for x in dict.keys()", with the
items() version having a small majority. Presumably many of the
loops using keys() use the corresponding value anyway, by writing
dict[x], so (the argument goes) by making both the key and value
available, we could support the largest number of cases. While
this is true, I (Guido) find the correspondence between "for x in
dict" and "if x in dict" too compelling to break, and there's not
much overhead in having to write dict[x] to explicitly get the value.
For fast iteration over items, use "for key, value in
dict.iteritems()". I've timed the difference between
    for key in dict: dict[key]
and
    for key, value in dict.iteritems(): pass
and found that the latter is only about 7% faster.
Resolution: By BDFL pronouncement, "for x in dict" iterates over
the keys, and dictionaries have iteritems(), iterkeys(), and
itervalues() to return the different flavors of dictionary iterators.
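
To illustrate the symmetry the PEP describes, here is a short sketch in Python 3 spelling, where items() plays the role that iteritems() had in the Python 2 of the quoted text. The dictionary contents and variable names are arbitrary examples.

```python
d = {"spam": 1, "eggs": 2}

# "for x in d" and "x in d" both talk about keys.
looped = [x for x in d]
every_looped_is_member = all(x in d for x in looped)
value_is_member = 1 in d    # False: membership tests consult keys, not values

# Fetching the value explicitly with d[k], versus iterating over pairs.
via_lookup = [(k, d[k]) for k in d]
via_items = [(k, v) for k, v in d.items()]
```

Both spellings produce the same pairs; the second just avoids the extra lookup per key.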
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco