[Python-Dev] Re: Sets: elt in dict, lst.include
Guido van Rossum
guido@digicool.com
Mon, 29 Jan 2001 09:48:22 -0500
> [Ping]
> > dict[key] = 1
> > if key in dict: ...
> > for key in dict: ...
>
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense. This is a reversal of my
> > previous opinion on this matter. (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
>
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
>
> It does not do the "for key in dict" part. It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>). In any case, nobody is working on that part.
>
> WRT that part, Ping produced some stats in:
>
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
>
> > How often do you write 'dict.has_key(x)'? (std lib says: 206)
> > How often do you write 'for x in dict.keys()'? (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'? (std lib says: 0)
> > How often do you write 'for x in dict.values()'? (std lib says: 3)
>
> However, he did not report on occurrences of
>
> for k, v in dict.items()
>
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained. So I don't know how this compares: I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS. A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment. After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items(). In all:
>
> 153 iterating over x.items()
> 118 " over x.keys()
> 17 " over x.values()
>
> So I conclude that iterating over .items() is significantly more common
> than iterating over .keys().
I did a less sophisticated count but came to the same conclusion:
iterations over items() are (somewhat) more common than over keys(),
and values() are 1-2 orders of magnitude less common. My numbers:
$ cd python/src/Lib
$ grep 'for .*items():' *.py | wc -l
47
$ grep 'for .*keys():' *.py | wc -l
43
$ grep 'for .*values():' *.py | wc -l
2
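A rough Python equivalent of the grep counts above, as a sketch only. The helper names (count_loop_lines, count_in_directory) and the sample lines are made up for illustration; the regular expression mirrors the grep pattern 'for .*items():' and, like grep piped to wc -l, counts matching lines rather than individual loops.

```python
import re
from pathlib import Path


def count_loop_lines(lines, method):
    """Count lines matching the grep pattern 'for .*METHOD():'."""
    pattern = re.compile(r"for .*" + re.escape(method) + r"\(\):")
    return sum(1 for line in lines if pattern.search(line))


def count_in_directory(directory, method):
    """Apply count_loop_lines to every *.py file directly inside directory."""
    total = 0
    for path in sorted(Path(directory).glob("*.py")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            total += count_loop_lines(handle, method)
    return total


# Hypothetical sample input, standing in for real library source.
sample = [
    "for k, v in d.items():",
    "for k in d.keys():",
    "x = d.items()",                  # assignment, not a loop: not counted
    "    for name in table.keys():",
    "for v in d.values():",
]
counts = {m: count_loop_lines(sample, m) for m in ("items", "keys", "values")}
```

Like the grep version, this also counts matches inside comments and strings, so the totals are only approximate.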
> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
>
> for x in dict:
>
> would do, but didn't say what they *did* think it would do. Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.
I don't attach much value to the readability argument: typically, one will
write "for key in dict" or "for name in dict" and then it's obvious
what is meant.
> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.
But here's my dilemma. "if (k, v) in dict" is clearly useless (nobody
has even asked me for a has_item() method). I can live with "x in
list" checking the values and "x in dict" checking the keys. But I
can *not* live with "x in dict" equivalent to "dict.has_key(x)" if
"for x in dict" would mean "for x in dict.items()". I also think that
defining "x in dict" but not "for x in dict" will be confusing.
So we need to think more.
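The consistency requirement in the paragraph above can be stated as a check: every x produced by "for x in c" should also satisfy "x in c". The sketch below uses two hypothetical wrapper classes (KeysMapping and ItemsMapping, invented for illustration) to show that the property holds when iteration yields keys and fails when iteration yields items while membership still tests keys.

```python
class KeysMapping:
    """Hypothetical wrapper: membership and iteration both use keys."""

    def __init__(self, data):
        self._data = dict(data)

    def __contains__(self, x):
        return x in self._data.keys()

    def __iter__(self):
        return iter(self._data.keys())


class ItemsMapping(KeysMapping):
    """Hypothetical wrapper: membership uses keys, iteration yields items."""

    def __iter__(self):
        return iter(self._data.items())


def iteration_agrees_with_membership(container):
    """True if everything the container iterates over is also 'in' it."""
    return all(x in container for x in container)


data = {"a": 1, "b": 2}
consistent = iteration_agrees_with_membership(KeysMapping(data))
inconsistent = iteration_agrees_with_membership(ItemsMapping(data))
```

With ItemsMapping the loop variable is a (key, value) tuple, which is not itself a key, so the check fails.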
How about:
for key in dict: ... # ... over keys
for key:value in dict: ... # ... over items
This is syntactically unambiguous (a colon is currently illegal in
that position).
This also suggests:
for index:value in list: ... # ... over zip(range(len(list)), list)
which doesn't strike me as bad or ugly, and would fulfill my brother's
dearest wish.
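For reference, here is a sketch of what each proposed loop form would iterate over, written with calls that already exist. The colon forms are a proposal only and are not valid syntax, so they appear here only in comments; the variable names and data are hypothetical.

```python
# Hypothetical data, for illustration only.
ages = {"ping": 1, "tim": 2}
letters = ["a", "b", "c"]

# Proposed: for key in dict: ...
keys_seen = [key for key in ages.keys()]

# Proposed: for key:value in dict: ...
items_seen = [(key, value) for key, value in ages.items()]

# Proposed: for index:value in list: ...
indexed = [(index, value)
           for index, value in zip(range(len(letters)), letters)]
```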
(And why didn't we think of this before?)
--Guido van Rossum (home page: http://www.python.org/~guido/)