[Python-Dev] Re: Sets: elt in dict, lst.include

Guido van Rossum guido@digicool.com
Mon, 29 Jan 2001 09:48:22 -0500


> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .items() is significantly more common
> than iterating over .keys().

I did a less sophisticated count but came to the same conclusion:
iterations over items() are (somewhat) more common than over keys(),
and values() are 1-2 orders of magnitude less common.  My numbers:

$ cd python/src/Lib
$ grep 'for .*items():' *.py | wc -l
     47
$ grep 'for .*keys():' *.py | wc -l
     43
$ grep 'for .*values():' *.py | wc -l
      2
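
For anyone who wants to repeat the count without grep, here is a rough
Python sketch. The regexes simply mirror the three grep patterns above; the
function names count_loops and count_loops_in_dir are made up for
illustration, and like grep | wc -l this counts matching lines, not
individual matches.

```python
import re
from pathlib import Path

# Same patterns as the grep commands: "for ", anything, then the method call and a colon.
PATTERNS = {
    name: re.compile(r"for .*" + name + r"\(\):")
    for name in ("items", "keys", "values")
}


def count_loops(lines):
    """Count the lines matching each pattern, like `grep PATTERN | wc -l`."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def count_loops_in_dir(directory):
    """Apply count_loops to every *.py file directly inside `directory`."""
    totals = {name: 0 for name in PATTERNS}
    for path in sorted(Path(directory).glob("*.py")):
        with open(path, encoding="utf-8", errors="replace") as f:
            for name, n in count_loops(f).items():
                totals[name] += n
    return totals
```

Calling count_loops_in_dir("python/src/Lib") would correspond to the three
commands above; the numbers will of course depend on which source tree is
scanned.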

> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.

I don't attach much value to the readability argument: typically, one will
write "for key in dict" or "for name in dict" and then it's obvious
what is meant.

> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
has even asked me for a has_item() method).  I can live with "x in
list" checking the values and "x in dict" checking the keys.  But I
can *not* live with "x in dict" equivalent to "dict.has_key(x)" if
"for x in dict" would mean "for x in dict.items()".  I also think that
defining "x in dict" but not "for x in dict" will be confusing.
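
The consistency requirement can be made concrete with a small sketch. The
two wrapper classes and the consistent() helper below are hypothetical,
written only to illustrate the argument: one class makes both "in" and
"for" work on keys, the other tests keys but iterates over items.

```python
class KeysDict:
    """Hypothetical wrapper: both `in` and `for` work on keys."""

    def __init__(self, data):
        self._data = dict(data)

    def __contains__(self, x):
        return x in self._data.keys()   # membership tests keys

    def __iter__(self):
        return iter(self._data.keys())  # iteration yields keys


class ItemsDict(KeysDict):
    """Hypothetical variant: `in` tests keys, but `for` yields (key, value) pairs."""

    def __iter__(self):
        return iter(self._data.items())


def consistent(container):
    """True if everything yielded by `for x in container` also passes `x in container`."""
    return all(x in container for x in container)
```

With the first class, every x produced by the loop also satisfies
"x in container"; with the second, the loop produces (key, value) tuples
that fail the membership test, which is the mismatch described above.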

So we need to think more.

How about:

    for key in dict: ...		# ... over keys

    for key:value in dict: ...		# ... over items

This is syntactically unambiguous (a colon is currently illegal in
that position).

This also suggests:

    for index:value in list: ...	# ... over zip(range(len(list)), list)

which doesn't strike me as bad or ugly, and would fulfill my brother's
dearest wish.
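
Since the colon forms are only a proposal, here is a sketch of what they
would stand for, spelled with calls that already exist. The helper names
dict_pairs and list_pairs are invented for illustration and are not part
of the proposal.

```python
def dict_pairs(d):
    """Pairs that `for key:value in d` would iterate over, i.e. d.items()."""
    return list(d.items())


def list_pairs(lst):
    """Pairs that `for index:value in lst` would iterate over."""
    return list(zip(range(len(lst)), lst))


# Proposed: for key:value in d: ...
for key, value in dict_pairs({"spam": 1, "eggs": 2}):
    print(key, value)

# Proposed: for index:value in lst: ...
for index, value in list_pairs(["a", "b", "c"]):
    print(index, value)
```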

(And why didn't we think of this before?)

--Guido van Rossum (home page: http://www.python.org/~guido/)