[Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods
Terry Reedy
tjreedy at udel.edu
Sun Apr 20 06:03:58 CEST 2014
On 4/19/2014 10:52 AM, Guido van Rossum wrote:
> Does everyone involved know that "for x in d.iterkeys()" is equivalent
> to "for x in d"
Looking at uses I found by searching code.ohloh.net, the answer is
either 'no, people sometimes add a redundant .iterkeys()' or 'people
are writing non-dict mapping classes for which it is not redundant'
(perhaps because their custom class iterates by items rather than by
keys by default). I could not always tell which from the quoted snippets.
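To illustrate the second case, here is a minimal sketch of such a
mapping class (hypothetical; not one of the actual hits) whose default
iteration yields items, so that .iterkeys() is not redundant:

class ItemMapping:
    # Hypothetical mapping-like class that iterates by (key, value)
    # pairs by default, unlike dict.
    def __init__(self, data):
        self._data = dict(data)
    def __iter__(self):
        return iter(self._data.items())
    def iterkeys(self):
        return iter(self._data.keys())

m = ItemMapping({'a': 1, 'b': 2})
list(m)             # pairs, e.g. [('a', 1), ('b', 2)]
list(m.iterkeys())  # keys, e.g. ['a', 'b']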
> and works the same in Python 2 and 3? Similarly,
> "list(d)" is a simple, fast way to spell the Python 2 semantics of
> "d.keys()" that works in both versions (but I doubt it is much needed
> -- usually the actual code follows up with sorting, so you should use
> sorted(d)).
>
> This doesn't solve itervalues() and iteritems() but I expect those are
> less common,
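For reference, the 2-and-3-compatible spellings from the quoted text
above, collected in one sketch:

d = {'b': 2, 'a': 1}

for x in d:           # iterates keys; same as d.iterkeys() on 2, d.keys() on 3
    pass

keys = list(d)        # Python 2 semantics of d.keys(): a new list of keys
ordered = sorted(d)   # ['a', 'b'] -- usually what the list was wanted for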
ohloh gives about 77,000 Python hits for iteritems and 16,000 for
itervalues. A large fraction of the itervalues hits are definitions
rather than uses, often from a compat.py like this one (is this from six?):
import sys

if sys.version_info[0] >= 3:
    text_type = str
    string_types = str,
    iteritems = lambda o: o.items()
    itervalues = lambda o: o.values()
    izip = zip
else:
    text_type = unicode
    string_types = basestring,
    iteritems = lambda o: o.iteritems()
    itervalues = lambda o: o.itervalues()
    from itertools import izip
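Code that imports these shims can then run unchanged on both versions;
for example (a sketch, assuming the snippet above lives in a module
named compat):

from compat import iteritems, itervalues   # 'compat' is the hypothetical
                                           # module holding the snippet above

def invert(d):
    return dict((v, k) for k, v in iteritems(d))

def total(d):
    return sum(itervalues(d))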
That snippet is three hits for iteritems and three for itervalues, and
none for the unneeded iterkeys. My guess is that there are 5,000
itervalues uses and 70,000 iteritems uses.
There are 1,500,000 Python hits for 'for', some unknown fraction of
which are 'for key in somedict' or 'for key in somedict.keys()'. There
are 13,000 for iterkeys. As noted above, this count is *not* inflated by
three hits for each copy of compat.py. I think that 10%, or 150,000, of
the 'for' hits being iterations by key might be a reasonable guess.
There are other definition sets that include iterkeys, or that define
functions that wrap all three bound methods for a particular dict:

iterkeys = lambda: d.iterkeys()  # py2
iterkeys = lambda: d.keys()      # py3
> and "for x, y in d.iteritems(): <blah>" is rewritten nicely as
>
> for x in d:
>     y = d[x]
>     <blah>
>
> If there is a measurable slowdown in the latter I would be totally okay
> with some kind of one-element cache for the most recent lookup.
About three weeks ago, Raymond opened http://bugs.python.org/issue21101
with this claim: "It is reasonably common to make two successive
dictionary accesses with the same key." I proposed specialized
caching as an alternative to adding new C API functions.
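For concreteness, a pure-Python sketch of the sort of one-entry cache
Guido describes (illustrative only; the tracker discussion is about
doing this at the C level, and this is not the proposal there):

class CachingDict(dict):
    # Illustrative only: remember the most recent successful lookup so
    # that an immediately repeated access with the same key skips the
    # second search.
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self._last = None                 # (key, value) or None

    def __getitem__(self, key):
        last = self._last
        if last is not None and last[0] == key:
            return last[1]
        value = dict.__getitem__(self, key)
        self._last = (key, value)
        return value

    def __setitem__(self, key, value):
        dict.__setitem__(self, key, value)
        self._last = None                 # any mutation invalidates the cache
                                          # (__delitem__ etc. would need the same)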
Using the iteritems compatibility function, there is only one extra
function call for the entire loop. If the body of the loop takes at
least as long as that one call, the extra time is a non-issue once the
dict has more than, say, 20 items.
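One way to check this on a given machine (a sketch; the exact numbers
will of course vary):

import timeit

d = dict.fromkeys(range(20))
iteritems = lambda o: o.items()   # the py3 branch of the shim above

# run this as a script so that __main__ holds d and iteritems
setup = "from __main__ import d, iteritems"
print(timeit.timeit("for k, v in d.items(): pass", setup=setup, number=100000))
print(timeit.timeit("for k, v in iteritems(d): pass", setup=setup, number=100000))
# The only difference is the single extra call to iteritems per loop.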
--
Terry Jan Reedy