[Python-3000] Spooky behavior of dict.items() and friends

David Pokorny dbpokorny at gmail.com
Wed Apr 2 23:22:33 CEST 2008


On Wed, Apr 2, 2008 at 11:36 AM, Guido van Rossum <guido at python.org> wrote:
>  The problem is that if you make the slow and fool-proof implementation
>  the common name, you'll have to invent another name for the fast (but
>  sometimes less convenient) method. This is what we ended up doing in
>  Python 2.2 with iterkeys() and friends. Unfortunately, despite your
>  assertion, most people think their code should run as fast as
>  possible, and hence we see a great proliferation of iterkeys() calls.
>  So the fast-but-requiring-care implementation becomes more popular
>  than the slow-but-simple version, and now we have a duplication of
>  APIs.
>
>  I'd much rather have a single API that can be made to serve everyone
>  equally. I predict that list(x.keys()) will remain a rarity (except in
>  code converted by 2to3). However sorted(x.keys()) will become a
>  well-known idiom, and it's a much better one than the old idiom
>
>   keys = x.keys()
>   keys.sort()
>
>  which doesn't lend itself easily to use in an expression.
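
For concreteness, the two idioms being compared can be sketched as
follows. This is only an illustration under Python 3 semantics; the dict
contents and the variable names are made up, and list() is needed in the
first form because a keys view has no sort() method.

```python
# Hypothetical example data, not from the original thread.
x = {"pear": 3, "apple": 1, "fig": 2}

# Old two-statement idiom: materialize the keys, then sort in place.
old_style = list(x.keys())
old_style.sort()

# Newer idiom: sorted() accepts any iterable and returns a new list,
# so it can be used directly inside a larger expression.
new_style = sorted(x.keys())
joined = ", ".join(sorted(x.keys()))
```

Both forms give the same list; only the second fits inside an expression
such as the join() call.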

I agree that most people think their code should run as fast as
possible, but in this particular case, common practice and common
sense diverge. If 80% of one's code makes only a negligible
contribution to performance, then clearly there is no need to optimize
it, but an average programmer will do it anyway. I imagine the best
programmers would probably do it too under social pressure. This is
not entirely fair, but one could say this change encourages average
programmers to keep their bad habits.

I understand the appeal of having a single API, but in this particular
case, there are two arguably distinct use cases: getting the keys of a
dict and iterating over the keys of the dict. One could change syntax
so that "getting the keys" would be spelled "x.keys()" and "iterating
over the keys" would look like

for k in keys of x:
  ...

or ','.join(k for k in keys of x)

(This would make 'keys' both a keyword and an identifier, so my
understanding is that this would entail a change in the architecture of
the tokenizer and maybe the parser as well; this strikes me as a
temporary objection rather than an intrinsic one.)

To elaborate on the impact on programmers new to Python, I find

>>> x = {1:2}
>>> x.keys()
{1}

much more appealing than

>>> x.keys()
<dict_keys object at 0xb7d475b0>
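
As a rough sketch of why the second form can surprise a newcomer, here is
how a keys view differs from a snapshot of the keys. It assumes Python 3
view semantics; the names view, snapshot, live and as_set are only for
illustration.

```python
x = {1: 2}

view = x.keys()            # a view: stays tied to the dict
snapshot = list(x.keys())  # a list: a copy of the keys at this moment

x[3] = 4                   # modify the dict afterwards

live = sorted(view)        # the view reflects the new key
as_set = set(x.keys())     # an explicit set of the current keys
```

After the assignment, the view sees both keys while the snapshot still
holds only the original one.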

I taught (Scheme) programming to high school students once, and I know
from experience that they try absolutely everything (most of it wrong
of course) because there are a million other confusing things to
learn. I think there is a real value in making a simple operation on a
core type as straightforward as possible.

David

