[Python-3000] Spooky behavior of dict.items() and friends

Alex Martelli aleaxit at gmail.com
Wed Apr 2 16:58:45 CEST 2008


On Tue, Apr 1, 2008 at 11:39 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
   ...
>  > keys = mydict.keys()
>  > keys.sort()
>  > for key in keys:
>
>  That is indeed a frequent case in 2.x. Fortunately, it is what David
>  calls "loud" breakage:
>
>  py> keys = mydict.keys()
>  py> keys.sort()
>  Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>  AttributeError: 'dict_keys' object has no attribute 'sort'
>
>  > In fact, the 2.5 standard library turns up 3 occurrences of
>  > "keys.sort". Given that that's just the ones that used the obvious
>  > name for the list to be sorted
>  >
>  > Nowadays, I tend to write
>  >
>  > keys = sorted(mydict.keys())   # Yeah, I know, .keys() is redundant...
>  > for key in keys:
>  >
>  > or maybe
>  >
>  > for key in sorted(mydict):
>  >
>  > both of which are probably slower than the original version unless
>  > sorted switches to an insertion sort if passed a generator.
>
>  Notice that this isn't affected by the "spookiness" of dict.keys()
>  at all - it just works fine.
>
>  Why do you think this version is slower? It behaves exactly the
>  same as the original code: a list is created with all the keys,
>  and then that list is sorted, with the list-sort algorithm.

Indeed, at least with Python 2.5, any difference in performance is
more or less in the noise:

$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'k=d.keys(); k.sort()' 'for x in k: pass'
10000 loops, best of 3: 24 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'k=d.keys(); k.sort()' 'for x in k: pass'
10000 loops, best of 3: 21.9 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'for x in sorted(d): pass'
10000 loops, best of 3: 22.8 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'for x in sorted(d): pass'
10000 loops, best of 3: 22.6 usec per loop
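
For anyone wanting to repeat the comparison from a script rather than
the shell, here is a minimal sketch using the timeit module. It assumes
Python 3.6 or later, so the old idiom is adapted with list() (keys() no
longer returns a list there); the helper name best_usec is made up for
this example, and the numbers will of course differ per machine.

```python
import timeit

# Same setup as the shell commands: a dict with 99 integer keys.
setup = "d = dict.fromkeys(range(99))"

# Old idiom, adapted for Python 3 (assumption: list() stands in for the
# list that d.keys() used to return in 2.x).
old_idiom = "k = list(d.keys()); k.sort()\nfor x in k: pass"

# New idiom: sorted() builds and sorts the list in one call.
new_idiom = "for x in sorted(d): pass"


def best_usec(stmt, number=10000, repeat=3):
    """Return the best per-loop time in microseconds (hypothetical helper)."""
    times = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(times) / number * 1e6


old_usec = best_usec(old_idiom)
new_usec = best_usec(new_idiom)
print(f"old idiom: {old_usec:.2f} usec per loop")
print(f"new idiom: {new_usec:.2f} usec per loop")
```

Taking the minimum over the repeats mirrors the "best of 3" figure that
the command-line tool reports.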

So the "old" cumbersome idiom (though it still comes naturally to those
who started using Python before it had a `sorted' builtin) should IMHO
be discouraged -- the new one is compactly readable and higher-level,
yet roughly equivalent performance-wise.  IOW, this use case counts as
a PLUS for d.keys() NOT returning a list!-)
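
To make the two idioms concrete, a small sketch follows, assuming a
Python 3 interpreter where keys() returns a view. The sample dict is
invented for illustration only.

```python
mydict = {"pear": 3, "apple": 1, "fig": 2}  # hypothetical sample data

# Old idiom, unchanged: fails loudly because the keys view has no sort().
try:
    keys = mydict.keys()
    keys.sort()
    old_idiom_error = None
except AttributeError as exc:
    old_idiom_error = str(exc)

# Old idiom with an explicit list, which still works.
keys = list(mydict)
keys.sort()

# New idiom: one expression, same resulting order.
new_keys = sorted(mydict)

print(old_idiom_error)
print(keys == new_keys)
```

Both working versions build a list of all the keys and sort it with the
list-sort algorithm, so they end up with the same result.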


Alex

