[Python-3000] Spooky behavior of dict.items() and friends
Alex Martelli
aleaxit at gmail.com
Wed Apr 2 16:58:45 CEST 2008
On Tue, Apr 1, 2008 at 11:39 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
...
> > keys = mydict.keys()
> > keys.sort()
> > for key in keys:
>
> That is indeed a frequent case in 2.x. Fortunately, it is what David
> calls "loud" breakage:
>
> py> keys = mydict.keys()
> py> keys.sort()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> AttributeError: 'dict_keys' object has no attribute 'sort'
>
> > In fact, the 2.5 standard library turns up 3 occurrences of
> > "keys.sort". Given that that's just the ones that used the obvious
> > name for the list to be sorted
> >
> > Nowadays, I tend to write
> >
> > keys = sorted(mydict.keys()) # Yeah, I know, .keys() is redundant...
> > for key in keys:
> >
> > or maybe
> >
> > for key in sorted(mydict):
> >
> > both of which are probably slower than the original version unless
> > sorted switches to an insertion sort if passed a generator.
>
> Notice that this isn't affected by the "spookiness" of dict.keys()
> at all - it just works fine.
>
> Why do you think this version is slower? It behaves exactly the
> same as the original code: a list is created with all the keys,
> and then that list is sorted, with the list-sort algorithm.
Indeed, at least with Python 2.5, any difference in performance is
more or less in the noise:
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'k=d.keys(); k.sort()' 'for x in k: pass'
10000 loops, best of 3: 24 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'k=d.keys(); k.sort()' 'for x in k: pass'
10000 loops, best of 3: 21.9 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'for x in sorted(d): pass'
10000 loops, best of 3: 22.8 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'for x in sorted(d): pass'
10000 loops, best of 3: 22.6 usec per loop
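For anyone who wants to repeat the comparison from a script rather than the shell, here is a minimal sketch using the standard timeit module. It assumes Python 3, where d.keys() has to be wrapped in list() before it can be sorted in place; the variable names are illustrative only, and the numbers will of course vary by machine and interpreter version.

```python
import timeit

# Same setup as the command-line runs above: a dict with 99 integer keys.
setup = "d = dict.fromkeys(range(99))"

# Old idiom, adapted for Python 3: build a list of keys, then sort it in place.
old_idiom = "k = list(d.keys()); k.sort()\nfor x in k: pass"

# New idiom: sorted() builds and sorts the list in a single call.
new_idiom = "for x in sorted(d): pass"

number = 10000
old_best = min(timeit.repeat(old_idiom, setup=setup, repeat=3, number=number))
new_best = min(timeit.repeat(new_idiom, setup=setup, repeat=3, number=number))

# Report the best of 3 in microseconds per loop, as the timeit CLI does.
print("old idiom: %.1f usec per loop" % (old_best / number * 1e6))
print("new idiom: %.1f usec per loop" % (new_best / number * 1e6))
```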
So the "old" cumbersome idiom (though it still comes naturally to those
who started using Python before it had a `sorted' builtin) should IMHO
be discouraged -- the new one is compactly readable and higher-level,
yet roughly equivalent performance-wise. IOW, this use case counts as
a PLUS for d.keys() NOT returning a list!-)
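To make the contrast concrete, here is a small sketch, assuming Python 3 semantics for d.keys(). The dict contents and the names loud_failure and ordered are made up for illustration. It shows the old idiom failing loudly with AttributeError, while the sorted() idiom gives the keys in order.

```python
# Hypothetical sample data, for illustration only.
mydict = {"pear": 3, "apple": 1, "fig": 2}

# Old idiom: in Python 3, keys() returns a view with no sort() method,
# so this fails right away instead of misbehaving silently.
keys = mydict.keys()
try:
    keys.sort()
    loud_failure = False
except AttributeError:
    loud_failure = True

# New idiom: iterate over the sorted keys directly.
ordered = []
for key in sorted(mydict):
    ordered.append(key)

print(loud_failure)
print(ordered)
```

The sorted() form does not depend on what d.keys() returns, which is why it is unaffected by the change.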
Alex