PEP 265 - Sorting dicts by value
Hi all,

Quick pep265 summary: People frequently want to count the occurrences of values in a dict, or sort the results of a d.items() call by value. This could be done by extending the current items() definition, or by creating a new function for the dict object (both requiring a C implementation).

I've read through pep265 a few times now, and every time I've had two immediate reactions. First, that I've been there too: I've found myself innumerable times needing to count the occurrences of values in a dict. Second, however, that dicts shouldn't be naturally sortable. A dict does not guarantee the order of the results returned by calls such as items(), keys() or values(). It's my feeling that we should not encourage people to rely on a dict returning a set ordering, since dicts, as a hash-based data structure, are designed for key lookup, not sequential traversal. If you want to sort something, massage the data into a list and then sort the list (I've seen a proposal before that the sort function be able to handle objects, which would allow sorting of 2-dimensional lists).

With regard to the two arguments put forward by Grant, the first - that it is an idiom known only to experienced campaigners - does not seem to be a supportable argument to me. I think the problem has quite a simple, elegant solution which is rather easily discovered - there are lots of differences in Python that require an inexperienced programmer to learn a new idiom (such as the looping construct).

The second, that the solution is full of 'grunge', seems a matter of taste and use to me. As mentioned in the pep, there are different kinds of comparison that may be wanted but could not be supported. Further, it is a natural use case of a dict that the items held within it need not be of the same type (which makes the idea of a comparison between them meaningless).

With respect to implementation suggestions, numbers 1, 2 and 3 definitely don't work for me. To extend the usage of items() without similarly extending the usage of keys() and values() would mean that we are special-casing the items() function in a way that makes it inconsistent with the other dict functions. Number 5 seems too specific to me. I could live with 4 ;-)

In the end it's my feeling that these kinds of idioms belong in the cookbook, where this one already is to a certain extent, under 'Sorting a Dictionary'; another recipe could always be added for this ;-)

cheers
Dave Harrison
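For readers who want the counting use case spelled out, here is a minimal sketch of "massage the data into a list and then sort the list". The dict `prefs` and its contents are made up for illustration, and the code is written to run on current Python rather than the 2004 interpreter.

```python
# Hypothetical data: each person's preferred language.
prefs = {"ann": "python", "bob": "c", "cat": "python", "dan": "perl"}

# Count how often each value occurs, using only a plain dict.
counts = {}
for value in prefs.values():
    counts[value] = counts.get(value, 0) + 1

# Turn the counts into a list of (count, value) pairs and sort the list,
# instead of expecting the dict itself to come back in any order.
pairs = [(n, value) for value, n in counts.items()]
pairs.sort(reverse=True)

most_common = pairs[0]  # the (count, value) pair with the highest count
```

Ties on the count are broken by comparing the values themselves, which only works when the values are mutually comparable.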
On Mon, 13 Sep 2004 11:34:06 +1000, David Harrison <dave.l.harrison@gmail.com> wrote:
With regards to the two arguments put forward by Grant, the first - that it is an idiom known only to experienced campaigners - does not seem to be a supportable argument to me. I think the problem has quite a simple elegant solution which is rather easily discovered - there are lots of differences in Python that require an inexperienced programmer to learn a new idiom (such as the looping construct).
And of course, it is better to teach these idioms to newbies so they become competent. A python newbie will be much better off if they learn the decorate, sort, [undecorate] idiom and list comprehensions, neither of which are particularly difficult; and both will serve the newbie well in many other areas.
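A minimal sketch of the decorate, sort, undecorate idiom written with list comprehensions may help here. The helper name `sort_items_by_value` is made up for illustration; it is not part of the PEP or of the dict API.

```python
def sort_items_by_value(d):
    """Return d's (key, value) pairs ordered by value (hypothetical helper)."""
    # Decorate: put the sort criterion (the value) first in each tuple.
    decorated = [(value, key) for key, value in d.items()]
    # Sort: tuples compare element by element, so values are compared first.
    decorated.sort()
    # Undecorate: restore the usual (key, value) layout.
    return [(key, value) for value, key in decorated]


result = sort_items_by_value({"a": 3, "b": 1, "c": 2})
```

When two values are equal the comparison falls through to the keys, so both keys and values need to be comparable for this simple form to work.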
With respect to implementation suggestions, numbers 1 2 and 3 definitely don't work for me. To extend the usage of items() without similarly extending the usage of keys() and values() would mean that we are special casing the items() function in a way that makes it inconsistent with the other dict functions. Number 5 seems too specific to me. I could live with 4 ;-)
To quote the PEP:

"""
Alternatively, items() could simply let us control the (key, value) order:

(3) items(values_first=0)
"""

This suggestion No. 3 from the PEP does not special-case the items() function in a way that makes it "inconsistent with the other dict functions" (i.e. keys(), values()); however, it would suggest that dict() then also ought to take such an inverted, values-first list of tuples if given an optional values_first parameter. But this IMHO makes the dict() constructor too complicated, as well as having a potential conflict with named keywords.
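To make the round-trip concern concrete, here is a rough sketch under stated assumptions. Both helper functions are hypothetical stand-ins written for this illustration: `items_values_first` imitates what the proposed items(values_first=1) might return, and `dict_from_values_first` imitates the inverse that dict() would then be expected to offer.

```python
def items_values_first(d):
    """Hypothetical stand-in for the proposed items(values_first=1)."""
    return [(value, key) for key, value in d.items()]


def dict_from_values_first(pairs):
    """Hypothetical inverse: rebuild a dict from (value, key) pairs."""
    return dict((key, value) for value, key in pairs)


ud = {"a": 1, "b": 2, "c": 3}
swapped = sorted(items_values_first(ud))

# The plain constructor treats the first element of each pair as the key,
# so it silently builds the inverted mapping.
naive = dict(swapped)

# Only an explicit inverse gets the original dict back.
roundtrip = dict_from_values_first(swapped)

# dict() already turns keyword arguments into string keys, which is where a
# values_first keyword on the constructor would collide.
keyword_clash = dict(values_first=1)
```

The last line shows the existing behaviour: the keyword becomes an ordinary entry in the new dict, so the constructor could not tell an option from data.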
David Harrison wrote:
Hi all,
Quick pep265 summary : People frequently want to count the occurrences of values in a dict, or sort the results of a d.items() call by value. This could be done by extending the current items() definition, or by creating a new function for the dict object (both requiring a C implementation).
In Python 2.4:

>>> ud = dict(a=1, b=2, c=3)
>>> from operator import itemgetter
>>> print sorted(ud.items(), key=itemgetter(1), reverse=True)
[('c', 3), ('b', 2), ('a', 1)]

I'm not entirely sure who needs to be thanked for this addition, but it sure makes the 'decorate-sort-undecorate' idiom very, very easy to follow (which was, in fact, the point - I do remember that much of the discussion).

I think the addition of 'sorted', and the keyword arguments for both it and list.sort, make PEP 265 somewhat redundant.

Cheers,
Nick.
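The session above uses the Python 2 print statement. As a sketch for readers on a current interpreter, the same calls work unchanged apart from print, and list.sort accepts the same key and reverse keyword arguments; the variable names `by_value` and `pairs` are only illustrative.

```python
from operator import itemgetter

ud = dict(a=1, b=2, c=3)

# sorted() builds a new list; itemgetter(1) picks the value out of each
# (key, value) pair to use as the sort key.
by_value = sorted(ud.items(), key=itemgetter(1), reverse=True)

# list.sort() takes the same keyword arguments and sorts in place.
pairs = list(ud.items())
pairs.sort(key=itemgetter(1), reverse=True)

print(by_value)
```

Because only the value is used as the key, the dict keys are never compared with each other, unlike in the tuple-swapping form of the idiom.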
Quick pep265 summary : People frequently want to count the occurrences of values in a dict, or sort the results of a d.items() call by value. This could be done by extending the current items() definition, or by creating a new function for the dict object (both requiring a C implementation).
In Python 2.4:
>>> ud = dict(a=1, b=2, c=3)
>>> from operator import itemgetter
>>> print sorted(ud.items(), key=itemgetter(1), reverse=True)
[('c', 3), ('b', 2), ('a', 1)]
I'm not entirely sure who needs to be thanked for this addition, but it sure makes the 'decorate-sort-undecorate' idiom very, very easy to follow (which was, in fact, the point - I do remember that much of the discussion).
I think the addition of 'sorted', and the keyword arguments for both it and list.sort make PEP 265 somewhat redundant.
Seems like another solution to the problem, which makes this pep even less meaningful, I'd say. Guess this pep should be closed then?
participants (3): Andrew Durdin, David Harrison, Nick Coghlan