list.sort with an int or str key

list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:

    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)

I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:

    mylist.sort(key=1)
    myotherlist.sort(key='length')

I find the latter to be much more readable. As a bonus, performance for those cases would also improve.

-- Daniel Stutzbach <http://stutzbachenterprises.com>

On Thu, 16 Sep 2010 10:35:14 -0500 Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:
    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)
I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:
    mylist.sort(key=1)
    myotherlist.sort(key='length')
-1

I think the idiom using the operator module tools:

    mylist.sort(key=itemgetter(1))
    mylist.sort(key=attrgetter('length'))

is more readable than your proposal - it makes what's going on explicit.

<mike
-- Mike Meyer <mwm@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

On Thu, Sep 16, 2010 at 8:35 AM, Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:
    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)
I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:
    mylist.sort(key=1)
    myotherlist.sort(key='length')
I find the latter to be much more readable.
-1. I think this is too cryptic.
As a bonus, performance for those cases would also improve.
Have you measured this? Remember that the key function is only called N times while the number of comparisons (using the values returned from the key function) is O(N log N).

--Guido van Rossum (python.org/~guido)
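One way to check the point about N key calls versus O(N log N) comparisons is to count both. The sketch below is only an illustration: CountedKey and count_sort_work are hypothetical helper names, not anything from the thread, and the exact comparison count depends on the input order.

```python
import random


class CountedKey:
    """Wraps a key value and counts how often two keys are compared."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # sorted() only uses "<" when ordering keys.
        CountedKey.comparisons += 1
        return self.value < other.value


def count_sort_work(data):
    """Return (key_calls, comparisons) for one sorted() call over data."""
    CountedKey.comparisons = 0
    key_calls = 0

    def key(item):
        nonlocal key_calls
        key_calls += 1
        return CountedKey(item)

    sorted(data, key=key)
    return key_calls, CountedKey.comparisons


rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.random() for _ in range(1000)]
key_calls, comparisons = count_sort_work(data)
print(key_calls, comparisons)
```

For shuffled input the key function runs once per element, while the comparison count is several times larger, so a faster key extractor can only speed up the smaller part of the work.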

On 9/16/10 10:35 AM, Daniel Stutzbach wrote:
list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:
    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)
I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:
    mylist.sort(key=1)
    myotherlist.sort(key='length')
I find the latter to be much more readable. As a bonus, performance for those cases would also improve.
I find the latter significantly less readable because they are special cases that I need to remember. Right now, you can achieve the performance and arguably better readability using operator.itemgetter() and operator.attrgetter():

    from operator import attrgetter, itemgetter
    mylist.sort(key=itemgetter(1))
    myotherlist.sort(key=attrgetter('length'))

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
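As a quick check that the operator-module spellings order things the same way as the lambdas from the original message, here is a small runnable sketch. The Segment type and the sample data are made up for illustration.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# Hypothetical record type with a "length" attribute.
Segment = namedtuple("Segment", ["name", "length"])

pairs = [("b", 3), ("a", 1), ("c", 2)]
segments = [Segment("x", 5), Segment("y", 2), Segment("z", 9)]

# Index-based key: lambda versus itemgetter.
by_item_lambda = sorted(pairs, key=lambda x: x[1])
by_item_getter = sorted(pairs, key=itemgetter(1))

# Attribute-based key: lambda versus attrgetter.
by_attr_lambda = sorted(segments, key=lambda x: x.length)
by_attr_getter = sorted(segments, key=attrgetter("length"))

print(by_item_getter)
print([s.name for s in by_attr_getter])
```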

-1

key='length' could reasonably mean lambda a: a.length or lambda a: a['length']. An explicit lambda or itemgetter/attrgetter is clearer.

--- Bruce
http://www.vroospeak.com
http://j.mp/gruyere-security

On Thu, Sep 16, 2010 at 8:35 AM, Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:
    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)
I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:
    mylist.sort(key=1)
    myotherlist.sort(key='length')
I find the latter to be much more readable. As a bonus, performance for those cases would also improve. -- Daniel Stutzbach <http://stutzbachenterprises.com>
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas

On Thu, 16 Sep 2010 10:35:14 -0500 Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:
    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)
I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:
    mylist.sort(key=1)
    myotherlist.sort(key='length')
It is not obvious whether key='length' should use __getitem__ or __getattr__. Your example claims attribute lookup but an indexed lookup would be more consistent with key=1.

I'm quite skeptical towards this. Special cases make things harder to remember, and foreign code more difficult to read.

Regards

Antoine.
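The ambiguity between attribute lookup and item lookup can be made concrete with an object that supports both. This is a contrived sketch: the Record class is hypothetical and is built so that the two lookups disagree on purpose.

```python
from operator import attrgetter, itemgetter


class Record:
    """Hypothetical record whose .length attribute and ['length'] item differ."""

    def __init__(self, length, fields):
        self.length = length
        self._fields = fields

    def __getitem__(self, name):
        return self._fields[name]


records = [
    Record(3, {"length": 10}),
    Record(1, {"length": 30}),
    Record(2, {"length": 20}),
]

# Sorting on the attribute and on the item gives opposite orders here,
# so a bare key='length' would have to pick one meaning silently.
by_attribute = [r.length for r in sorted(records, key=attrgetter("length"))]
by_item = [r.length for r in sorted(records, key=itemgetter("length"))]

print(by_attribute)
print(by_item)
```

With explicit attrgetter or itemgetter the reader can see which lookup is meant.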

Since most everyone else finds it less readable, I withdraw the proposal. Thanks for the feedback, -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>

On Sep 16, 2010, at 8:35 AM, Daniel Stutzbach wrote:
list.sort, sorted, and similar methods currently have a "key" argument that accepts a callable. Often, that leads to code looking like this:
    mylist.sort(key=lambda x: x[1])
    myotherlist.sort(key=lambda x: x.length)
I would like to propose that the "key" parameter be generalized to accept str and int types, so the above code could be rewritten as follows:
    mylist.sort(key=1)
    myotherlist.sort(key='length')
-1 The key= parameter is a protocol that is used across multiple tools: min(), max(), groupby(), nsmallest(), nlargest(), etc. All of those would need to change to stay in sync.
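To illustrate how the same key callable is shared across these tools today, here is a short sketch. The rows data and the variable names are invented for the example; the functions are the standard ones from builtins, heapq, and itertools.

```python
from heapq import nlargest, nsmallest
from itertools import groupby
from operator import itemgetter

# Made-up (name, count) rows for illustration.
rows = [("pear", 3), ("apple", 1), ("fig", 3), ("kiwi", 2)]
count = itemgetter(1)  # one key callable reused everywhere below

smallest = min(rows, key=count)
largest = max(rows, key=count)
two_smallest = nsmallest(2, rows, key=count)
two_largest = nlargest(2, rows, key=count)

# groupby() needs its input already ordered by the same key.
groups = {
    k: [name for name, _ in group]
    for k, group in groupby(sorted(rows, key=count), key=count)
}

print(smallest, largest)
print(two_smallest, two_largest)
print(groups)
```

Any change to what key= accepts in list.sort would have to be mirrored in each of these call sites to keep the protocol uniform.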
I find the latter to be much more readable.
It also becomes harder to learn. Multiple signatures (int or str or other callable) create more problems than they solve.
As a bonus, performance for those cases would also improve.
ISTM, the performance would be about the same as you already get from attrgetter(), itemgetter(), and methodcaller(). Also, those three tools are already more flexible than the proposal, for example:

    attrgetter('lastname', 'firstname')  # key = lambda r: (r.lastname, r.firstname)
    itemgetter(0, 7)                     # key = lambda r: (r[0], r[7])
    methodcaller('get_stats', 'size')    # key = lambda r: r.get_stats('size')

We've already got a way to do it, so the proposal is basically about saving a few characters in exchange for complexifying the protocol with a form of multiple dispatch.

Raymond
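A runnable sketch of the three multi-argument forms follows. The Person type and sample data are invented, itemgetter(0, 1) stands in for itemgetter(0, 7) so the rows can stay short, and str.count stands in for the get_stats method, which is not defined anywhere in the thread.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter, methodcaller

# Hypothetical record type for the attrgetter example.
Person = namedtuple("Person", ["firstname", "lastname"])

people = [
    Person("Grace", "Hopper"),
    Person("Alan", "Turing"),
    Person("Ada", "Hopper"),
]
# Same as key=lambda r: (r.lastname, r.firstname)
by_name = sorted(people, key=attrgetter("lastname", "firstname"))

rows = [(2, "b"), (1, "z"), (1, "a")]
# Same as key=lambda r: (r[0], r[1])
by_columns = sorted(rows, key=itemgetter(0, 1))

words = ["banana", "kiwi", "apple"]
# Same as key=lambda w: w.count("a")
by_a_count = sorted(words, key=methodcaller("count", "a"))

print([p.firstname for p in by_name])
print(by_columns)
print(by_a_count)
```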

On 9/16/2010 2:28 PM, Raymond Hettinger wrote:
The key= parameter is a protocol that is used across multiple tools: min(), max(), groupby(), nsmallest(), nlargest(), etc. All of those would need to change to stay in sync. ...
ISTM, the performance would be about the same as you already get from attrgetter(), itemgetter(), and methodcaller(). Also, those three tools are already more flexible than the proposal, for example:
    attrgetter('lastname', 'firstname')  # key = lambda r: (r.lastname, r.firstname)
    itemgetter(0, 7)                     # key = lambda r: (r[0], r[7])
    methodcaller('get_stats', 'size')    # key = lambda r: r.get_stats('size')
It is easy to not know about these. I think the doc set could usefully use an expanded entry on *key functions* (that would be a cross-reference link) that includes examples like the above. Currently, for example, the min entry has "The optional keyword-only key argument specifies a one-argument ordering function like that used for list.sort()." but there is no link and going to list.sort only adds "that is used to extract a comparison key from each list element: key=str.lower. The default value is None." Perhaps we could expand that and make the existing cross-references into links. -- Terry Jan Reedy

On 2010-09-17, at 08:41 , Terry Reedy wrote:
On 9/16/2010 2:28 PM, Raymond Hettinger wrote:
The key= parameter is a protocol that is used across multiple tools: min(), max(), groupby(), nsmallest(), nlargest(), etc. All of those would need to change to stay in sync. ...
ISTM, the performance would be about the same as you already get from attrgetter(), itemgetter(), and methodcaller(). Also, those three tools are already more flexible than the proposal, for example:
    attrgetter('lastname', 'firstname')  # key = lambda r: (r.lastname, r.firstname)
    itemgetter(0, 7)                     # key = lambda r: (r[0], r[7])
    methodcaller('get_stats', 'size')    # key = lambda r: r.get_stats('size')
It is easy to not know about these. I think the doc set could usefully use an expanded entry on *key functions* (that would be a cross-reference link) that includes examples like the above.
+1, in my experience, the operator module in general is fairly unknown and the attrgetter/itemgetter/methodcaller family criminally so. It doesn't help that they're kind-of lost in a big bunch of text at the very bottom of the module.

ISTM, the performance would be about the same as you already get from attrgetter(), itemgetter(), and methodcaller(). Also, those three tools are already more flexible than the proposal, for example:
    attrgetter('lastname', 'firstname')  # key = lambda r: (r.lastname, r.firstname)
    itemgetter(0, 7)                     # key = lambda r: (r[0], r[7])
    methodcaller('get_stats', 'size')    # key = lambda r: r.get_stats('size')
It is easy to not know about these.
FWIW, those and other sorting related topics are covered in the sorting-howto: http://wiki.python.org/moin/HowTo/Sorting/ We link to that from the main docs for sorted(): http://docs.python.org/library/functions.html#sorted
I think the doc set could usefully use an expanded entry on *key functions*
That might also make a useful entry to the glossary.

Raymond

P.S. I don't know that it applies here but one limitation of the docs is that they can get too voluminous. Already, it is a significant time investment just to read the doc page on builtin functions. You can kill a whole afternoon just reading the docs for unittest and logging. The gestalt of the language gets lost when the docs get too fat. Instead, I like the howto write-ups because they bring together many thoughts on a single topic.

On Fri, Sep 17, 2010 at 1:11 PM, Terry Reedy <tjreedy@udel.edu> wrote:
It is easy to not know about these. I think the doc set could usefully use an expanded entry on *key functions* (that would be a cross-reference link) that includes examples like the above. Currently, for example, the min entry has "The optional keyword-only key argument specifies a one-argument ordering function like that used for list.sort()." but there is no link and going to list.sort only adds "that is used to extract a comparison key from each list element: key=str.lower. The default value is None." Perhaps we could expand that and make the existing cross-references into links.
Tracker issue to capture this idea: http://bugs.python.org/issue9886 Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (10)
- Antoine Pitrou
- Bruce Leban
- Daniel Stutzbach
- Guido van Rossum
- Masklinn
- Mike Meyer
- Nick Coghlan
- Raymond Hettinger
- Robert Kern
- Terry Reedy