[Tutor] Python functions are first-class citizens

Steven D'Aprano steve at pearwood.info
Fri Aug 1 16:38:24 CEST 2014


On Thu, Jul 31, 2014 at 10:12:49PM -0700, memilanuk wrote:

> Been reading a bit more in the mean time, trying to grok that 'key' 
> parameter for max()... and of course the python docs for max(iterable, 
> key=) refer to the docs for list.sort() ;)
> 
> Kind of diverging off the original question a bit... but since it did 
> include the max() code in it... I'm having a bit of difficulty with the 
> whole 'key=' parameter when combined with counts.get here.

To understand the key parameter, it helps to know a bit of history.

Back in the Dark Ages, list.sort() didn't have a key parameter. It did, 
however, have a cmp parameter, but I'm going to ignore that because it 
worked in a completely different way, not relevant to this discussion. 
Suppose you wanted to do a case-insensitive sort. This was one way:

data = ['narwhal', 'cat', 'DOG', 'FOX', 'lion', 'cow', 'APE', 'WALRUS']

# Make a copy of the data, recording the lower-cased values.
data2 = []
for item in data:
    data2.append( (item.lower(), item) )

# Sort this "decorated" version.
data2.sort()

# Remove the lower-cased values.
for i in range(len(data)):
    data[i] = data2[i][1]


which we can write more concisely using list comprehensions:


data = ['narwhal', 'cat', 'DOG', 'FOX', 'lion', 'cow', 'APE', 'WALRUS']
# Decorate:
data = [(value.lower(), value) for value in data]
# Sort:
data.sort()
# Undecorate:
data = [value[1] for value in data]


print data
=> ['APE', 'cat', 'cow', 'DOG', 'FOX', 'lion', 'narwhal', 'WALRUS']



(Wow, that brings back memories...) 

This is called "Decorate/Sort/Undecorate", or DSU, or sometimes 
"Schwartzian Transform" after a Perl programmer, Randal L. Schwartz, who 
popularised it. (The basic idea is *much* older than Perl.)

Now, the important part is the "decorate" step. In this case, we 
decorate the values by calling the str.lower() method on the value. 
That's the only part of the process which differs depending on the data 
being compared, everything else:

* making the tuples (transform(item), item)
* sorting the subsequent list
* extracting out the original items again

can be handled automatically. So much so that around Python 2.3 or 2.4, 
list.sort() and the sorted() function gained the ability to perform DSU 
automatically. All you have to do is provide the transformation 
function, which by tradition is called a "key function", hence the 
argument is called "key".


data = ['narwhal', 'cat', 'DOG', 'FOX', 'lion', 'cow', 'APE', 'WALRUS']
data.sort(key=str.lower)
print data
=> ['APE', 'cat', 'cow', 'DOG', 'FOX', 'lion', 'narwhal', 'WALRUS']


We can apply this DSU idiom to maximum and minimum as well. Suppose we 
want to find the string with the maximum length:

data = ['narwhal', 'cat', 'DOG', 'FOX', 'lion', 'cow', 'APE', 'WALRUS']
# Decorate:
data2 = [(len(s), s) for s in data]
# Find the maximum:
x = max(data2)
# Undecorate:
x = x[1]
print x
=> 'narwhal'


So, max() and min() have also gained a key argument:

max(data, key=len)
=> 'narwhal'

[...]
> Where things are going pear-shaped is how counts.get can function as a 
> 'key' when we don't actually supply () (or anything inside them) to 
> specify what k,v pair we want, and how that relates back to the iterable 
> for max(), counts?

You don't have to provide the parentheses because Python does so for 
you.



-- 
Steven


More information about the Tutor mailing list