identity = lambda x: x -- a Pythonic idiom?
Marco Antoniotti
marcoxa at cs.nyu.edu
Mon Nov 19 16:18:31 CET 2001
"Andrew Dalke" <dalke at dalkescientific.com> writes:
> Marco Antoniotti:
> >Within CL, it turns out that `identity' is useful in a number of
> >cases, especially when dealing with sequence operators. E.g. the
> >`sort' function is really defined as
> >
> > sort <sequence> <predicate> &key (key #'identity)
> >
> >So that you can write things like
> >
> > * (sort '(2 3 4 1) #'<)
> > (1 2 3 4)
> > * (sort '((1 3) (3 0) (33 9) (19 5)) #'< :key #'second)
> > ((3 0) (1 3) (19 5) (33 9))
> >
> >All in all a good thing.
>
> I've been staring at this trying to figure out what it does,
> and I think I got it. What threw me off, I think, was the
> '*'. It's a prompt, and the indented one on the 3rd line is
> a misindent -- it should be 7 characters to the left, and there
> are actually two different one-line expressions evaluated here.
I am sorry. Yes, the '*' is the default CMUCL prompt, and I assume that
newsreader hell broke loose.
>
> Then in the "sort <sequence> <predicate> &key (key #'identity)"
> line the '&key' says that field is 1) optional and 2) a keyword
> argument, and that if not given the identity function is used.
> Finally, the "#'" syntax is used to get a reference to a special
> symbol, and ':key' is how to specify a keyword argument.
>
> Am I right so far?
Yes, almost. The #' syntax is shorthand for (function ...): it gets the
actual function object named by a symbol. I could have written the above as
cl-prompt> (sort '(2 3 4 1) '<)
(1 2 3 4)
cl-prompt> (sort '((1 3) (3 0) (33 9) (19 5)) '< :key 'second)
((3 0) (1 3) (19 5) (33 9))
But then the compiler would have had either to do more work or to compile
a run-time dereference of the symbols `cl:<' and `cl:second'.
&key is used to introduce keyword parameters. Most sequence
operations in CL have a keyword parameter called :key. (Other nice
argument list markers are &optional and &rest.)
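For Python readers, here is a rough sketch of the same signature shape: a
sort helper that takes a "<"-style predicate plus an optional keyword
argument that defaults to identity. The names `identity` and `sort_by` are
made up for illustration, and the sketch assumes Python 3 (functools.cmp_to_key);
it says nothing about how a CL implementation actually builds `sort'.

```python
import operator
from functools import cmp_to_key


def identity(x):
    """Return the argument unchanged; used as the default key."""
    return x


def sort_by(seq, less_than, key=identity):
    """Return a new list sorted with a two-way predicate applied to key(item)."""
    def compare(a, b):
        ka, kb = key(a), key(b)
        if less_than(ka, kb):
            return -1
        if less_than(kb, ka):
            return 1
        return 0
    return sorted(seq, key=cmp_to_key(compare))


print(sort_by([2, 3, 4, 1], operator.lt))
print(sort_by([(1, 3), (3, 0), (33, 9), (19, 5)], operator.lt,
              key=lambda pair: pair[1]))
```

When `key` is left out, the predicate sees the items themselves, which is
exactly the job `identity' does as the default in the CL lambda list.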
> Python only uses one function for this, which means the field
> extraction and data comparison must be composed into one function,
>
> >>> a = [(1, 3), (3, 0), (33, 9), (19, 5)]
> >>> a.sort(lambda x, y: cmp(x[1], y[1]))
> >>> a
> [(3, 0), (1, 3), (19, 5), (33, 9)]
> >>>
In CL you have it both ways. I could have written the above as
cl-prompt> (sort '((1 3) (3 0) (33 9) (19 5))
(lambda (x y) (< (second x) (second y))))
((3 0) (1 3) (19 5) (33 9))
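As a point of comparison, a minimal Python sketch of the same two styles.
This assumes Python 3, where `sorted` accepts a `key` argument and
functools.cmp_to_key adapts a comparison function; the helper name
`by_second` is mine.

```python
from functools import cmp_to_key
from operator import itemgetter

pairs = [(1, 3), (3, 0), (33, 9), (19, 5)]


# Style 1: one function does both the field extraction and the comparison.
def by_second(x, y):
    """Three-way comparison on the second field of each pair."""
    return (x[1] > y[1]) - (x[1] < y[1])


with_cmp = sorted(pairs, key=cmp_to_key(by_second))

# Style 2: the extraction is split out as a key function, and the
# default ordering on the extracted values does the comparing.
with_key = sorted(pairs, key=itemgetter(1))

print(with_cmp)
print(with_key)
```

Both calls give the same ordering; the second just keeps "what to look at"
separate from "how to compare".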
It is just a nice convenience. Where it comes in handy is in things like
the following.
Define a class with a slot that contains a vector:
(defclass just-a-test ()
((v :accessor v-of :initarg :v)))
Now suppose that you have a bunch of instances
(defvar v1 (make-instance 'just-a-test :v (vector 1 2 3 4)))
(defvar v2 (make-instance 'just-a-test :v (vector 2 -8 4 6)))
...
(defvar vn (make-instance 'just-a-test :v (vector 1 9 -2 9)))
(defvar vs (vector v1 v2 ... vn)) ; The ... are not part of CL.
... and that you want to sort them based on the i-th element of the
vector in the slot.
(loop for i from 0 below 4
      ;; SORT is destructive, so sort a fresh copy on each pass.
      collect (sort (copy-seq vs) #'< :key (lambda (the-v) (aref (v-of the-v) i))))
This returns a list of four sorted vectors, one per element position
(assuming all the inner vectors have length 4). Of course you could have
folded the element access into the test and left out the :key parameter
(it would have defaulted to `identity', which is where all of this started).
I feel that in this case it is clearer to write out the :key parameter.
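A rough Python rendering of the same idea, for comparison. The class name
`JustATest` and the variable `by_column` are made up, only the three
vectors actually shown above are used (the elided ones stay elided), and
Python 3's `sorted` with a `key` argument is assumed.

```python
class JustATest:
    """Hypothetical stand-in for the CL class: one attribute holding a vector."""

    def __init__(self, v):
        self.v = v


v1 = JustATest([1, 2, 3, 4])
v2 = JustATest([2, -8, 4, 6])
vn = JustATest([1, 9, -2, 9])
vs = [v1, v2, vn]

# One sorted list per element position. sorted() returns a new list each
# time, so vs itself is never reordered.
by_column = [sorted(vs, key=lambda obj, i=i: obj.v[i]) for i in range(4)]

for i, ordering in enumerate(by_column):
    print(i, [obj.v for obj in ordering])
```

The `i=i` default binds the current index into each lambda. It is not
strictly needed here, since every key function is used immediately, but it
keeps the sketch safe if the lambdas are ever stored and called later.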
> The sort function is (was) also defined in terms of C-style
> three-way sorting. It's been changed so that only '<' comparison
> is needed, but I don't know how that change affects the comparison
> function used.
>
> In CL it looks like you'll have to do the same sort of thing
> for data values which don't have a 1D sorting, like
>
> >>> a = [(1, 3), (100, 5), (33, 9), (19, 5)]
> >>> a.sort(lambda x, y: cmp(x[1], y[1]) or cmp(x[0], y[0]))
> >>> a
> [(1, 3), (19, 5), (100, 5), (33, 9)]
> >>>
Of course.
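(sort '((1 3) (100 5) (33 9) (19 5))
      ;; CL has no standard three-way `cmp', so spell the ordering out with <.
      (lambda (x y)
        (or (< (second x) (second y))
            (and (= (second x) (second y))
                 (< (first x) (first y))))))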
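For what it is worth, here is a sketch of the two-field ordering in Python,
written both as a comparison function and as a key. It assumes Python 3,
where the built-in `cmp` no longer exists, so a small `cmp` helper is
defined locally; the variable names are mine.

```python
from functools import cmp_to_key

a = [(1, 3), (100, 5), (33, 9), (19, 5)]


def cmp(x, y):
    """Local three-way comparison: -1, 0 or 1."""
    return (x > y) - (x < y)


# Comparison-function style: second field first, then first field on ties.
by_cmp = sorted(a, key=cmp_to_key(
    lambda x, y: cmp(x[1], y[1]) or cmp(x[0], y[0])))

# Key style: tuples compare element by element, so a tuple key expresses
# the same "second field, then first field" ordering.
by_key = sorted(a, key=lambda pair: (pair[1], pair[0]))

print(by_cmp)
print(by_key)
```

The tuple key works only because both fields sort ascending with the
default ordering; mixed directions or custom predicates still call for a
comparison function.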
Cheers
--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.