identity = lambda x: x -- a Pythonic idiom?

Marco Antoniotti marcoxa at cs.nyu.edu
Mon Nov 19 10:18:31 EST 2001


"Andrew Dalke" <dalke at dalkescientific.com> writes:

> Marco Antoniotti:
> >Within CL, it turns out that `identity' is useful in a number of
> >cases, especially when dealing with sequence operators.  E.g. the
> >`sort' function is really defined as
> >
> > sort <sequence> <predicate> &key (key #'identity)
> >
> >So that you can write things like
> >
> > * (sort '(2 3 4 1) #'<)
> >        '(1 2 3 4)
> >        * (sort '((1 3) (3 0) (33 9) (19 5)) #'< :key #'second)
> > ((3 0) (1 3) (19 5) (33 9))
> >
> >All in all a good thing.
> 
> I've been staring at this trying to figure out what it does,
> and I think I got it.  What threw me off, I think, was the
> '*'.  It's a prompt, and the indented one on the 3rd line is
> a misindent -- it should be 7 characters to the left, and there
> are actually two different  one-line expressions evaluated here.

I am sorry.  Yes, the '*' is the default CMUCL prompt, and I assume that
newsreader hell broke loose.

> 
> Then in the "sort <sequence> <predicate> &key (key #'identity)"
> line the '&key' says that field is 1) optional and 2) a keyword
> argument, and that if not given the identity function is used.
> Finally, the "#'" syntax is used to get a reference to a special
> symbol, and ':key' is how to specify a keyword argument.
> 
> Am I right so far?

Yes, almost.  The #' is used as a way to get the actual function out
of a symbol.  I could have written the above as

cl-prompt> (sort '(2 3 4 1) '<)
(1 2 3 4)

cl-prompt> (sort '((1 3) (3 0) (33 9) (19 5)) '< :key 'second)
((3 0) (1 3) (19 5) (33 9))

But then the compiler would have had to either do more work or compile
in a run-time lookup of the functions named by the symbols `cl:<' and
`cl:second'.

&key is used to introduce keyword parameters.  Most sequence
operations in CL have a keyword parameter called :key.  (Other nice
argument list markers are &optional and &rest.)
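
For Python readers, here is a minimal sketch of a sort with the same
shape as the CL signature above: a two-argument "less than" predicate
plus an optional key that defaults to identity.  The names `identity'
and `sort_by' are made up for illustration and are not part of the
standard library; the wrapper class is just one possible way to hand a
predicate to sorted(), which only needs `<' between items.

```python
import operator


def identity(x):
    """Return the argument unchanged (used as the default key)."""
    return x


def sort_by(sequence, predicate, key=identity):
    """Return a new sorted list, comparing key(item) values with a strict
    'less than' style predicate, loosely mirroring CL's sort."""

    class _Wrapped:
        # sorted() only calls __lt__, so wrap each item and delegate the
        # comparison to the caller's predicate on the extracted keys.
        __slots__ = ("item",)

        def __init__(self, item):
            self.item = item

        def __lt__(self, other):
            return predicate(key(self.item), key(other.item))

    return [w.item for w in sorted(map(_Wrapped, sequence))]


plain = sort_by([2, 3, 4, 1], operator.lt)
by_second = sort_by([(1, 3), (3, 0), (33, 9), (19, 5)], operator.lt,
                    key=lambda pair: pair[1])
print(plain)      # [1, 2, 3, 4]
print(by_second)  # [(3, 0), (1, 3), (19, 5), (33, 9)]
```

When no key is given, the predicate sees the items themselves, which
is exactly the job `identity' does in the CL definition.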

> Python only uses one function for this, which means the field
> extraction and data comparison must be composed into one function,
> 
> >>> a = [(1, 3), (3, 0), (33, 9), (19, 5)]
> >>> a.sort(lambda x, y: cmp(x[1], y[1]))
> >>> a
> [(3, 0), (1, 3), (19, 5), (33, 9)]
> >>>

In CL you can have it both ways.  I could have written the above as

cl-prompt> (sort '((1 3) (3 0) (33 9) (19 5))
                 (lambda (x y) (< (second x) (second y))))
((3 0) (1 3) (19 5) (33 9))

It is just a nice convenience.  Where it comes in handy is in things
like the following.

Define a class with a slot that contains a vector

(defclass just-a-test ()
  ((v :accessor v-of :initarg :v)))

Now suppose that you have a bunch of instances

(defvar v1 (make-instance 'just-a-test :v (vector 1 2 3 4)))
(defvar v2 (make-instance 'just-a-test :v (vector 2 -8 4 6)))
...
(defvar vn (make-instance 'just-a-test :v (vector 1 9 -2 9)))

(defvar vs (vector v1 v2 ... vn)) ; The ... are not part of CL.

... and that you want to sort them based on the i-th element of the
vector in the slot.

(loop for i from 0 below 4
      ;; SORT is destructive, so sort a fresh copy on each iteration.
      collect (sort (copy-seq vs) #'<
                    :key (lambda (the-v) (aref (v-of the-v) i))))

This returns a list of four sorted vectors, one for each index i
(assuming all the slot vectors have length 4).  Of course you could
have folded the element access into the test and left out the :key
parameter (it would have defaulted to `identity', which is where all
of this started).  I feel that in this case it is clearer to write out
the :key parameter.
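
A rough Python counterpart of the loop above, as a sketch only: the
class name `JustATest' and its attribute `v' are hypothetical
stand-ins for the CL class and slot, and only the three vectors
actually shown above are used.  It relies on the key argument of
sorted(), which returns a new list and leaves the input alone.

```python
class JustATest:
    """Hypothetical stand-in for the CL class: one slot holding a vector."""

    def __init__(self, v):
        self.v = list(v)


vs = [JustATest([1, 2, 3, 4]),
      JustATest([2, -8, 4, 6]),
      JustATest([1, 9, -2, 9])]

# One sorted copy per index i; the key extracts the i-th element of the
# vector held by each object, and vs itself is never reordered.
sorted_by_index = [sorted(vs, key=lambda obj: obj.v[i]) for i in range(4)]

for i, ordering in enumerate(sorted_by_index):
    print(i, [obj.v for obj in ordering])
```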

> The sort function is (was) also defined in terms of C-style
> three-way sorting.  It's been changed so that only '<' comparison
> is needed, but I don't know how that change affects the comparison
> function used.
> 
> In CL it looks like you'll have to do the same sort of thing
> for data values which don't have a 1D sorting, like
> 
> >>> a = [(1, 3), (100, 5), (33, 9), (19, 5)]
> >>> a.sort(lambda x, y: cmp(x[1], y[1]) or cmp(x[0], y[0]))
> >>> a
> [(1, 3), (19, 5), (100, 5), (33, 9)]
> >>>

Of course.

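(sort (list '(1 3) '(100 5) '(33 9) '(19 5))
      ;; SORT wants a two-way "less than" test, so spell out the tie-break:
      ;; second element first, then first element, as in the Python above.
      (lambda (x y)
        (or (< (second x) (second y))
            (and (= (second x) (second y))
                 (< (first x) (first y))))))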
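
For reference, the same two-level ordering can be sketched in
present-day Python in two ways.  This assumes Python 3, where the
builtin cmp is gone, so the `cmp' helper below is defined locally for
illustration; functools.cmp_to_key adapts a three-way comparator to
the key interface.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Three-way compare returning -1, 0 or 1 (local helper, not a builtin)."""
    return (a > b) - (a < b)


a = [(1, 3), (100, 5), (33, 9), (19, 5)]

# Comparator style, as in the quoted message: second field, then first.
by_cmp = sorted(a, key=cmp_to_key(
    lambda x, y: cmp(x[1], y[1]) or cmp(x[0], y[0])))

# Key style: tuples compare field by field, so no comparator is needed.
by_key = sorted(a, key=lambda x: (x[1], x[0]))

print(by_cmp)  # [(1, 3), (19, 5), (100, 5), (33, 9)]
print(by_key)  # [(1, 3), (19, 5), (100, 5), (33, 9)]
```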

Cheers

-- 
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group        tel. +1 - 212 - 998 3488
719 Broadway 12th Floor                 fax  +1 - 212 - 995 4122
New York, NY 10003, USA                 http://bioinformatics.cat.nyu.edu
                    "Hello New York! We'll do what we can!"
                           Bill Murray in `Ghostbusters'.


