Q: sort's key and cmp parameters

Raymond Hettinger python at rcn.com
Mon Oct 5 15:48:12 EDT 2009


> > So, it looks like the relevant comparison values can be stored in
> > nested lists:
>
> >    list_of_lists = \
> >      [[1, [3, [], []],
> >           [7, [5, [], []], []]],
> >       [19, [23, [], []],
> >           []],
> >      ]
>
> Now you've copied all the data out of the original tree, which is even
> worse than putting a class wrapper around it.  The comparison remains
> (asymptotically) as expensive as before, too.

FWIW, you could also just flatten it to:  [(1,3,7,5), (19,23)].

The point is that the tree comparisons you presented have a
predetermined
order of values to compare, so they either be recast a flattened list
of comparison values or the tree itself can be cast in a list of lists
form.
Either way, the O(n log n) step of doing the actual comparisons runs
much
faster than if you coded a recursive cmp function written in pure
python.


>  Say you want to change the numeric comparisons so that
> even numbers always sort before odd numbers, ie.
>     -4 < -2 < 0 < 2 < 4 < ... < -999 < ... -1 < 1 < 3 ...

This is too easy:

    >>> s = [-2, 4, 2, -1, -3, 1, -4, 0, 3]
    >>> s.sort()
    >>> s.sort(key=lambda x: x%2)
    >>> s
    [-4, -2, 0, 2, 4, -3, -1, 1, 3]

The use cases you've presented so far aren't convincing,
but I'm sure that if you keep making-up random, weird use cases,
there will be one where a cmp function is a better approach.


> I think there are reasonable orderings on the trees themselves that
> can't possibly be embedded in the standard python comparison order.
> I might be able to construct an example using ordinal diagrams from
> proof theory.

As long as the values of a tree get compared in a predetermined order,
there will always be a flattened list equivalent that works faster
using a key function.  If you're going to concoct something isn't
easily transformable to a key function, I think your best bet is to
create a comparison where the trees have an interaction other than
comparing values at identical node positions; otherwise, the tree
shape hasn't made the problem any more difficult than sorting a list
of successive values.  The concocted comparison function needs to
exploit some aspect of the tree shape than can't be precomputed
in a key function (perhaps involving some complex interaction between
the two compared items like a tree merge or somesuch).

Instead of trying to create a hard comparison from first principles,
there may already be some research on the subject in the SQL world.
It seems to me that SQL's ORDER BY clauses are essentially just
key functions, so maybe there are known cases where a query cannot
be sorted with just the tools provided by SQL.


Raymond

P.S.  I accept than you hate sorting in Py3.x.  There's no need to
convince me of that.  I was just curious about your one real-world
use case and whether I could find a straight-forward key function
that would get the job done.



More information about the Python-list mailing list