[Python-ideas] Multiple level sorting in python where the order of some levels may or may not be reversed
Tim Peters
tim.peters at gmail.com
Sun Oct 16 02:01:03 EDT 2016
[Alireza Rafiei <alireza.rafiei94 at gmail.com>]
> I have a list called count_list which contains tuples like below:
>
> > [('bridge', 2), ('fair', 1), ('lady', 1), ('is', 2), ('down', 4),
> > ('london', 2), ('falling', 4), ('my', 1)]
>
>
> I want to sort it based on the second parameter in descending order and the
> tuples with the same second parameter must be sorted based on their first
> parameter, in alphabetically ascending order. So the ideal output is:
>
> > [('down', 4), ('falling', 4), ('bridge', 2), ('is', 2), ('london', 2),
> > ('fair', 1), ('lady', 1), ('my', 1)]
I'd like to suggest doing something simple instead, such that
data = [('bridge', 2), ('fair', 1), ('lady', 1),
('is', 2), ('down', 4), ('london', 2),
('falling', 4), ('my', 1)]
from operator import itemgetter
multisort(data, [# primary key is 2nd element, reversed
(itemgetter(1), True),
# secondary key is 1st element, natural
(itemgetter(0), False)])
import pprint
pprint.pprint(data)
prints the output you want.
It's actually very easy to do this, but the cost is that it requires
doing a full sort for _each_ field you want to sort on. Because
Python's sort is stable, it's sufficient to merely sort on the
least-significant key first, and then sort again on each key in turn
through the most-significant. There's another subtlety that makes
this work:
> ...
> The main problem is that reverse argument takes only a boolean and applies
> to the whole list after sorting in finished.
Luckily, that's not how `reverse` actually works. It _first_reverses
the list, then sorts, and then reverses the list again. The result is
that items that compare equal _retain_ their original order, where
just reversing the sorted list would invert their order. That's why,
in your example above, after first sorting on the string component in
natural order we get (in part)
[[('down', 4), ('falling', 4), ...]
and then reverse-sorting on the integer portion _leaves_ those tuples
in the same order. That's essential for this decomposition of the
problem to work.
With that background, here's the implementation:
def multisort(xs, specs):
for key, reverse in reversed(specs):
xs.sort(key=key, reverse=reverse)
That's all it takes! And it accepts any number of items in `specs`.
Before you worry that it's "too slow", time it on real test data.
`.sort()` is pretty zippy, and this simple approach allows using
simple key functions. More importantly, it's much easier on your
brain ;-)
More information about the Python-ideas
mailing list