[Tutor] lazily decorated sort

Steven D'Aprano steve at pearwood.info
Fri Sep 28 14:14:06 CEST 2012

On 11/09/12 21:44, Chris Smith wrote:
> Hi all,
> I'm wondering if anyone has seen or knows of a good way to do a lazily
> decorated sort. I was reading about how good the DSU (decorate, sort,
> undecorate) approach is but the problem that we are running into in
> SymPy is that we want to get by with a fast hash sort if possible, and
> only decorate to break ties *if necessary*. It's a pity to decorate
> with an expensive function if you don't need it but I don't know how
> to only decorate when there are ties.

Firstly, unless you intend supporting Python 2.3 or older, there is
(probably) no need for an explicit DSU approach. Instead, use the key
argument to sorted and list.sort.

I'm not sure I completely understand your question. If you know ahead of
time that you can avoid DSU, you can do this:

if condition:
     x = sorted(something)
     x = sorted(something, key=func)

But I imagine that's not what you mean. My guess is that you want the
sort to be sufficiently clever that instead of doing this:

decorate every item
compare decorated items, and sort as needed
undecorate every item

you want it to do this:

compare items, sorting as needed
if they are equal, compare decorated items (which will never tie???)

Am I close?

I think you could arrange that two ways:

1) A two-pass sort; first, sort undecorated, then scan looking for ties,
if you find any, sort again with a key function;

(This is safe, since sorting in Python is now guaranteed to be stable.)

2) or use a key function that does something like this:

class SmartComparator(object):
     def __init__(self, item):
         self.item = item
     def __cmp__(self, other):
         x = cmp(self.item, other.item)
         if x == 0:
             return cmp(self.decorate(self.item), self.decorate(other.item))
         return x
     def decorate(self, value):
         # expensive magic goes in here...

sorted(items, key=SmartComparator)

I think the second version is more promising, since it avoids a linear search
for ties.

You will need to check the documentation to see whether sorting relies on
__cmp__ or rich comparisons (__gt__, __lt__, __eq__, etc.).

If you need any help, don't hesitate to ask.

P.S. I just discovered sympy recently, it is awesome.


More information about the Tutor mailing list