sorting with expensive compares?

Fri Dec 23 04:12:18 EST 2005

Dan Stromberg wrote:
> Python appears to have a good sort method, but when sorting array elements
> that are very large, and hence have very expensive compares, is there some
> sort of already-available sort function that will merge like elements into
> a chain, so that they won't have to be recompared as many times?

It's not really practical - if the list is unsorted, it's non-trivial
to determine how many times a given element is duplicated until you've
compared it with everything else. That is roughly an O(N*N/2) operation
whereas sorting typically is O(NlogN). This is why C++'s 'std::unique'
function only operates on sorted lists.

So instead, one alternative would be to use a comparison function that
takes the 2 objects and looks for the pair in a dictionary: if that
pair is not found, perform the normal comparison and store it in the
dictionary for next time, and if it is found, just return it. This way
the actual comparison is only done once for each pair.

Alternatively you might be able to produce a cheap comparison from the
expensive one if you can summarise the information in a simpler form.
Perhaps each sorting criterion can be represented as an integer, and
you can instead sort a list of lists containing integers.

-- 
Ben Sizer