DSU pattern (was Re: Trouble sorting lists (unicode/locale related?))
Peter Otten
__peter__ at web.de
Tue Sep 23 04:27:09 EDT 2003
Duncan Booth wrote:
> Note that if anyone proposes this seriously, it should generate a 3-tuple
> (mapping(item), index, item) rather than the 2-tuple you suggest.
>
> This is because the mapping function could reasonably be used to impose an
> ordering on objects that have no natural order, so you need to be sure
> that the comparison never falls through the the original object even where
> the mapping compares equal. It also has the side effect of making the sort
> stable (although if stability is a goal you might want another option to
> reverse the sort which would use '-index' as the second element and call
> .reverse() on the result).
So my demo implementation was faulty :-(
Let me try again:
def sort(self, cmpfunc=None, mapfunc=None):
if mapfunc:
list.sort(self, lambda x, y: cmp(mapfunc(x), mapfunc(y)))
else:
list.sort(self, cmpfunc)
Seriously, if sort() were to be thus extended, the dsu pattern would rather
act as a performance enhancement under the hood. I prefer plain C
struct {
PyObject *decorator;
int index; /* iff stability is not guaranteed by the sort algorithm */
PyObject *data;
}
over n-tuples and would hope to reuse the sorting infrastructure already
there in listobject.c to compare only the decorators. I am not familiar
with the C source, so be gentle if that is not feasible.
> FWIW, I think something like this belongs in the standard library rather
> than as a method on lists or a new builtin.
If you think of the comparison as a two-step process,
(1) extract or calculate an attribute
(2) wrap it into a comparison function
consider the following use cases:
(a) lambda x, y: cmp(y, x)
(b) lambda x, y: cmp(x.attr.lower(), y.attr.lower())
All code following the latter pattern could be rewritten (more clearly, I
think) as
alist.sort(mapfunc=lambda x: attr.lower())
and would automatically benefit from dsu (which could even be turned off
transparently for the client code for very large lists, where the memory
footprint becomes a problem).
The litmus test whether to put it into a utility function or into an already
existing method that needs only a minor and completely backwards-compatible
interface change would then be:
Are (b)-style comparison functions nearly as or more common than (a)-style
functions.
Peter
PS: You and Alex Martelli dismissed my somewhat unfocused idea faster than I
could sort it out. I hope you will find the above useful, though.
More information about the Python-list
mailing list