[Python-Dev] decorate-sort-undecorate
Tim Peters
tim.one at comcast.net
Mon Oct 13 19:14:17 EDT 2003
[Raymond Hettinger]
> ...
> # The decorator form works especially well with mappings so that
> # database keys can be sorted by any field.
Unfortunately, the case of sorting database records by keys is one where
it's most important to use a different form of DSU to avoid outrageous
runtime. If you sort
[(key1, record1), (key2, record2), ...]
then whenever two keys compare equal, tuple comparison goes on to compare
the records too, and general-object comparison can be arbitrarily expensive.
The right way to do this with DSU now is to sort:
[(key1, 0, record1), (key2, 1, record2), ...]
instead. Then ties on the keys (which are very common when sorting a
database) are always resolved quickly by comparing two distinct ints.
This is the same way used to force a stable sort in pre-2.3 Python, and
remains the best thing for non-experts to do by default. Indeed, if it's
not done, then despite that the 2.3 sort *is* stable, sorting on
[(key1, record1), (key2, record2), ...]
is *not* stable wrt just sorting on the keys. DSU actually changes the
natural result unless the original indices are inserted after "the keys".
Alas, in decorator=function syntax, there's no clear way in general to write
function so that it knows the index of the object passed to it.
Internally, I suppose the sort routine could sort a temp list consisting of
only the keys, mirroring the relative data movement in the real list too.
That should blow the cache all to hell <wink>.
More information about the Python-Dev
mailing list