[Python-Dev] decorate-sort-undecorate

Tim Peters tim.one at comcast.net
Mon Oct 13 19:14:17 EDT 2003


[Raymond Hettinger]
> ...
> # The decorator form works especially well with mappings so that
> # database keys can be sorted by any field.

Unfortunately, the case of sorting database records by keys is one where
it's most important to use a different form of DSU to avoid outrageous
runtime.  If you sort

    [(key1, record1), (key2, record2), ...]

then whenever two keys compare equal, tuple comparison goes on to compare
the records too, and general-object comparison can be arbitrarily expensive.
The right way to do this with DSU now is to sort:

    [(key1, 0, record1), (key2, 1, record2), ...]

instead.  Then ties on the keys (which are very common when sorting a
database) are always resolved quickly by comparing two distinct ints.

This is the same way used to force a stable sort in pre-2.3 Python, and
remains the best thing for non-experts to do by default.  Indeed, if it's
not done, then despite that the 2.3 sort *is* stable, sorting on

    [(key1, record1), (key2, record2), ...]

is *not* stable wrt just sorting on the keys.  DSU actually changes the
natural result unless the original indices are inserted after "the keys".

Alas, in decorator=function syntax, there's no clear way in general to write
function so that it knows the index of the object passed to it.

Internally, I suppose the sort routine could sort a temp list consisting of
only the keys, mirroring the relative data movement in the real list too.
That should blow the cache all to hell <wink>.




More information about the Python-Dev mailing list