[Python-Dev] Drastically improving list.sort() for lists of strings/ints

Elliot Gorokhovsky elliot.gorokhovsky at gmail.com
Sun Sep 11 14:43:54 EDT 2016


The sort can be made stable, but that requires the allocation of an
equal-sized auxiliary array. To quote from the paper: "Both list-based and
two-array sorts entail Θ(n) space overhead. That overhead shrinks to
Θ(logn) in American flag sort, which, like quicksort, trades off stability
for space efficiency." So there are two options: follow C++ in providing a
stable and unstable sort, or just use stable radix sort at the cost of
allocating a scratch array. I understand why the first approach is
essentially impossible, since it could break code written under the
assumption that list.sort() is stable. But I think that in Python, since
the list just holds pointers to objects instead of objects themselves,
being in-place isn't that important: we're missing the cache all the time
anyway since our objects are stored all over the place in memory. So I
suppose the thing to do is to benchmark stable radix sort against timsort
and see if it's still worth it. Again, I really don't think the auxiliary
array would make that much of a difference. Note that in timsort we also
use an auxiliary array.

On Sun, Sep 11, 2016, 12:15 PM Mark Dickinson <dickinsm at gmail.com> wrote:

> > I am interested in making a non-trivial improvement to list.sort() [...]
>
> Would your proposed new sorting algorithm be stable? The language
> currently guarantees stability for `list.sort` and `sorted`.
>
> --
> Mark
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160911/f816d276/attachment.html>


More information about the Python-Dev mailing list