Why list.sort() uses mergesort and not timsort?
As title. Is it faster for inplace sorting, or simply the implementation of list.sort() was done before the implementation of timsort?
On 06/06/2021 11.42, Marco Sulla wrote:
As title. Is it faster for inplace sorting, or simply the implementation of list.sort() was done before the implementation of timsort?
list.sort() uses timsort. What makes you think that Python uses mergesort? Tim Peters invented timsort for Python about twenty years ago. Tim a first generation Python core dev. Other languages like Java adopted timsort from Python later. Christian
On Sun, 6 Jun 2021 at 11:57, Christian Heimes <christian@python.org> wrote:
On 06/06/2021 11.42, Marco Sulla wrote:
As title. Is it faster for inplace sorting, or simply the implementation of list.sort() was done before the implementation of timsort?
list.sort() uses timsort. What makes you think that Python uses mergesort?
In listobject.c, in the comment above list_sort_impl, there's written "An adaptive, stable, natural mergesort". But now I see that after there is "See listsort.txt" where it's stated "This describes an adaptive, stable, natural mergesort, modestly called timsort".
Tim Peters invented timsort for Python about twenty years ago. Tim a first generation Python core dev. Other languages like Java adopted timsort from Python later.
I know. Thank you for the answer :)
Christian
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/P7XFRAU6... Code of Conduct: http://python.org/psf/codeofconduct/
On Sun, Jun 6, 2021 at 2:46 AM Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
As title. Is it faster for inplace sorting, or simply the implementation of list.sort() was done before the implementation of timsort?
As you already know, timsort is pretty close to merge sort. Timsort added the innovation of making mergesort in-place, plus a little (though already common) O(*n^2) sorting for small sublists. I've got a comparison of sort algorithms in both Cython and Pure Python (your choice) at: https://stromberg.dnsalias.org/~strombrg/sort-comparison/ ...including a version of timsort that is in Cython or Pure Python. HTH.
On Sun, Jun 06, 2021 at 04:07:57PM -0700, Dan Stromberg wrote:
I've got a comparison of sort algorithms in both Cython and Pure Python (your choice) at: https://stromberg.dnsalias.org/~strombrg/sort-comparison/ ...including a version of timsort that is in Cython or Pure Python.
Interesting! timsort get's to near-linear in your benchmark. -- Senthil
On Mon, 7 Jun 2021 06:49:24 -0700 Senthil Kumaran <senthil@python.org> wrote:
On Sun, Jun 06, 2021 at 04:07:57PM -0700, Dan Stromberg wrote:
I've got a comparison of sort algorithms in both Cython and Pure Python (your choice) at: https://stromberg.dnsalias.org/~strombrg/sort-comparison/ ...including a version of timsort that is in Cython or Pure Python.
Interesting! timsort get's to near-linear in your benchmark.
O(n log n) always looks linear when n is small enough... Regards Antoine.
On Sun, Jun 6, 2021 at 7:09 PM Dan Stromberg <drsalists@gmail.com> wrote:
I've got a comparison of sort algorithms in both Cython and Pure Python (your choice) at: https://stromberg.dnsalias.org/~strombrg/sort-comparison/ ...including a version of timsort that is in Cython or Pure Python.
Thanks for sharing the graphs. I found the performance of radix sort in particular to be interesting to see mapped out visually, and have never heard of "shellsort" prior to now. :)
[Dan Stromberg <drsalists@gmail.com>]
... Timsort added the innovation of making mergesort in-place, plus a little (though already common) O(*n^2) sorting for small sublists.
Actually, both were already very common in mergesorts. "timsort" is much more a work of engineering than of insight ;-) That is, it combined many pre-existing ideas, but pushed them hard, and got them to work smoothly together. The most novel part is "galloping" to vastly cut the number of comparisons needed in many real-world cases of partially ordered inputs. I took that idea from (as briefly explained in listsort.txt) papers seeking to speed real-life unions and intersections of sorted lists, presumably in database implementations. I later discovered that the idea had already been applied to a mergesort, as noted in a paper by McIlroy (cited in listsort.txt) - but the paper didn't give any details, and best I can tell he never published his code. WIthout that, timsort wouldn't have been worth the bother of writing - it was generally no faster than Python's prior sort implementation on randomly-ordered inputs, and is quite a large pile of code. But many cases of real-world inputs do have significant pre-existing order of some kind, where even a "perfect" quicksort (always picks _the_ median as the partition pivot) remains O(n log n) but timsort O(n). Curiously, though, it was the only implementation ever tried of a mergesort-based algorithm that wasn't notably _slower_ on randomly-ordered inputs than Python's prior "samplesort" algorithm (like quicksort but with a very high-quality guess for the median element based on recursively sample-sorting a largish random sample).
On Mon, Jun 7, 2021 at 11:20 AM Tim Peters <tim.peters@gmail.com> wrote:
[Dan Stromberg <drsalists@gmail.com>]
... Timsort added the innovation of making mergesort in-place, plus a little (though already common) O(*n^2) sorting for small sublists.
Actually, both were already very common in mergesorts. "timsort" is much more a work of engineering than of insight ;-) That is, it combined many pre-existing ideas, but pushed them hard, and got them to work smoothly together.
The most novel part is "galloping" to vastly cut the number of comparisons needed in many real-world cases of partially ordered inputs.
Thanks Tim. Python itself is a great language, but in part it was your good-natured and well-reasoned posts about Python that got me into Python instead of OCaml.
participants (7)
-
Antoine Pitrou
-
Christian Heimes
-
Dan Stromberg
-
Kyle Stanley
-
Marco Sulla
-
Senthil Kumaran
-
Tim Peters