[Python-ideas] Sorted lists
Xavier Combelle
xavier.combelle at gmail.com
Mon Apr 8 04:33:15 EDT 2019
looks for me that means that all subclass of list has to maintain the
__isorted__ invariant,
looks like not a backward compatible modification and quite problematic
in case of the invariant broken.
So if I'm correct the design is brittle.
Moreover, that solve only the strict subset of sort case where there is
no key function, in ascendant order, ....
If one want to make a subclass of list having this properties, that
don't look hard to do and far more safer.
(of course the problem is that there is two similar list classes and the
best one is not builtin)
Le 08/04/2019 à 04:32, Steven D'Aprano a écrit :
> There are quite a few important algorithms which require lists to be
> sorted. For example, the bisect module, and for statistics median and
> other quantiles.
>
> Sorting a list is potentially expensive: while Timsort is very
> efficient, it is still ultimately an O(N log N) algorithm which limits
> how efficient it can be. Checking whether a list is sorted is O(N).
>
> What if we could check that lists were sorted in constant time?
>
> Proposal: let's give lists a dunder flag, __issorted__, that tracks
> whether the list is definitely sorted or not:
>
> - Empty lists, or lists with a single item, are created with
> __issorted__ = True; lists with two or more items are created
> with the flag set to False.
>
> - Appending or inserting items sets the flag to False.
>
> - Deleting or popping items doesn't change the flag.
>
> - Reversing the list doesn't change the flag.
>
> - Sorting it sets the flag to True. (The sort method should NOT
> assume the list is sorted just because the flag is set.)
>
> Functions that require the list to be sorted can use the flag as a
> quick check:
>
> if not alist.__issorted__:
> alist.sort()
> ...
>
> The flag will be writable, so that functions such as those in
> bisect can mark that they have kept the sorted invariant:
>
>
> bisect.insort(alist, x)
> assert alist.__issorted__
>
>
> Being writable, the flag is advisory, not a guarantee, and "consenting
> adults" applies. People can misuse the flag:
>
> alist = [1, 4, 2, 0, 5]
> alist.__issorted__ = True
>
> but then they have nobody to blame but themselves if they shoot
> themselves in the foot. That's no worse than the situation we have now,
> were you might pass an unsorted list to bisect.
>
> The flag doesn't guarantee that the list is sorted the way you want
> (e.g. biggest to smallest, by some key, etc) only that it has been
> sorted. Its up to the user to ensure they sort it the right way:
>
> # Don't do this and expect it to work!
> alist.sort(key=random.random)
> bisect.insort(alist, 1)
>
>
> If you really want to be sure about the state of the list, you have to
> make a copy and sort it. But that's no different from the situation
> right now. But for those willing to assume "consenting adults", you
> might trust the flag and avoid sorting.
>
> Downsides:
>
> - Every list grows an extra attribute; however, given that lists are
> already quite big data structures and are often over-allocated, I
> don't think this will matter much.
>
> - insert(), append(), extend(), __setitem__() will be a tiny bit
> slower due to the need to set the flag.
>
>
>
> Thoughts?
>
>
>
>
More information about the Python-ideas
mailing list