
Serhiy Storchaka wrote:
From my C++ and Java experience, hashtable-based containers are much more useful than tree-based containers. They are more compact and fast. In most cases the only reason of using sorted container is supporting deterministic iteration order, but often it is enough to sort data only for output.
The concern is not constant speed factor but asymptotic complexity. Inserting or deleting in a tree-based sorted collection is O(log n), but lists are O(n): the bisect is O(log n) but moving trailing list elements is O(n). It's memmove fast but dominates as n gets large.
There are many algorithms that are hard to implement as fast as they can be without a tree. Something like rolling median would be a good test case, e.g. "for a list N containing numbers, produce a list M of length len(N) - 1 where M[i] == statistics.median(N[:i+1])". I think the best you can do with the standard library and without writing a balanced binary tree from scratch is O(n²) (if you can beat this I'd like to know how!). With a tree-based sorted list it's O(n log n).
Dicts and sets cannot maintain an arbitrary order, except that OrderedDict.move_to_end() very occasionally turns out to be all you need (and even then, I find it tricky to reason about and apply correctly in an algorithm: it requires you to maintain an invariant, where SortedList maintains the sorted invariant for you).
There is no such need of tree-based implementation. It is not needed in Python core, and is not needed for most users. So it is good to keep it in third-party libraries. Maintaining it in the stdlib has a cost and the benefit/cost value would be low.
The situation where this bites my firm the most is in interviews. This is exactly the situation where we're asking candidates to demonstrate their CS knowledge and produce an algorithm that has low asymptotic complexity.
We use various services including HackerRank, which provide a sandboxed Python environment that cannot use PyPI. There are quite a few services in this kind of Educational+Challenge space including some that execute Python in the browser using Brython etc.
My team believes that candidates with a Java background find the challenge easier than candidates with a Python background, because they just use NavigableSet and move on. I find that kind of embarrassing. It requires us to apologise for Python's deficiency and fudge how we score the question, which makes it hard to know we're providing a level playing field.
(I would argue that we should not ask those kinds of questions or not use these sandboxed interview environments but institutional friction makes that magnitude of change very difficult.)
Putting something in the standard library means that it will, in due course, appear in HackerRank and other services we use, and we can expect candidates to be familiar with it and incorporate it in their solutions.