On Tue, Nov 9, 2021 at 9:00 PM Steven D'Aprano <steve@pearwood.info> wrote:
Sorting dicts has been discussed on the Python-Ideas mailing list, it is
too hard and expensive to justify for the limited use-cases for it. If
you want to sort a dict, you are best to sort the dict's keys, then
create a new dict. Or possibly use a dedicated Sorted Mapping type like
a red-black tree or similar.

There are several implementations of key-order sorted dicts available in Python, and more than a few of them are well tested.  I am the maintainer of one of them: https://pypi.org/project/treap/ which I ported from Java.  I also did a performance comparison among several of them a few years back.

Red-black trees are popular, perhaps especially among Java developers, but I've never understood why.  They aren't the fastest.  Among traditional tree-based key-sorted dicts, treaps are frequently fastest.

However, SortedDict is not uncommonly faster than treaps - I believe this is because SortedDict is very good at maximizing locality of reference.  Traditional trees are almost always going to do an entire cache line hit for every node in large trees, even if those nodes are "next to each other" when sorted/traversed.  SortedDicts put keys that are nearly equal, in nearly the same part of memory - so multiple values can be retrieved with a single cache line hit.

Sorting a dict's keys inside a loop tends to give O(n^2) algorithms, and sometimes even O(n^2 * logn).  This is not good.  A traditional tree should give O(nlogn) algorithms under similar circumstances, and although my gut is telling me SortedDict is similar the presentation linked below suggests a different (though still better than sorting in a loop) runtime for SortedDict.

It's true that key-sorted dicts are not all that common.  Their most common use is probably for implementing finite caches - evictions can benefit from ordered keys.  However, we already have functools.lru_cache.

Here's my most recent performance comparison: https://stromberg.dnsalias.org/~strombrg/sorted-dictionary-comparison/Datastructure%20comparison.pdf

Here's a PyCon 2016 presentation about SortedContainers, that includes SortedDict:
https://www.youtube.com/watch?v=7z2Ki44Vs4E

I think it would make sense to include at least one key-sorted dict in CPython, and feel that SortedDict would not be a bad way to go.  Neither would treap.