On Thu, Oct 13, 2016 at 1:36 PM, Марк Коренберг email@example.com wrote:
I mean mutable containers that are always sorted when iterating over them.
- SortedSet (sorted unique elements, implemented using (rb?)tree instead of hash)
- SortedList (sorted elements, the same as SortedSet, but without uniquiness constraint) - actually a (rb?)tree, not a list (i.e. not an array)
- SortedDict (sorted by key when interating) - like C++'s ordered_map
Can you share more about your use cases for these containers? What are you making?
Nick Coghlan gave an answer to this question on StackOverflow at http://stackoverflow.com/a/5958960/232571 The answer kind of boils down to "there should be one obvious way to do it" and existing Python features like lists, sorted, bisect, and heapq cover many use cases.
I wrote the answer that is now the second highest rated for that question. I've noticed that the upvotes have been accumulating at a slightly higher rate than Nick's answer. I think that reflects an increase in interest and maybe gradual tide change of opinion.
There are many implementations in the net, like:
That's my project. I am the primary developer of the SortedContainers project.
You may also be interested in the [SortedCollections](http://www.grantjenks.com/docs/sortedcollections/) module which builds atop SortedContainers with data types like ValueSortedDict and ItemSortedDict. Because it's pure-Python, SortedContainers offers a lot of opportunity for extension/customization. That's also made it easier for the API to adapt/expand over time.
I think it should be one standardized implementation of such containers in CPython.
Instead of trees, implementation may use SkipList structure, but this is just implementation details.
Such structres imply fast insertion and deletion, ability to iterate, and also memory efficiency.
I gave a talk at PyCon 2016 about Python Sorted Collections that's worth watching. The first third discusses about six different implementations with different strengths and weaknesses. The choice of data type is more than implementation details. One of the biggest issues is the various tradeoffs of data types like blists, rbtrees, etc.
I have been meaning to discuss sorted collections further with Raymond Hettinger (author of the Python collections module). We spoke after the PyCon talk and wanted to continue the conversation. But I had a busy summer and just a few weeks ago welcomed my first son into the world. So realistically that conversation won't happen until 2017.
I recommend to read thorough review articles written by Andrew Barnert:
One of Andrew Barnert's conclusions is that SortedContainers could not scale. I did a pretty rigorous performance analysis and benchmarking at http://www.grantjenks.com/docs/sortedcontainers/performance-scale.html Short answer: I scaled SortedContainers up through ten billion elements, well past the memory limits of most machines.