[Python-Dev] collections.sortedtree

Dan Stromberg drsalists at gmail.com
Sat Mar 29 02:45:35 CET 2014


On Fri, Mar 28, 2014 at 5:48 PM, Daniel Stutzbach <stutzbach at google.com> wrote:
> On Fri, Mar 28, 2014 at 2:54 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
>>
>> The blist implementation, which I have taken a quick glance at,
>>
>> buys cache locality at the price of block copying; I have no data to
>> decide if the tradeoff is a good one.
>
>
> There's actually a compile-time parameter so that you can tune the tradeoff
> by adjusting how wide each of blist's nodes are. Like most tradeoffs,
> whether it's good or bad depends on what you're trying to do.  RedBlack
> trees are very similar to a B-tree with a node-width of 4:
> http://en.wikipedia.org/wiki/Red%E2%80%93black_tree#Analogy_to_B-trees_of_order_4
>
> blist's sorteddict is written in Python on top of the blist type (which is
> written in C).  Because of the Python layer, it's slower by a constant
> factor compared to pure-C implementations in microbenchmarks.

It grieves me a bit to say this, and blist and blist's sorteddict are
probably a good tool for an appropriate job, but blist's sorteddict
type is quite a bit slower than several pure python implementations of
dictionaries with ordered keys.

In my testing blist.sorteddict was dead last for random keys, and
wasn't last but was still significantly underperforming for sequential
keys (outperforming only binary tree and scapegoat tree, behind all
others):

http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03-18/


More information about the Python-Dev mailing list