Long time lurker, first time poster. I think there may be multiple discussions happening here so I wanted to highlight a competing module.

I think blist.blist is an excellent data type with a lot of value. But the performance graphs can be a bit misleading if you think they apply to the sortedlist, sorteddict, and sortedset types. In those scenarios, I do not believe blist is the best choice.

The SortedContainers module (https://pypi.python.org/pypi/sortedcontainers) provides SortedList, SortedDict, and SortedSet data types. It is implemented in pure-Python, has 100% coverage and hours of stress testing. The API implemented is very close to blist's and a lot of effort has been put into documentation (http://www.grantjenks.com/docs/sortedcontainers/). Furthermore, the data types provided are often faster than their blist counterparts.

Extensive performance comparisons against other implementations of sorted list, sorted dict, and sorted set types are documented (http://www.grantjenks.com/docs/sortedcontainers/performance.html) along with a comparison of runtimes and load-factors (similar to blist, sortedcontainers uses a modified B-tree but with a tunable node size.) For SortedList, an analysis of use cases on Github has been made. These describe five use cases with performance analysis (http://www.grantjenks.com/docs/sortedcontainers/performance-workload.html)

Disclaimer: I am the author of the Python SortedContainers project. Feedback welcome.

Grant Jenks

On Sun, Sep 21, 2014 at 3:04 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sun, 21 Sep 2014 05:18:38 +0000
David Wilson <dw+python-ideas@hmmz.org> wrote:
> On Sun, Sep 21, 2014 at 02:37:39PM +1000, Steven D'Aprano wrote:
> > > In 2007, this PEP was created that suggested integrating Daniel Stutzbach's
> > > blist into Python: http://legacy.python.org/dev/peps/pep-3128/
> > >
> > > The PEP was rejected, but Raymond Hettinger made a note that "after a few
> > > months, I intend to poll comp.lang.python for BList success stories. If
> > > they exist, then I have no problem with inclusion in the collections
> > > module."
> >
> > I have not used blist, but I have no objection to it becoming a
> > collections type if the author agrees.
> It seems unsettling that we'd consider adding another special use
> collection to the stdlib, when more widely applicable generalizations of
> it are missing. In blist's case, it can (mostly) be trivially
> reimplemented using an ordered map, which the standard library lacks.

But can it be *efficiently* reimplemented using an ordered map?
blist uses chunks of 128 pointers, which makes it almost as memory
efficient as a standard list. A traditional ordered map (I suppose you
mean some kind of balanced tree) would generally show much more
overhead, if only because it has to store the keys explicitly. And also,
you now have to do costly key comparisons every time you do a lookup
("costly" because they go through the object layer and its
indirections, as opposed to simple integer arithmetic).

> Coming from this perspective, I'd prefer that further additions were
> limited to clean and far better understood structures. In this light,
> could we perhaps instead discuss the merits of a collections.Tree,
> collections.SortedDict or similar?

This sounds like a false dilemma. We could have a collections.blist
*and* a collections.Tree.



Python-ideas mailing list
Code of Conduct: http://python.org/psf/codeofconduct/