[Python-3000] Two proposals for a new list-like type: one modest, one radical

Daniel Stutzbach daniel at stutzbachenterprises.com
Wed Apr 25 18:54:33 CEST 2007


On 4/25/07, Raymond Hettinger <python at rcn.com> wrote:
> In practice, the code for BList is somewhat complex, and its desirability and performance in actual applications is unproven.
> Fortunately, Py2.6 is still a long way off. My recommendation is that you release it right away as a third-party module and let the
> hordes of Pythonistas test it in battle. With the code base thoroughly exercised and some positive user feedback, it would be hard
> to say no to this going into Py2.6.

Will do.  What's the best way to release a module to make it easy for
people to find, download, and install?  (a pointer to a relevant
document would be great)

> How does BList's memory consumption compare with the current list over-allocation scheme?
>
> Does it take longer to instantiate an empty list?  What are the malloc/free patterns?

See these earlier messages in the thread:

http://mail.python.org/pipermail/python-3000/2007-April/006801.html
http://mail.python.org/pipermail/python-3000/2007-April/006804.html
http://mail.python.org/pipermail/python-3000/2007-April/006809.html

The short summary is that BList and the current list have very similar
best-case and worst-case memory consumption for a single list.  (If
copy-on-write comes into play, the BList is much more memory efficient
than a regular list.)
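
To illustrate what I mean by copy-on-write, here is a rough sketch of
the idea in pseudo-Python (illustrative only; the real code is in C
and differs in detail): a copy shares the original's nodes, and a node
is only duplicated when it is about to be modified while shared.

    # Purely illustrative copy-on-write sketch, not the actual BList code.
    # Nodes carry a reference count; "copying" the list just shares the
    # root, and a node is duplicated only when modified while shared.

    class Node(object):
        def __init__(self, children=(), leaf=True):
            self.children = list(children)
            self.leaf = leaf
            self.refcount = 1

    def copy_blist(root):
        root.refcount += 1   # O(1): the copy shares every node
        return root

    def prepare_write(node):
        # Call before mutating a node.  If it is shared, split off a
        # private duplicate (the caller re-links it into its own tree);
        # otherwise it is safe to mutate in place.
        if node.refcount > 1:
            node.refcount -= 1
            return Node(node.children, node.leaf)
        return node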

Malloc/free operations occur when the list is large and growing or
shrinking (approximately once for every 64 elements added or removed).
They occur more often than for list(), but if you're doing a lot of
growing and shrinking, the O(log n) vs. O(n) performance boost easily
pays for the extra memory operations.  All the blocks are the same
size.  The BList code keeps a small list of recently freed nodes to
recycle, which speeds things up in the common case where additions
and removals are interleaved.
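
To make the recycling idea concrete, here is a minimal sketch in
pseudo-Python with made-up names (again, the real code is in C and
differs in detail):

    # Minimal free-list sketch, purely illustrative.
    NODE_SIZE = 128    # every block is the same size
    MAX_FREE = 16      # keep at most this many freed nodes around

    _free_nodes = []   # recently freed nodes, reused before calling malloc

    class _Node(object):
        def __init__(self):
            self.children = [None] * NODE_SIZE
            self.num_children = 0

    def alloc_node():
        # Reuse a recently freed node when one is available.
        if _free_nodes:
            return _free_nodes.pop()
        return _Node()

    def free_node(node):
        # Hold on to a few nodes so alternating inserts and deletes
        # don't hit the allocator every time.
        if len(_free_nodes) < MAX_FREE:
            node.num_children = 0
            _free_nodes.append(node)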

> > In other words, for short lists,
> > a BList works just like Python's array-based list() type.  Thus, it
> > has the same good performance on small lists.
>
> While this is a good implementation idea (similar ideas are used to speed-up quicksort for instance), it makes the first listed
> advantage seem like a straw-man designed to mislead proposal reviewers into thinking that the blist structure is more efficient than
> it actually is.  Perhaps rewrite it as: "To keep the performance acceptable for short lists, btreelists fall back to the existing
> implementation."

I fear that you have misunderstood me, as the BList does not fall back
on the existing implementation.  I'm sorry that you may have gotten
the impression that I was attempting to mislead.  I admit openly that
the performance for small lists is not due to any fancy data
structure.  However, it's not array-like as a special case just for
small lists, either.

Each BList node contains an array of up to 128 children.  If the BList
node is a leaf, these children are pointers to items stored by the
user, otherwise they are pointers to other BList nodes.  If the list
contains 128 or fewer children, the BList can have one node which is
both the root and a leaf.  This node just has an array of the user's
items, so it ends up having the same performance as Python's
array-based list.
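
In rough pseudo-Python (illustrative only; the real implementation is
in C):

    LIMIT = 128   # maximum number of children per node

    class BListNode(object):
        def __init__(self, leaf=True):
            self.leaf = leaf
            self.children = []   # at most LIMIT entries
            # In a leaf, children holds the user's items directly.
            # In an interior node, children holds other BListNode objects.

        def __len__(self):
            if self.leaf:
                return len(self.children)
            return sum(len(child) for child in self.children)

    # A list of 128 or fewer items is a single node that is both root
    # and leaf, so indexing and appending work on a plain array, just
    # as they do for list().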

It is true that you end up with array operations when you have a small
list and as the base case for many operations implemented via
recursion.  In some sense that is similar to a QuickSort that uses
InsertionSort for n < threshold.  Unlike QuickSort, though, I couldn't
rewrite the BList *without* array operations as the base case, and
array operations also occur in the middle of the tree.  They're
fundamental, not a performance tweak.  So I don't think the
QuickSort-with-InsertionSort analogy holds.

How about if I include a brief up-front description of how it works,
such as: "The BList internally uses a tree structure where each node
contains an array of up to 128 children."  Then the first performance
bullet could be something like "just as fast as a Python list() when
the list is small, since BLists are also based on arrays"?

> FWIW, I prefer that the current list implementation remain in-place. Because what we have is simple, it is not hard to maintain, its
> performance characteristics are easily understood, and we have a straight-forward C-API that allows module writers to have fast,
> intuitive direct access to the structure.
>
> OTOH, if your btreelist goes into Py2.6's collections module and users flock to it instead of regular lists, then there may be some
> basis for further discussion.

This sounds like a good plan to me, though how does that square with
Guido's desire to avoid a proliferation of data types?
http://mail.python.org/pipermail/python-3000/2007-April/006774.html

> BTW, please offer-up a pure python version (like we do for the heapq module).  That will make it easier for the curious to find-out
> how the magic is done.  And, it will make life easier for Jython, IronPython, and PyPy implementer so they won't have to do any
> extra work to support Py2.6.

Will do.

> > There are only a few use-cases (that I can think of) where Python's
> > list() regularly outperforms the BList.  These are:
> >
> > 1. A large LIFO stack, where there are many .append() and .pop(-1)
> > operations.  These are O(1) for a Python list, but O(log n) for the
> > BList().
>
> This is a somewhat important use-case (we devote two methods to it).
>
> > 2. A large, unchanging list, where there are many getitem() calls,
> > and none of the inserts or deletes where the BList() shines.  Again,
> > these are O(1) for a Python list, but O(log n) for the BList(), and
> > the BList() is just as fast if the list is small.
>
> This is also a critical use-case.

I don't disagree.

I think the key question is this:  Given that there's no performance
difference for small lists, would you rather have a type that for
large lists:

1) Gives you O(1) speed for some critical use-cases, but O(n) for others, or
2) Gives you O(log n) across the board?

Put another way, for a 10,000-element list how would you feel about
taking a small performance hit (50% increase in execution time for
LIFO; 5% increase in execution time for getitem) in exchange for large
performance improvements elsewhere (< 1% of the old execution time for
getslice, setslice, FIFO, etc.)?
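
For a crude illustration of the asymmetry, the snippet below times the
three patterns on the *built-in* list (it does not measure the BList
at all; the percentages above come from my own benchmarks):

    # Times the patterns on the built-in list only, at n = 10,000.
    import timeit

    setup = "data = list(range(10000))"
    cases = [
        ("LIFO (append/pop)       ", "data.append(0); data.pop()"),
        ("getitem                 ", "data[5000]"),
        ("FIFO (insert(0)/pop(0)) ", "data.insert(0, 0); data.pop(0)"),
    ]
    for label, stmt in cases:
        t = timeit.Timer(stmt, setup)
        print("%s %.4f sec" % (label, min(t.repeat(3, 10000))))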

I obviously favor #2, but I acknowledge that this is a subjective
question that depends on which use-cases you care most about.  Your
incremental deployment plan (extension module, possibly in collections
for 2.6, go from there) may be a good way to evaluate how the
community as a whole feels.

> You've come a long way since then.  Congrats.
>
> Thanks for your contribution and thoughtful discussion.  This weekend, I look forward to reading your source code in detail.

Thanks!

I've just added my Python prototype implementation to the .tar.gz.
It has a lengthy text section at the top that describes the structure
of the BList and the invariants that it must maintain.

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

