[Python-ideas] Optional key to `bisect`'s functions?

Terry Reedy tjreedy at udel.edu
Thu Feb 9 22:51:53 CET 2012


On 2/9/2012 3:16 PM, Raymond Hettinger wrote:
>
> On Feb 9, 2012, at 9:50 AM, Daniel Stutzbach wrote:
>
>> Maintaining a sorted list using Python's list type is a trap. The
>> bisect is O(log n), but insertion and deletion are still O(n).

The omitted constants are such that the log n term dominates for 'small' 
n. list.sort internally uses binary insert sort for n up to 64. It only 
switches to mergesort for runs of at least 64.

>> A SortedList class that provides O(log n) insertions is useful from
>> time to time. There are several existing implementations available (I
>> wrote one of them, on top of my blist type), each with their pros and
>> cons.

Are your blist leaves lists (or arrays) of some maximum size?

> I concur. People who want to maintain sorted collections (periodically
> adding and deleting items) are far better-off using your blist, a binary
> tree, or an in-memory sqlite database. Otherwise, we will have baited
> them into a non-scalable O(n) solution. Unfortunately, when people see
> the word "bisect", they will presume they've got O(log n) code. We've
> documented the issue with bisect.insort(), but the power of suggestion
> is very strong.

Using insort on a list of a millions items is definitely not a good 
idea, but I can see how someone not so aware of scaling issues might be 
tempted, especially with no stdlib alternative. One could almost be 
tempted to issue a warning if 'hi' is 'too large'.

-- 
Terry Jan Reedy




More information about the Python-ideas mailing list