[Python-ideas] Optional key to `bisect`'s functions?
Terry Reedy
tjreedy at udel.edu
Thu Feb 9 22:51:53 CET 2012
On 2/9/2012 3:16 PM, Raymond Hettinger wrote:
>
> On Feb 9, 2012, at 9:50 AM, Daniel Stutzbach wrote:
>
>> Maintaining a sorted list using Python's list type is a trap. The
>> bisect is O(log n), but insertion and deletion are still O(n).
The omitted constants are such that the log n term dominates for 'small'
n. list.sort internally uses binary insert sort for n up to 64. It only
switches to mergesort for runs of at least 64.
>> A SortedList class that provides O(log n) insertions is useful from
>> time to time. There are several existing implementations available (I
>> wrote one of them, on top of my blist type), each with their pros and
>> cons.
Are your blist leaves lists (or arrays) of some maximum size?
> I concur. People who want to maintain sorted collections (periodically
> adding and deleting items) are far better-off using your blist, a binary
> tree, or an in-memory sqlite database. Otherwise, we will have baited
> them into a non-scalable O(n) solution. Unfortunately, when people see
> the word "bisect", they will presume they've got O(log n) code. We've
> documented the issue with bisect.insort(), but the power of suggestion
> is very strong.
Using insort on a list of a millions items is definitely not a good
idea, but I can see how someone not so aware of scaling issues might be
tempted, especially with no stdlib alternative. One could almost be
tempted to issue a warning if 'hi' is 'too large'.
--
Terry Jan Reedy
More information about the Python-ideas
mailing list