[Python-ideas] Optional key to `bisect`'s functions?

Terry Reedy tjreedy at udel.edu
Thu Feb 9 03:39:49 CET 2012


On 2/8/2012 7:25 PM, Guido van Rossum wrote:
> Hmm... I disagree with Raymond's rejection of the proposed feature. I
> have come across use cases for this functionality multiple times in
> real life.
>
> Basically Raymond says "bisect can call the key() function many times,
> which leads to bad design". His alternative, to use a list of (key,
> value) tuples, is often a bit clumsy when passing the sorted list to
> another function (e.g. for printing); having to transform the list using
> e.g. [v for (k, v) in a] feels clumsy and suboptimal. So I'm not sure
> that refusing the key= option always leads to the best design (in the
> sense of the most readable code).
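Raymond's suggested alternative, a sorted list of (key, value) tuples, looks roughly like this (a minimal sketch; the names and data are illustrative, not from the post):

```python
import bisect

a = []  # kept sorted by key, since tuples compare lexicographically
for value in ["banana", "apple", "cherry"]:
    key = len(value)  # some key function
    bisect.insort(a, (key, value))

# Passing just the values on to another function requires the
# transform Guido calls clumsy:
values = [v for (k, v) in a]
```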

An alternative to the n x 2 array is two n-arrays, or a 2 x n array. 
Then either the keys array or the values array can be used directly.
To use insort_right or insort_left for this, they would have to return 
the insertion position instead of None. Right now one must use 
bisect_right or bisect_left and then .insert into both arrays, rather 
than insort-ing into keys and .insert-ing into just the vals array.
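A sketch of the two-parallel-arrays approach as it must be written today (illustrative names; this is not code from the post):

```python
import bisect

keys = []  # sorted keys, searched with bisect
vals = []  # values, kept aligned with keys

def insert(key, val):
    # insort returns None, so we must call bisect_right ourselves
    # and then .insert into *both* arrays at that position.
    i = bisect.bisect_right(keys, key)
    keys.insert(i, key)
    vals.insert(i, val)

insert(6, "banana")
insert(5, "apple")
insert(6, "cherry")
# vals is now ordered by key, with no post-processing transform needed
```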

> Adding key= is particularly attractive since the current invariant is
> something like "if a == sorted(a) before the operation, then a ==
> sorted(a) after the operation". Adding a key= option would simply change
> that to sorted(a, key=key) on both counts.
>
> Also note that "many times" is actually O(log N) per insertion, which
> isn't so bad. The main use case for bisect() is to manage a list that
> sees updates *and* iterations -- otherwise building the list unsorted
> and sorting it at the end would make more sense. The key= option
> provides a balance between the cost/elegance for updates and for iterations.
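Guido's O(log N) figure is easy to check: on an already-sorted list, bisect makes one comparison per halving, and each comparison is where a hypothetical key= option would call key(). A small counter (illustrative code, not from the post):

```python
import bisect

class Counting:
    # Counts comparisons, standing in for the key() calls a
    # hypothetical key= option would make during one insertion.
    calls = 0
    def __init__(self, x):
        self.x = x
    def __lt__(self, other):
        Counting.calls += 1
        return self.x < other.x

a = [Counting(i) for i in range(0, 2000, 2)]  # 1000 sorted items
Counting.calls = 0
bisect.insort(a, Counting(1001))
# Counting.calls is roughly log2(1000), i.e. about 10 comparisons
```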

For *large enough* lists, the O(n*n) cost of the n insertions will 
dominate the O(n log n) cost of the key() calls, so reducing the latter 
to O(n) key calls will not matter.
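A back-of-envelope version of that comparison (illustrative arithmetic only):

```python
import math

# Total element shifts done by n list.insert calls, versus total
# key() calls at one per comparison, for n insertions.
n = 10**6
shifts = n * n // 2                 # O(n*n) work done by insertions
key_calls = int(n * math.log2(n))   # O(n log n) key calls
# shifts exceeds key_calls by roughly four orders of magnitude,
# so the insertions dominate the total cost.
```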

-- 
Terry Jan Reedy
