[Python-ideas] Optional key to `bisect`'s functions?
Terry Reedy
tjreedy at udel.edu
Thu Feb 9 03:39:49 CET 2012
On 2/8/2012 7:25 PM, Guido van Rossum wrote:
> Hmm... I disagree with Raymond's rejection of the proposed feature. I
> have come across use cases for this functionality multiple times in
> real life.
>
> Basically Raymond says "bisect can call the key() function many times,
> which leads to bad design". His alternative, to use a list of (key,
> value) tuples, is often a bit clumsy when passing the sorted list to
> another function (e.g. for printing); having to transform the list using
> e.g. [v for (k, v) in a] feels clumsy and suboptimal. So I'm not sure
> that refusing the key= option always leads to the best design (in the
> sense of the most readable code).
An alternative to the n x 2 list of pairs is two parallel n-arrays (or a
2 x n array). Then there is no problem using either the keys array or
the values array on its own.
To use insort_right or insort_left for this, they would have to return
the insertion position instead of None. Right now one must instead use
bisect_right or bisect_left and then .insert into both arrays, rather
than into just the vals array.
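A sketch of that parallel-arrays workaround, with illustrative names:

```python
import bisect

keys = []   # stays sorted; bisect searches this list
vals = []   # kept in the same order as keys

def insert(key, value):
    # insort_* returns None, so we must find the position ourselves
    # and then .insert into both lists at that index.
    i = bisect.bisect_right(keys, key)
    keys.insert(i, key)
    vals.insert(i, value)

insert(2, "b")
insert(1, "a")
insert(3, "c")
# keys == [1, 2, 3]; vals == ["a", "b", "c"]
```

If insort_right returned the position, the keys list could be maintained with one call instead of two.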
> Adding key= is particularly attractive since the current invariant is
> something like "if a == sorted(a) before the operation, then a ==
> sorted(a) after the operation". Adding a key= option would simply change
> that to sorted(a, key=key) on both counts.
>
> Also note that "many times" is actually O(log N) per insertion, which
> isn't so bad. The main use case for bisect() is to manage a list that
> sees updates *and* iterations -- otherwise building the list unsorted
> and sorting it at the end would make more sense. The key= option
> provides a balance between the cost/elegance for updates and for iterations.
For *large enough* lists, the O(N*N) cost of the insertions will
dominate the O(N*log N) cost of the key() calls, so reducing the latter
to O(N) key calls will not matter.
--
Terry Jan Reedy