On 2/8/2012 7:25 PM, Guido van Rossum wrote:
Hmm... I disagree with Raymond's rejection of the proposed feature. I have come across use cases for this functionality multiple times in real life.
Basically Raymond says "bisect can call the key() function many times, which leads to bad design". His alternative, to use a list of (key, value) tuples, is often a bit clumsy when passing the sorted list to another function (e.g. for printing); having to transform the list using e.g. [v for (k, v) in a] feels clumsy and suboptimal. So I'm not sure that refusing the key= option always leads to the best design (in the sense of the most readable code).
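For concreteness, here is a sketch of the (key, value) workaround being described (the record contents are made up for illustration). Note the extra step needed before the list can be handed to something that expects plain values:

```python
import bisect

# A list kept sorted by maintaining (key, value) tuples; tuples
# compare by their first element, so the key drives the ordering.
# (If two keys ever tie, comparison falls through to the values,
# which may not even be comparable -- another wart of this pattern.)
people = [("Adams", {"name": "Adams"}), ("Young", {"name": "Young"})]

new = {"name": "Baker"}
bisect.insort(people, ("Baker", new))

# Recovering the plain values for printing is the clumsy step:
values = [v for (k, v) in people]
```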
An alternative to the n x 2 array is two n-arrays or a 2 x n array. Then there is no problem using either the keys or the values array. To use insort_right or insort_left for this, they would have to return the insertion position instead of None. Right now one must use bisect_right or _left and then .insert into both arrays instead of just the vals array.
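A minimal sketch of that parallel-arrays pattern (names are illustrative), showing why the insertion position matters: since insort_right returns None, one must call bisect_right and then insert into both lists by hand:

```python
import bisect

# Two parallel lists: a sorted list of keys, and the values
# kept in the same order.
keys = ["apple", "cherry"]
vals = [10, 30]

def insert(key, val):
    # insort_right would do the keys insertion but discards the
    # position, so we bisect first and insert into both lists.
    i = bisect.bisect_right(keys, key)
    keys.insert(i, key)
    vals.insert(i, val)

insert("banana", 20)
```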
Adding key= is particularly attractive since the current invariant is something like "if a == sorted(a) before the operation, then a == sorted(a) after the operation". Adding a key= option would simply change that to sorted(a, key=key) on both counts.
Also note that "many times" is actually O(log N) per insertion, which isn't so bad. The main use case for bisect() is to manage a list that sees updates *and* iterations -- otherwise building the list unsorted and sorting it at the end would make more sense. The key= option provides a balance between the cost/elegance for updates and for iterations.
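To make both points concrete, here is a hypothetical pure-Python sketch of what an insort with key= could look like (not a stdlib function): it calls key() only O(log n) times per insertion, and it preserves the invariant that a == sorted(a, key=key) before implies the same after:

```python
def insort_right_key(a, x, key):
    # Hypothetical sketch: insert x into a, which is assumed to be
    # sorted by key(item).  The binary search evaluates key() on
    # O(log n) items, plus once on x itself.
    kx = key(x)
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if kx < key(a[mid]):
            hi = mid
        else:
            lo = mid + 1
    a.insert(lo, x)
```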
For *large enough* lists, the O(n*n) total cost of the insertions themselves will dominate the O(n log n) total cost of the key() calls, so reducing the latter to O(n) key calls will not matter. -- Terry Jan Reedy