On 5/4/03 10:20 PM -0400 Tim Peters firstname.lastname@example.org wrote:
In practice, it's usually much faster than that. Over time, it gets rarer and rarer for
person > q
to be true (the new person has to be larger than the N-th largest seen so far, and that bar gets raised whenever a new person manages to hurdle it),
Good point. If any permutation of the input sequence is equally likely, and you're selecting the best k out of n items, the expected number of times you have to hit the data structure in your heapq solution is roughly k ln n, so the total expected time is O(n + k log k log n), with a really small constant factor on the O(n) term. The sorting solution I suggested has total time O(n log k), and even though sorting is built-in and fast it can't compete when k is small. Random pivoting is O(n + k), but with a larger constant factor, so your heapq solution looks like a winner.
For fairness, it might be interesting to try another run of your test in which the input sequence is sorted in increasing order rather than random. I.e., replace the random generation of seq by seq = range(M) I'd try it myself, but I'm still running python 2.2 and haven't installed heapq. I'd have to know more about your application to have an idea whether the sorted or randomly-permuted case is more representative.