For fairness, it might be interesting to try another run of your test in which the input sequence is sorted in increasing order rather than random.
Comparing the worst case of one against the best case of the other isn't my idea of fairness <wink>, but sure.
Well, it doesn't seem any fairer to use random data to compare an algorithm with an average time bound that depends on an assumption of randomness in the data...anyway, the point was more to understand the limiting cases. If one algorithm is usually 3x faster than the other, and is never more than 10x slower, that's better than being usually 3x faster but sometimes 1000x slower, for instance.
Sure. In practice you need to know time distributions when using an algorithm -- best, expected, worst, and how likely each is under a variety of expected conditions.
My Java KBest code was written to make data subsets for a half-dozen web pages (same data selected according to different criteria). Of these six instances, one is presented the data in roughly ascending order, one in descending order, and the other four are less clear but probably not random.
Robustness in the face of this sort of variation is why I prefer any average-case assumptions in my code's performance to depend only on randomness from a random number generator, and not arbitrariness in the actual input. But I'm not sure I'd usually be willing to pay a 3x penalty for that robustness.
Most people aren't, until they hit a bad case <0.5 wink>. So "pure" algorithms rarely survive in the face of a large variety of large problem instances. The monumental complications Python's list.sort() endures to work well under many conditions (both friendly and hostile) are a good example of that. In industrial real life, I expect an all-purpose N-Best queue would need to take a hybrid approach, monitoring its fast-path gimmick in some cheap way in order to fall back to a more defensive algorithm when the fast-path gimmick isn't paying.
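That monitoring idea can be sketched. Everything below is invented for illustration -- the name `nbest_hybrid`, the 1000-item window, and the 25% hit-rate threshold aren't from any real code; the point is just "watch the gimmick, and bail out to a heap when it stops paying":

```python
import heapq

def nbest_hybrid(seq, n):
    """Hypothetical hybrid N-best: start with a naive replace-the-min fast
    path, tracking how often new items beat the cutoff.  If the early-out
    stops paying (e.g., ascending input), fall back to a min-heap."""
    best = sorted(seq[:n])          # ascending; best[0] is the cutoff
    hits = seen = 0
    i = n
    for i, x in enumerate(seq[n:], n):
        seen += 1
        if x > best[0]:
            hits += 1
            best[0] = x             # the fast-path "gimmick" being monitored
            best.sort()
        # cheap monitoring: too many cutoff hits means the gimmick lost
        if seen >= 1000 and hits / seen > 0.25:
            break
    else:
        return best                 # fast path survived the whole sequence
    # defensive fallback: O(log n) per update regardless of input order
    heapq.heapify(best)
    for x in seq[i + 1:]:
        if x > best[0]:
            heapq.heapreplace(best, x)
    best.sort()
    return best
```

Either path yields the same answer; only the constant factor differs.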
Here's a surprise: I coded a variant of the quicksort-like partitioning method, at the bottom of this mail. On the largest-1000 of a million random-float case, times were remarkably steady across trials (i.e., using a different set of a million random floats each time):
    heapq                   0.96 seconds
    sort (micro-optimized)  3.4  seconds
    KBest (below)           2.6  seconds
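For context, the heapq contender is essentially a size-N min-heap whose smallest element acts as the cutoff. This is a sketch of that approach, not the actual timed code:

```python
import heapq

def nbest_heap(seq, n):
    """Return the n largest items of seq, ascending, via a size-n min-heap.
    Assumes seq yields at least n items."""
    it = iter(seq)
    heap = [next(it) for _ in range(n)]
    heapq.heapify(heap)
    for x in it:
        # early-out: on random data, most items lose to the cutoff heap[0]
        if x > heap[0]:
            heapq.heapreplace(heap, x)
    heap.sort()
    return heap
```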
Huh. You're almost convincing me that asymptotic analysis works even in the presence of Python's compiled-vs-interpreted anomalies.
Indeed, you can't fight the math! It often takes a large problem for better O() behavior to overcome a smaller constant in a worse O() approach, and especially in Python. For example, I once wrote and tuned and timed an O(N) worst-case rank algorithm in Python ("find the k'th smallest item in a sequence"), using the median-of-medians-of-5 business. I didn't have enough RAM at the time to create a list big enough for it to beat "seq.sort(); return seq[k]". By playing lots of tricks, and boosting it to median-of-medians-of-11, IIRC I eventually got it to run faster than sorting on lists with "just" a few hundred thousand elements.
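A bare-bones sketch of that median-of-medians-of-5 selection, with none of the tuning tricks mentioned above (so expect it to lose badly to sorting on anything that fits in RAM):

```python
def select(seq, k):
    """Return the k'th smallest item (0-based) of seq in O(len(seq))
    worst-case time, via median-of-medians-of-5 pivoting."""
    a = list(seq)
    while True:
        if len(a) <= 5:
            a.sort()
            return a[k]
        # pivot = median of the medians of groups of 5
        medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
                   for i in range(0, len(a), 5)]
        pivot = select(medians, len(medians) // 2)
        lt = [x for x in a if x < pivot]
        eq = [x for x in a if x == pivot]
        if k < len(lt):
            a = lt
        elif k < len(lt) + len(eq):
            return pivot
        else:
            k -= len(lt) + len(eq)
            a = [x for x in a if x > pivot]
```

The point of the exercise stands: the guaranteed-linear bound buys nothing until the constant factor is beaten down.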
But in *this* case I'm not sure that the only thing we're really measuring isn't:
1. Whether an algorithm has an early-out gimmick.
2. How effective that early-out gimmick is.
3. How expensive it is to *try* the early-out gimmick.
The heapq method Rulz on random data because its answers then are "yes, very, dirt cheap". I wrote the KBest test like so:
    def three(seq, N):
        NBest = KBest(N, -1e200)
        for x in seq:
            NBest.put(x)
        L = NBest.get()
        L.sort()
        return L
(the sort at the end is just so the results can be compared against the other methods, to ensure they all get the same answer). If I break into the abstraction and change the test like so:
    def three(seq, N):
        NBest = KBest(N, -1e200)
        cutoff = -1e200
        for x in seq:
            if x > cutoff:
                NBest.put(x)
                cutoff = NBest.cutoff
        L = NBest.get()
        L.sort()
        return L
then KBest is about 20% *faster* than heapq on random data. Doing the comparison inline avoids a method call when early-out pays, early-out pays more and more as the sequence nears its end, and simply avoiding the method call then makes the overall algorithm 3X faster. So O() analysis may triumph when equivalent low-level speed tricks are played (the heapq method did its early-out test inline too), but it gets swamped before they are.
The other surprise is that (unlike, say, the sort or heapq versions) your KBest doesn't look significantly more concise than my earlier Java implementation.
The only thing I was trying to minimize was my time in whipping up something correct to measure. Still, I count 107 non-blank, non-comment lines of Java, and 59 of Python. Java gets unduly penalized for curly braces, Python for tedious tricks like
    buf = self.buf
    k = self.k
to make locals for speed, and that I never put dependent code on the same line as an "if" or "while" test (while you do). Note that it's not quite the same algorithm: the Python version isn't restricted to ints, and in particular doesn't assume it can do arithmetic on a key to get "the next larger" key. Instead it does 3-way partitioning to find the items equal to the pivot. The greater generality may make the Python a little windier.
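The actual KBest code ("at the bottom of this mail") isn't reproduced here, so the following is a hypothetical reconstruction of the scheme just described: buffer candidates above a rising cutoff, and when the buffer grows too big, 3-way quickselect it down to the k largest using only comparisons -- no arithmetic on keys:

```python
def _kbiggest(a, k):
    """Return the k biggest items of a (any order), 3-way partitioning
    around a pivot so duplicate keys need no 'next larger key' arithmetic."""
    if len(a) <= k:
        return a
    pivot = a[len(a) // 2]
    gt = [x for x in a if x > pivot]
    if len(gt) >= k:
        return _kbiggest(gt, k)
    eq = [x for x in a if x == pivot]
    if len(gt) + len(eq) >= k:
        return gt + eq[:k - len(gt)]
    lt = [x for x in a if x < pivot]
    return gt + eq + _kbiggest(lt, k - len(gt) - len(eq))

class KBest:
    """Hypothetical sketch: track the k largest items seen so far."""

    def __init__(self, k, cutoff):
        self.k = k
        self.cutoff = cutoff    # no item <= cutoff can make the k best
        self.buf = []

    def put(self, x):
        if x > self.cutoff:
            self.buf.append(x)
            if len(self.buf) >= 3 * self.k:
                self._prune()

    def _prune(self):
        self.buf = _kbiggest(self.buf, self.k)
        self.cutoff = min(self.buf)     # raise the bar for future puts

    def get(self):
        if len(self.buf) > self.k:
            self._prune()
        return self.buf
```

The buffer-then-prune structure is what makes the amortized cost low: most puts are a single comparison and (usually) nothing else.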
BTW, the heapq code isn't really more compact than C, if you count the implementation code in heapq.py too: it's all low-level small-int arithmetic and array indexing. The only real advantage Python has over assembler for code like that is that we can grow the list/heap dynamically without any coding effort.
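For the flavor of that low-level code, the sift-up helper in heapq.py looks roughly like this (paraphrased from memory, not copied verbatim):

```python
def siftdown(heap, startpos, pos):
    """Bubble the item at pos toward the root until it finds its place.
    All small-int index arithmetic and array subscripting -- the kind of
    code that reads the same in Python or C."""
    newitem = heap[pos]
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
        else:
            break
    heap[pos] = newitem
```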