Keeping track of the N largest values

John Nagle nagle at animats.com
Sat Dec 25 15:26:32 EST 2010


On 12/25/2010 8:04 AM, Peter Otten wrote:
> Roy Smith wrote:
>
>> I'm processing a stream of N numbers and want to keep track of the K
>> largest.  There's too many numbers in the stream (i.e. N is too large)
>> to keep in memory at once.  K is small (100 would be typical).
>
> http://docs.python.org/library/heapq.html#heapq.nlargest

     Incidentally, if it happens that the data is already in
a database, MySQL will do that.

	SELECT val FROM tab ORDER BY val DESC LIMIT N;

will, for small N, keep only N values. For large N,
it sorts.

    That's for an un-indexed field.  If "val" is an
index, this is a very fast operation, since the tree
building has already been done.

				John Nagle




More information about the Python-list mailing list