[New-bugs-announce] [issue21424] Simplify and speed-up heapq.nlargest()

Raymond Hettinger report at bugs.python.org
Sun May 4 07:57:09 CEST 2014

New submission from Raymond Hettinger:

Consolidate the logic for nlargest() into a single function.  Remove the underlying base code, both the C and the pure Python versions.

With all the logic in a single function, it only becomes necessary to create, store, and compare the data tuples when a new item is added to the heap.  This means that the rest of the comparisons (checking to see whether a new item needs to be added to the heap) can run faster and do not need to create a (key, order, elem) tuple.

The change reduces the number of tuples created and the number of ordering integers created.
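To illustrate the idea, here is a minimal sketch of a single-function nlargest().  It is not the attached rid_nlargest.py; the name nlargest_sketch and its internals are assumptions made for this example.  The point it shows is that the membership test compares bare keys, and a (key, order, elem) tuple is built only when an item actually enters the heap.

```python
import heapq
from itertools import islice


def nlargest_sketch(n, iterable, key=None):
    """Hypothetical single-function nlargest(); illustrative only."""
    if n <= 0:
        return []
    if key is None:
        key = lambda x: x  # identity key, so one code path handles both cases

    it = iter(iterable)
    # Seed a min-heap with the first n items.  The order value decreases so
    # that, among equal keys, earlier items rank as larger (a stable result).
    heap = [(key(elem), -i, elem) for i, elem in enumerate(islice(it, n))]
    if not heap:
        return []
    heapq.heapify(heap)

    order = -len(heap)
    top = heap[0][0]  # smallest key currently kept
    for elem in it:
        k = key(elem)
        # Cheap check on bare keys: no tuple is created or compared here.
        if top < k:
            # Only now build the tuple and pay for tuple comparisons.
            heapq.heapreplace(heap, (k, order, elem))
            top = heap[0][0]
            order -= 1

    heap.sort(reverse=True)
    return [elem for _, _, elem in heap]
```

Under these assumptions the result should match sorted(iterable, key=key, reverse=True)[:n], including the order of items with equal keys.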

When rich comparisons were introduced, tuple ordering comparisons became twice as expensive (the tuples are compared elementwise for equality and then there is an additional comparison call to order the first differing element).  Under the existing nlargest() code, we pay that price for every element in the iterable.  In the new code, we pay that price only for the small subset of the iterable that actually gets added to the heap.
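The extra cost can be observed with a small counting class.  This is a sketch that assumes CPython's tuple comparison behavior; the Counted class and the count_calls helper are hypothetical names introduced only for this demonstration.

```python
class Counted:
    """Wraps a value and counts calls to __eq__ and __lt__ (illustrative)."""
    eq_calls = 0
    lt_calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.eq_calls += 1
        return self.value == other.value

    def __lt__(self, other):
        Counted.lt_calls += 1
        return self.value < other.value


def count_calls(compare):
    """Run compare() and return the (eq_calls, lt_calls) it triggered."""
    Counted.eq_calls = Counted.lt_calls = 0
    compare()
    return Counted.eq_calls, Counted.lt_calls


a, b = Counted(1), Counted(2)

# Comparing the keys directly needs a single __lt__ call.
bare = count_calls(lambda: a < b)

# Comparing tuples first tests the leading elements for equality and then
# orders the first differing pair: one __eq__ call plus one __lt__ call.
wrapped = count_calls(lambda: (a, 0) < (b, 1))

print("bare keys:", bare)
print("tuples:   ", wrapped)
```

On CPython the bare comparison should report one ordering call and no equality calls, while the tuple comparison reports one of each, which is the doubling described above.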

After this, another patch for simplifying nsmallest() is forthcoming.

components: Library (Lib)
files: rid_nlargest.py
messages: 217859
nosy: rhettinger
priority: low
severity: normal
status: open
title: Simplify and speed-up heapq.nlargest()
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file35148/rid_nlargest.py

Python tracker <report at bugs.python.org>
