[Python-Dev] collections.sortedtree
Marko Rauhamaa
marko at pacujo.net
Sun Mar 30 10:39:18 CEST 2014
Guido van Rossum <guido at python.org>:
> Yeah, so the pyftp fix is to keep track of how many timers were
> cancelled, and if the number exceeds a threshold it just recreates the
> heap, something like
>
> heap = [x for x in heap if not x.cancelled]
> heapify(heap)
I measured my target use case with a simple emulation on my linux PC.
The simple test case emulates this scenario:
Start N connections at frequency F and have each connection start a
timer T. Then, rotate over the connections at the same frequency F
restarting timer T. Stop after a duration that is much greater than
T.
Four different timer implementations were considered:
HEAPQ: straight heapq
HEAPQ*: heapq with the pyftp fix (reheapify whenever 80% of the
outstanding timers have been canceled)
SDICT: sorteddict (my C implementation)
PyAVL: Python AVL tree (my implementation)
Here are the results:
N = 1000, F = 100 Hz, T = 10 min, duration 1 hr
=============================================
Virt Res max len() urs sys CPU
MB MB s s %
=============================================
HEAPQ 22 16 60001 21.5 4.3 0.7
HEAPQ* 11 7 5000 18.4 4.2 0.6
SDICT 11 6 1000 18.2 3.9 0.6
PyAVL 11 6 1000 39.3 3.6 1.2
=============================================
N = 10000, F = 1000 Hz, T = 10 min, duration 1 hr
=============================================
Virt Res max len() urs sys CPU
MB MB s s %
=============================================
HEAPQ 125 120 600044 223.0 25.8 6.9
HEAPQ* 21 16 50000 186.8 30.0 6.0
SDICT 15 10 10000 196.6 25.7 6.2
PyAVL 16 11 10000 412.5 22.3 12.1
=============================================
Conclusions:
* The CPU load is almost identical in HEAPQ, HEAPQ* and SDICT.
* HEAPQ* is better than HEAPQ because of the memory burden.
* PyAVL is not all that bad compared with SDICT.
Marko
More information about the Python-Dev
mailing list