_siftup and _siftdown implementation
Sven R. Kunze
srkunze at mail.de
Fri Feb 5 09:42:29 EST 2016
On 05.02.2016 02:26, srinivas devaki wrote:
> as I come to think of it again, it is not subheap, it actually heap cut at
> some level hope you get the idea from the usage of _siftup. so even though
> the `pos` children are valid the _siftup brings down the new element (i.e
> the element which is at first at `pos`) upto its leaf level and then again
> it is brought up by using _siftdown. why do the redundant work when it can
> simply breakout?
The heapq module itself has a very extensive documentation inside. This
is what it says for _siftup. I think this is sort of an optimization
that works pretty well (cf. the numbers) for popping off the FIRST item:
# The child indices of heap index pos are already heaps, and we want to
make # a heap at index pos too. We do this by bubbling the smaller child
of # pos up (and so on with that child's children, etc) until hitting a
leaf, # then using _siftdown to move the oddball originally at index pos
into place. # # We *could* break out of the loop as soon as we find a
pos where newitem <= # both its children, but turns out that's not a
good idea, and despite that # many books write the algorithm that way.
During a heap pop, the last array # element is sifted in, and that tends
to be large, so that comparing it # against values starting from the
root usually doesn't pay (= usually doesn't # get us out of the loop
early). See Knuth, Volume 3, where this is # explained and quantified in
an exercise. # # Cutting the # of comparisons is important, since these
routines have no # way to extract "the priority" from an array element,
so that intelligence # is likely to be hiding in custom comparison
methods, or in array elements # storing (priority, record) tuples.
Comparisons are thus potentially # expensive. # # On random arrays of
length 1000, making this change cut the number of # comparisons made by
heapify() a little, and those made by exhaustive # heappop() a lot, in
accord with theory. Here are typical results from 3 # runs (3 just to
demonstrate how small the variance is): # # Compares needed by heapify
Compares needed by 1000 heappops # --------------------------
-------------------------------- # 1837 cut to 1663 14996 cut to 8680 #
1855 cut to 1659 14966 cut to 8678 # 1847 cut to 1660 15024 cut to 8703
# # Building the heap by using heappush() 1000 times instead required #
2198, 2148, and 2219 compares: heapify() is more efficient, when # you
can use it. # # The total compares needed by list.sort() on the same
lists were 8627, # 8627, and 8632 (this should be compared to the sum of
heapify() and # heappop() compares): list.sort() is (unsurprisingly!)
more efficient # for sorting.
What do you think about our use-case?
> _siftup and _siftdown are functions from python standard heapq module.
> PS: I do competitive programming, I use these modules every couple of days
> when compared to other modules. so didn't give much thought when posting to
> the mailing list. sorry for that.
Competitive programming? That sounds interesting. :)
More information about the Python-list