Why is heapify linear?

Mon Nov 1 19:45:24 EST 2004

[Scott David Daniels]
> I am sorry, but in the Python 2.4 description of "heapify", I find the
> description of "Transform list x into a heap, in-place, in linear time,"
> unbelievable.  I understand the hand-wave that makes dictionary building
> linear (though I have a hard time with even that).  Could somebody tell
> me what twist of logic makes "heapify" linear, except in the sense that
> linear is coming to mean "very fast?"

There are no hands waving.  Dict-building is best-case and
expected-case linear time, but worst-case quadratic time.  For
heapification, all three are linear.  That's all rigorously provable
(and typically are so proved in a first course on algorithmic
complexity).  Google on

    heapify linear

to find proofs for heapification bounds.  But you perhaps won't
*believe* it until you can't deny it <wink>.  So try it:

"""
class CmpCounter:
    def __init__(self, i):
        self.i = i

    def __cmp__(self, other):
        global cmpcount
        cmpcount += 1
        return cmp(self.i, other.i)

from heapq import heapify
import random
n = 10000

xs = [CmpCounter(i) for i in range(n)]

def tryone(tag, xs):
    global cmpcount
    cmpcount = 0
    heapify(xs)
    print tag, "len", len(xs), "# comparisions", cmpcount

for i in range(10):
    random.shuffle(xs)
    tryone("random", xs)

xs.sort()
tryone("already sorted", xs)

xs.sort(reverse=True) # using 2.4b1 here
tryone("reverse sorted", xs)
"""

"Already sorted" is essentially the worst case for min-heap
heapification (each new element added is then the smallest seen so
far, and has to bubble all the way from the bottom of the heap-so-far
to the new root location).

Note that "the obvious" way to transform a list into a heap is not
worst-case linear time.  Linear-time heapification is due to Floyd. 
Someone should patent it <wink>.