[Python-Dev] On time complexity of heapq.heapify

Raymond Hettinger raymond.hettinger at gmail.com
Mon Dec 12 11:43:29 EST 2016


> On Dec 12, 2016, at 8:12 AM, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:
> 
> The heapify() algorithm is well known and well studied.   A quick Google search turns up plenty of explanations: https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=complexity%20of%20heapify 
> 
> algorithm - How can building a heap be O(n) time complexity? - StackOverflow
> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=complexity%20of%20heapify
> 
> Data Structures Heap, Heap Sort & Priority Queue
> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=complexity%20of%20heapify
> Sub tree rooted at i is a max heap. Simple bound: 
> – O(n) calls to MAX-HEAPIFY, 
> – Each of which takes O(lg n), – Complexity: O(n lg n). 
> – Thus, the running time of BUILD-MAX-HEAP is O(n).
> 
> Here's a more detailed explanation:  http://www.cs.umd.edu/~meesh/351/mount/lectures/lect14-heapsort-analysis-part.pdf
> 
> If you have other follow-ups, please take this discussion to another list.  This list is for the development of the Python core and not for general questions about algorithms or use of the language.

I forgot to attach the measurements that demonstrate the O(n) complexity:

# Python 3 Code
from heapq import heapify
from random import randrange

cmp_cnt = 0

class Int(int):
    def __lt__(self, other):
        global cmp_cnt
        cmp_cnt += 1
        return super.__lt__(self, other)

def trial(n):
    global cmp_cnt
    
    data = [Int(randrange(1000000)) for i in range(n)]
    cmp_cnt = 0
    heapify(data)
    return cmp_cnt

for n in [100, 1000, 10000, 100000, 1000000, 10000000]:
    print("n = {:<10d}    compares = {}".format(n, trial(n)))

Which outputs:

n = 100           compares = 155
n = 1000          compares = 1620
n = 10000         compares = 16446
n = 100000        compares = 164779
n = 1000000       compares = 1649165
n = 10000000      compares = 16493429




More information about the Python-Dev mailing list