[Python-Dev] On time complexity of heapq.heapify
Raymond Hettinger
raymond.hettinger at gmail.com
Mon Dec 12 11:43:29 EST 2016
> On Dec 12, 2016, at 8:12 AM, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:
>
> The heapify() algorithm is well known and well studied. A quick Google search turns up plenty of explanations: https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=complexity%20of%20heapify
>
> algorithm - How can building a heap be O(n) time complexity? - StackOverflow
> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=complexity%20of%20heapify
>
> Data Structures Heap, Heap Sort & Priority Queue
> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=complexity%20of%20heapify
> Sub tree rooted at i is a max heap. Simple bound:
> – O(n) calls to MAX-HEAPIFY,
> – Each of which takes O(lg n), – Complexity: O(n lg n).
> – Thus, the running time of BUILD-MAX-HEAP is O(n).
>
> Here's a more detailed explanation: http://www.cs.umd.edu/~meesh/351/mount/lectures/lect14-heapsort-analysis-part.pdf
>
> If you have other follow-ups, please take this discussion to another list. This list is for the development of the Python core and not for general questions about algorithms or use of the language.
I forgot to attach the measurements that demonstrate the O(n) complexity:
# Python 3 Code
from heapq import heapify
from random import randrange
cmp_cnt = 0
class Int(int):
def __lt__(self, other):
global cmp_cnt
cmp_cnt += 1
return super.__lt__(self, other)
def trial(n):
global cmp_cnt
data = [Int(randrange(1000000)) for i in range(n)]
cmp_cnt = 0
heapify(data)
return cmp_cnt
for n in [100, 1000, 10000, 100000, 1000000, 10000000]:
print("n = {:<10d} compares = {}".format(n, trial(n)))
Which outputs:
n = 100 compares = 155
n = 1000 compares = 1620
n = 10000 compares = 16446
n = 100000 compares = 164779
n = 1000000 compares = 1649165
n = 10000000 compares = 16493429
More information about the Python-Dev
mailing list