propose index heap

Index heap is essential in efficient implementation of Dijkstra's algorithm, Prim's algorithm and other variants, I have implemented a concise version at https://github.com/yutao-li/libheap , I think it would be useful to have it in stdlib

I've thought about this before too. But if I remember correctly, most real-world path-finding algorithms can use a heap without an index to achieve basically the same results: with this approach, the heap can now store more than one copy of a node, but it doesn't need to put all of the nodes in the queue at the same time. Instead of calling a "decrease-key" function on a particular node, you can just throw in a new node into the heap with the smaller key. And when you pop a node off the heap, check that you haven't already popped off and processed that node. There may be marginally worse memory usage with this approach on extremely dense graphs, but there's generally better real-world performance with a naive un-indexed heap without the decrease-key function. This version also works on infinite graphs, where putting everything in the queue at the beginning would fail. See https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm#Practical_optimizations... The other thing to worry about is that adding a new data structure to the stdlib is not without significant costs: it adds something new to the heapq module for everyone to learn, and it prompts the question "should my heap be indexed?" for all users. Per PEP 20, "There should be one-- and preferably only one --obvious way to do it." There may be some situations where some data structures are better than the standard ones, but I feel that PyPI is a good place for those to stay.

I've thought about this before too. But if I remember correctly, most real-world path-finding algorithms can use a heap without an index to achieve basically the same results: with this approach, the heap can now store more than one copy of a node, but it doesn't need to put all of the nodes in the queue at the same time. Instead of calling a "decrease-key" function on a particular node, you can just throw in a new node into the heap with the smaller key. And when you pop a node off the heap, check that you haven't already popped off and processed that node. There may be marginally worse memory usage with this approach on extremely dense graphs, but there's generally better real-world performance with a naive un-indexed heap without the decrease-key function. This version also works on infinite graphs, where putting everything in the queue at the beginning would fail. See https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm#Practical_optimizations... The other thing to worry about is that adding a new data structure to the stdlib is not without significant costs: it adds something new to the heapq module for everyone to learn, and it prompts the question "should my heap be indexed?" for all users. Per PEP 20, "There should be one-- and preferably only one --obvious way to do it." There may be some situations where some data structures are better than the standard ones, but I feel that PyPI is a good place for those to stay.
participants (2)
-
Dennis Sweeney
-
master_lee