
On Sat, Apr 18, 2009 at 7:49 AM, Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
On Sat, Apr 18, 2009 at 7:43 AM, Aahz <aahz@pythoncraft.com> wrote:
+1 -- I recently did a short presentation on Big O notation and Python containers, and heapq looked remarkably ugly.
Another issue with heapq is that it does not support modifying the priority of an existing element (usually called the "decrease_key" operation in textbooks). The hard part of incrementing decrease_key is that somehow the object in the heap needs to know its current position, else you have to do an expensive linear search to find the object's current position.
I've implemented a couple of heaps for Python over the years:
1. http://svn.python.org/view/sandbox/trunk/collections/pairing_heap.py?view=markup&pathrev=40887 (checked into the Sandbox... 5 years ago! I think I also have a C reimplementation somewhere)
If you don't care about altering the priority, then the interface is more or less like Facundo described.
To keep track of the positions, insert operations return a wrapper around each object. The user needs to keep track of the wrapper and pass it to the adjust_key() method if they want to change the priority of the object later.
2. HeapDict. http://pypi.python.org/pypi/HeapDict
Looks, act, and quacks like a dict/MutableMapping, except popitem() returns the item with the lowest value instead of a random item. Also, there's a peekitem() method to examine that item without removing it. Since the interface is familiar, it's super-easy to use without having to look up the names of the methods whenever you need to use a heap.
The downside is that you can't add the same element into the heap twice (just like you can't have the same key in a dictionary twice). Personally I've only inserted the same element more than once when doing toy examples with integers, so for me at least that's no great loss. ;-)
I've also got a pair heap implementation: http://mail.python.org/pipermail/python-dev/2006-November/069845.html There's also a proposed update of sched.py in the bugtracker (as part of some updates to asyncore to add a scheduler) that offers a version of decreasekey, as well as "remove" functionality. I describe some of the trade-offs in implementing a heap with such features (memory use, O-notation, etc.) in a blog post last summer: http://chouyu-31.livejournal.com/316112.html . One of the ways we get around the "key must be unique" limitation is that we don't use a dictionary to reference the keys; when you put an item into the heap, you get an opaque reference, which you can then cancel, "reschedule", etc. The sched module is heavily biased towards event scheduling, so adding a collections.priority_queue or whatever is just fine with me. - Josiah