[Python-ideas] Heap data type

Josiah Carlson josiah.carlson at gmail.com
Tue Apr 21 18:08:53 CEST 2009

On Sat, Apr 18, 2009 at 7:49 AM, Daniel Stutzbach
<daniel at stutzbachenterprises.com> wrote:
> On Sat, Apr 18, 2009 at 7:43 AM, Aahz <aahz at pythoncraft.com> wrote:
>> +1 -- I recently did a short presentation on Big O notation and Python
>> containers, and heapq looked remarkably ugly.
> Another issue with heapq is that it does not support modifying the priority
> of an existing element (usually called the "decrease_key" operation in
> textbooks).  The hard part of incrementing decrease_key is that somehow the
> object in the heap needs to know its current position, else you have to do
> an expensive linear search to find the object's current position.
> I've implemented a couple of heaps for Python over the years:
> 1.
> http://svn.python.org/view/sandbox/trunk/collections/pairing_heap.py?view=markup&pathrev=40887
> (checked into the Sandbox... 5 years ago! I think I also have a C
> reimplementation somewhere)
> If you don't care about altering the priority, then the interface is more or
> less like Facundo described.
> To keep track of the positions, insert operations return a wrapper around
> each object.  The user needs to keep track of the wrapper and pass it to the
> adjust_key() method if they want to change the priority of the object
> later.
> 2. HeapDict.  http://pypi.python.org/pypi/HeapDict
> Looks, act, and quacks like a dict/MutableMapping, except popitem() returns
> the item with the lowest value instead of a random item.  Also, there's a
> peekitem() method to examine that item without removing it.  Since the
> interface is familiar, it's super-easy to use without having to look up the
> names of the methods whenever you need to use a heap.
> The downside is that you can't add the same element into the heap twice
> (just like you can't have the same key in a dictionary twice).  Personally
> I've only inserted the same element more than once when doing toy examples
> with integers, so for me at least that's no great loss. ;-)

I've also got a pair heap implementation:

There's also a proposed update of sched.py in the bugtracker (as part
of some updates to asyncore to add a scheduler) that offers a version
of decreasekey, as well as "remove" functionality.  I describe some of
the trade-offs in implementing a heap with such features (memory use,
O-notation, etc.) in a blog post last summer:
http://chouyu-31.livejournal.com/316112.html .  One of the ways we get
around the "key must be unique" limitation is that we don't use a
dictionary to reference the keys; when you put an item into the heap,
you get an opaque reference, which you can then cancel, "reschedule",

The sched module is heavily biased towards event scheduling, so adding
a collections.priority_queue or whatever is just fine with me.

 - Josiah

More information about the Python-ideas mailing list