[Python-Dev] Simple dicts
Greg Ward
gward@python.net
Thu, 15 May 2003 09:39:27 -0400
On 14 May 2003, Brett C. said:
> When I took data structures I was taught that chaining was actually the
> easiest way to do hash tables and they still had good performance
> compared to open addressing. Because of this taught bias I always
> wondered why Python used open addressing; can someone tell me?
If your nodes are small, chaining has a huge overhead -- an extra
pointer for each node in a chain. You can play around with glomming
several nodes together to amortize the cost of those pointers, but ISTR
the win isn't that big.
Open addressing is more memory-efficient, but when the hash table fills
(or gets close to full), you absolutely positively have to rehash.
(Back in January, I played around with writing a custom hash table for
keeping ZODB indexes in memory without using a Python dict, so that's
why I'm fairly fresh on hash table minutiae.)
Greg
--
Greg Ward <gward@python.net> http://www.gerg.ca/
NOBODY expects the Spanish Inquisition!