order independent hash?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Dec 9 06:27:16 EST 2011


On Thu, 08 Dec 2011 10:30:01 +0100, Hrvoje Niksic wrote:

> In a language like Python, the difference between O(1) and O(log n) is
> not the primary reason why programmers use dict; they use it because
> it's built-in, efficient compared to alternatives, and convenient to
> use.  If Python dict had been originally implemented as a tree, I'm sure
> it would be just as popular.

Except for people who needed dicts with tens of millions of items.

Remember also that dicts are used for looking up names in Python. Nearly 
all method calls, attribute accesses, global name lookups, function 
calls, etc. go through at least one and potentially multiple dict 
lookups. The simple statement:

n = len(x.y) + len(z)

likely requires nine dict lookups, and potentially more. In even a small 
application, there could be tens of millions of dict lookups; changing 
each of them from O(1) to O(log N) could result in a measurable slowdown 
to Python code in real applications. That is why dicts are highly 
optimized for speed.

As fast as dicts are, sometimes they aren't fast enough. One common micro-
optimization for tight loops and time-critical code is to create local 
variables from globals or builtins, because local variable access 
bypasses dict lookup. So people would notice if dicts were slower.


-- 
Steven



More information about the Python-list mailing list