order independent hash?

Hrvoje Niksic hniksic at xemacs.org
Thu Dec 8 04:30:01 EST 2011


Tim Chase <python.list at tim.thechases.com> writes:

> From an interface perspective, I suppose it would work.  However one
> of the main computer-science reasons for addressing by a hash is to
> get O(1) access to items (modulo pessimal hash structures/algorithms
> which can approach O(N) if everything hashes to the same
> value/bucket), rather than the O(logN) time you'd get from a tree. So
> folks reaching for a hash/map might be surprised if performance
> degraded with the size of the contents.

In a language like Python, the difference between O(1) and O(log n) is
not the primary reason why programmers use dict; they use it because
it's built-in, efficient compared to alternatives, and convenient to
use.  If Python dict had been originally implemented as a tree, I'm sure
it would be just as popular.

Omitting the factor of O(log n) as functionally equivalent to O(1) is
applicable to many situations and is sometimes called "soft-O" notation.
One example from practice is the pre-2011 C++, where the standardization
committee failed to standardize hash tables on time for the 1998
standard.  Although this was widely recognized as an oversight, a large
number of programs simply used tree-based std::maps and never noticed a
practical difference between between average-constant-time and
logarithmic complexity lookups.



More information about the Python-list mailing list