RE: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)

Dec. 11, 2000

      [/F, on Christian's GF tutorial]
...
I think someone just won the "brain exploder 2000" award ;-)
Well, anyone can play.  When keys collide, what we need is a function f(i)
such that repeating
    i = f(i)
visits every int in (0, 2**N) exactly once before setting i back to its
initial value, for a fixed N and where the first i is in (0, 2**N).  This is
the quickest:

def f(i):
    i -= 1
    if i == 0:
        i = 2**N-1
    return i

Unfortunately, this leads to performance-destroying "primary collisions"
(see Knuth, or any other text w/ a section on hashing).

Other *good* possibilities include a pseudo-random number generator of
maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating
set membership and generating all subsets of an N-element set in a Grey code
order.

The *form* of the function dictobject.c actually uses is:

def f(i):
    i <<= 1
    if i >= 2**N:
       i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

which is suitably non-linear and as fast as the naive method.  Given the
form of the function, you don't need any theory at all to find a value for
MAGIC_CONSTANT_DEPENDING_ON_N that simply works.  In fact, I verified all
the magic values in dictobject.c via brute force, because the person who
contributed the original code botched the theory slightly and gave us some
values that didn't work.  I'll rely on the theory if and only if we have to
extend this to 64-bit machines someday:  I'm too old to wait for a brute
search of a space with 2**64 elements <wink>.

mathematics-is-a-battle-against-mortality-ly y'rs  - tim

RE: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)

Tim Peters