
Again... Tim Peters wrote:
Sounds good to me! It's a very cheap way to get the high bits into play.
...
[Christian]
- The bits used from the string hash are not well distributed; using a "warmup wheel" on the hash to suck all bits in gives the same quality of hashes as random numbers.
See above and be very cautious: none of Python's hash functions produces well-distributed bits, and, in effect, that's why Python dicts often perform "better than random" on common data. Even what you've done so far appears to give marginally worse statistics for Guido's favorite kind of test case, "worse" in two senses: total number of collisions (a measure of amortized lookup cost), and maximum collision chain length (a measure of worst-case lookup cost):
    d = {}
    for i in range(N):
        d[repr(i)] = i
I will look into this.
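A minimal sketch of how these two numbers can be reproduced with the Dictionary class from dictest.py attached below (trips counts probes beyond the first slot, tripmax the longest such probe chain; getTrips() returns both and resets the counters):

    from dictest import Dictionary

    d = Dictionary()              # 0 = the original rehash algorithm
    for i in range(1000):
        d[repr(i)] = i            # Guido's favorite kind of test case
    d.getTrips()                  # discard counts accumulated while filling
    for key in d.keys():
        d[key]                    # look every key up once
    trips, tripmax = d.getTrips()
    print "collisions=%d  longest chain=%d" % (trips, tripmax)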
check-in-one-thing-then-let-it-simmer-ly y'rs - tim
Are you saying I should check the thing in? Really?

In another reply to this message I was saying

"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist. Well, all inverses exist, but it is an error to think that they can produce similar bit patterns. Changing the root means changing the whole system, since we have just a *representation* of a group, via polynomial coefficients. A simple example which renders my thought useless is this: there is no general pattern that can turn a physical right shift into a left shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today's with the same "cheaper than random" properties?

OK, we have to stick with the given polynomials to stay compatible, and we also have to shift left. How do we then rotate the random bits in? Well, we can in fact do a rotation of the whole index, moving the highest bit into the lowest. Too bad that this isn't supported in C; it is a native machine instruction on X86 machines. We would then have:

    incr = ROTATE_LEFT(incr, 1)
    if (incr > mask):
        incr = incr ^ mp.ma_poly

(A portable way to write ROTATE_LEFT without a native instruction is sketched after the results below.)

The effect is similar to the "old" algorithm: bits are shifted left. Only if the hash happens to have high bits do they appear in the modulus. In the current "faster than random" cases, I assume that high bits in the hash are less likely than low bits, so it is more likely that an entry finds its good place in the dict before bits are rotated in. Hence the "good" cases would be kept.

I did all tests again, now including maximum trip length, and added a "rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings          old=293/9       new=302/7     new2=221/7    rot=278/5
trips for bad integers     old=499500/999  new=13187/31  new2=999/1    rot=16754/31
trips for random integers  old=360/8       new=369/8     new2=358/6    rot=356/7
trips for windows names    old=230/5       new=207/7     new2=200/5    rot=225/5
N=2000
trips for strings          old=1093/11     new=1109/10   new2=786/6    rot=1082/8
trips for bad integers     old=0/0         new=26455/32  new2=1999/1   rot=33524/34
trips for random integers  old=704/7       new=686/8     new2=685/7    rot=693/7
trips for windows names    old=503/8       new=542/9     new2=564/6    rot=529/7
N=3000
trips for strings          old=810/5       new=839/6     new2=609/5    rot=796/5
trips for bad integers     old=0/0         new=38681/36  new2=2999/1   rot=49828/38
trips for random integers  old=708/5       new=723/7     new2=724/5    rot=722/6
trips for windows names    old=712/6       new=711/5     new2=691/5    rot=738/9
N=4000
trips for strings          old=1850/9      new=1843/8    new2=1375/11  rot=1848/10
trips for bad integers     old=0/0         new=52994/39  new2=3999/1   rot=66356/38
trips for random integers  old=1395/9      new=1397/8    new2=1435/9   rot=1394/13
trips for windows names    old=1449/8      new=1434/8    new2=1457/11  rot=1513/9
D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases. Random integers seem to withstand any of these procedures. For bad integers, rot naturally takes more trips than new, since the path to the bits is longer.

All in all I don't see more than marginal differences between the approaches, and I tend to stick with "new", since it is the cheapest to implement: it does not cost anything, and might even be a little cheaper for some compilers, since it does not reference the mask variable.
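For completeness, a sketch of one portable way to express the one-bit ROTATE_LEFT above without a native rotate; many compilers recognize the equivalent expression on an unsigned 32-bit integer and emit a single rotate instruction:

    def rotate_left_1(x):
        # 32-bit rotate left by one: the old high bit becomes the new low bit
        return ((x << 1) | ((x >> 31) & 1)) & 0xffffffffL

The rotleft branch in dictest.py below spells the same idea out by hand, testing the 0x80000000 bit before shifting.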
I'd say let's do the patch.

ciao - chris

--
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43,
    32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39,
    1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9,
    536870912 + 5, 1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL:
    pass

class Dictionary:
    dummy = "<dummy key>"

    def __init__(mp, newalg=0):
        # newalg: 0 = original rehash, 1 = "new" (division) rehash,
        #         2 = "new" with warmup wheel, 3 = rotate-left variant
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg==2
        mp.rotleft = newalg==3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL
###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffL
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffL ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2 >> 1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)
###### TO HERE
        if (not incr):
            incr = mask
        triplen = 0
        while 1:
            mp.trips = mp.trips+1
            triplen = triplen+1
            if triplen > mp.tripmax: mp.tripmax = triplen
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)) :
                return ep
            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr & 0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << shift        # keys whose low bits are all zero
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n: keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME = 1

if __name__ == "__main__":

    for N in (1000, 2000, 3000, 4000):
        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)
        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" % N
        print ("trips for strings          "+fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)))
        #print ("trips for bin strings      "+fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)))
        print ("trips for bad integers     "+fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers  "+fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names    "+fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):

D:\crml_doc\platf\py>python dictest.py
timing for strings        old=5.097    new=5.088
timing for bad integers   old=101.540  new=12.610

Results with a shift of 16 (EXTREME=1):

D:\crml_doc\platf\py>python dictest.py
timing for strings        old=5.218    new=5.147
timing for bad integers   old=571.210  new=19.220
"""
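A quick sketch to convince oneself that the "new" (division) rehash step really cycles through all of GF(2^n)-{0}, using the n=4 polynomial 16 + 3 from the table above:

    def cycle(poly, n):
        # apply the division step until the increment repeats
        mask = (1 << n) - 1
        incr = 1
        seen = []
        while incr not in seen:
            seen.append(incr)
            if incr & 1:
                incr = incr ^ poly
            incr = incr >> 1
        return seen

    print len(cycle(16 + 3, 4))     # 15 == 2**4 - 1, every nonzero increment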