[Python-Dev] String hash function multiplier

Bob Ippolito bob at redivi.com
Tue Apr 13 21:30:28 EDT 2004


On Apr 13, 2004, at 9:09 PM, Jeff Epler wrote:

> With -O2 -mcpu=i686 or newer, gcc uses "imul" for both 100003 and  
> 65599,
> rather than shifts and adds.
>
> There may be a few people who care about some other processor, but I
> wouldn't listen to them. (the only non-x86 CPU I program for on a  
> weekly
> basis doesn't have hardware multiply, but it's also much too small for
> Python)
>
> The current value goes back a long way:
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/ 
> stringobject.c#rev2.31
> ... all the way back to when Python did string haching instead of
> hashing.
>
> Other than some abstract beauty to 65599, are there some other  
> practical
> advantages I'm missing?

It's not expected that GCC optimize an integer constant into shifts on  
its own.  Anyways, a practical advantage is that with a sane  
instruction set, like PPC, it saves you a memory access or some  
instructions (depending on the compiler I guess).  Both 100003 and  
65599 are too big to be immediate values in a PPC instruction, but the  
shift constants are not.

I guess the real question for Raymond is, does it really make a  
measurable difference?  And what effect does it have on pickled dicts  
(or other such hash-using data structures), if any?

-bob




More information about the Python-Dev mailing list