[Python-Dev] String hash function multiplier
Bob Ippolito
bob at redivi.com
Tue Apr 13 21:30:28 EDT 2004
On Apr 13, 2004, at 9:09 PM, Jeff Epler wrote:
> With -O2 -mcpu=i686 or newer, gcc uses "imul" for both 100003 and
> 65599,
> rather than shifts and adds.
>
> There may be a few people who care about some other processor, but I
> wouldn't listen to them. (the only non-x86 CPU I program for on a
> weekly
> basis doesn't have hardware multiply, but it's also much too small for
> Python)
>
> The current value goes back a long way:
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/
> stringobject.c#rev2.31
> ... all the way back to when Python did string haching instead of
> hashing.
>
> Other than some abstract beauty to 65599, are there some other
> practical
> advantages I'm missing?
It's not expected that GCC optimize an integer constant into shifts on
its own. Anyways, a practical advantage is that with a sane
instruction set, like PPC, it saves you a memory access or some
instructions (depending on the compiler I guess). Both 100003 and
65599 are too big to be immediate values in a PPC instruction, but the
shift constants are not.
I guess the real question for Raymond is, does it really make a
measurable difference? And what effect does it have on pickled dicts
(or other such hash-using data structures), if any?
-bob
More information about the Python-Dev
mailing list