Re: [Patches] fix float_hash and complex_hash for 64-bit *nix

On Wed, May 10, 2000 at 12:53:16AM -0400, Tim Peters wrote:
[Trent Mick]
Discussion:
Okay, it is debatable to call float_hash and complex_hash broken, but their code presumed that sizeof(long) was 32 bits. As a result, the hash values for float and complex objects were not the same on a 64-bit *nix system as on a 32-bit *nix system. With this patch they are.
The goal is laudable but the analysis seems flawed. For example, this new comment:
Firstly, I should have admitted my ignorance with regards to hash functions.
Looks to me like the real problem in the original was here:
    x = hipart + (long)fractpart + (long)intpart + (expo << 15);
                                   ^^^^^^^^^^^^^
The difficulty is that intpart may *not* fit in 32 bits, so the cast of intpart to long is ill-defined when sizeof(long) == 4.
That is, the hash function truly is broken for "large" values with a fractional part, and I expect your after-patch code suffers the same problem:
Yes it did.
The solution to this is to break intpart in this branch into pieces no larger than 32 bits too.
Okay here is another try (only for floatobject.c) for discussion. If it looks good then I will submit a patch for float and complex objects. So do the same for 'intpart' as was done for 'fractpart'.

    static long
    float_hash(v)
        PyFloatObject *v;
    {
        double intpart, fractpart;
        long x;

        fractpart = modf(v->ob_fval, &intpart);

        if (fractpart == 0.0) {
            /* ... snip ... */
        }
        else {
            int expo;
            long hipart;

            fractpart = frexp(fractpart, &expo);
            fractpart = fractpart * 2147483648.0;
            hipart = (long)fractpart;
            fractpart = (fractpart - (double)hipart) * 2147483648.0;
            x = hipart + (long)fractpart + (expo << 15); /* combine the fract parts */

            intpart = frexp(intpart, &expo);
            intpart = intpart * 2147483648.0;
            hipart = (long)intpart;
            intpart = (intpart - (double)hipart) * 2147483648.0;
            x += hipart + (long)intpart + (expo << 15); /* add in the int parts */
        }

        if (x == -1)
            x = -2;
        return x;
    }
Note this consequence under the Win32 Python:
With this change, on Linux32:
    >>> base = 2.**40 + 0.5
    >>> base
    1099511627776.5
    >>> for i in range(32, 45):
    ...     x = base + 2.**i
    ...     print x, hash(x)
    ...
    1.10380659507e+12 -2141945856
    1.10810156237e+12 -2137751552
    1.11669149696e+12 -2129362944
    1.13387136614e+12 -2112585728
    1.16823110451e+12 -2079031296
    1.23695058125e+12 -2011922432
    1.37438953472e+12 -1877704704
    1.64926744166e+12 -1609269248
    2.19902325555e+12 -2146107392
    3.29853488333e+12 -1609236480
    5.49755813888e+12 -1877639168
    9.89560464998e+12 -2011824128
    1.86916976722e+13 -2078900224
On Linux64:
    >>> base = 2.**40 + 0.5
    >>> base
    1099511627776.5
    >>> for i in range(32, 45):
    ...     x = base + 2.**i
    ...     print x, hash(x)
    ...
    1.10380659507e+12 2153021440
    1.10810156237e+12 2157215744
    1.11669149696e+12 2165604352
    1.13387136614e+12 2182381568
    1.16823110451e+12 2215936000
    1.23695058125e+12 2283044864
    1.37438953472e+12 2417262592
    1.64926744166e+12 2685698048
    2.19902325555e+12 2148859904
    3.29853488333e+12 2685730816
    5.49755813888e+12 2417328128
    9.89560464998e+12 2283143168
    1.86916976722e+13 2216067072
-- and that should also fix your 64-bit woes "by magic".
As you can see it did not, but for another reason. The summation of the parts overflows 'x'. Is this a problem? I.e., does it matter if a hash function returns an overflowed integral value (my hash function ignorance is showing)? And if this does not matter, does it matter that a hash returns different values on different platforms?
a hash function should never ignore any bit in its input.
Which brings up a question regarding instance_hash(), func_hash(), meth_hash(), HKEY_hash() [or whatever it is called], and others which cast a pointer to a long (discarding the upper half of the pointer on Win64). Do these really need to be fixed? Am I nitpicking too much on this whole thing?

Thanks,
Trent

--
Trent Mick
trentm@activestate.com

[Trent Mick]
... Okay here is another try (only for floatobject.c) for discussion. If it looks good then I will submit a patch for float and complex objects. So do the same for 'intpart' as was done for 'fractpart'.
    static long
    float_hash(v)
        PyFloatObject *v;
    {
        double intpart, fractpart;
        long x;

        fractpart = modf(v->ob_fval, &intpart);

        if (fractpart == 0.0) {
            /* ... snip ... */
        }
        else {
            int expo;
            long hipart;

            fractpart = frexp(fractpart, &expo);
            fractpart = fractpart * 2147483648.0;
It's OK to use "*=" in C <wink>. Would like a comment that this is 2**31 (which makes the code obvious <wink> instead of mysterious). A comment block at the top would help too, like

    /* Use frexp to get at the bits in intpart and fractpart.
     * Since the VAX D double format has 56 mantissa bits, which is the
     * most of any double format in use, each of these parts may have as
     * many as (but no more than) 56 significant bits.
     * So, assuming sizeof(long) >= 4, each part can be broken into two longs;
     * frexp and multiplication are used to do that.
     * Also, since the Cray double format has 15 exponent bits, which is the
     * most of any double format in use, shifting the exponent field left by
     * 15 won't overflow a long (again assuming sizeof(long) >= 4).
     */

And this code has gotten messy enough that it's probably better to pkg it in a utility function rather than duplicate it. Another approach would be to play with the bits directly, via casting tricks. But then you have to wrestle with platform crap like endianness.
            hipart = (long)fractpart;
            fractpart = (fractpart - (double)hipart) * 2147483648.0;

            x = hipart + (long)fractpart + (expo << 15); /* combine the fract parts */

            intpart = frexp(intpart, &expo);
            intpart = intpart * 2147483648.0;
            hipart = (long)intpart;
            intpart = (intpart - (double)hipart) * 2147483648.0;

            x += hipart + (long)intpart + (expo << 15); /* add in the int parts */
There's no point adding in (expo << 15) a second time.
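A minimal sketch of that utility-function idea, under the thread's assumptions (sizeof(long) >= 4, and 2147483648.0 being 2**31); the helper name float_part_hash is made up here and is not from any posted patch:

    #include <math.h>

    /* Fold one part of a double (intpart or fractpart) into a long.
     * frexp() reduces the part to [0.5, 1.0) (or 0.0), so each multiply
     * by 2147483648.0 (2**31) exposes at most 31 more bits and no cast
     * below can overflow a 32-bit long.
     */
    static long
    float_part_hash(double part)
    {
        int expo;
        long hipart;

        part = frexp(part, &expo);
        part = part * 2147483648.0;                     /* 2**31 */
        hipart = (long)part;                            /* first 31 bits */
        part = (part - (double)hipart) * 2147483648.0;  /* next 31 bits */
        return hipart + (long)part + (expo << 15);
    }

The fractional branch of float_hash could then combine the two results with something like x = float_part_hash(fractpart) + float_part_hash(intpart); whether the (expo << 15) term belongs inside the helper or is added only once by the caller (per the remark just above) is a detail a real patch would have to settle.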
With this change, on Linux32: ...
    >>> base = 2.**40 + 0.5
    >>> base
    1099511627776.5
    >>> for i in range(32, 45):
    ...     x = base + 2.**i
    ...     print x, hash(x)
    ...
    1.10380659507e+12 -2141945856
    1.10810156237e+12 -2137751552
    1.11669149696e+12 -2129362944
    1.13387136614e+12 -2112585728
    1.16823110451e+12 -2079031296
    1.23695058125e+12 -2011922432
    1.37438953472e+12 -1877704704
    1.64926744166e+12 -1609269248
    2.19902325555e+12 -2146107392
    3.29853488333e+12 -1609236480
    5.49755813888e+12 -1877639168
    9.89560464998e+12 -2011824128
    1.86916976722e+13 -2078900224
On Linux64:
    >>> base = 2.**40 + 0.5
    >>> base
    1099511627776.5
    >>> for i in range(32, 45):
    ...     x = base + 2.**i
    ...     print x, hash(x)
    ...
    1.10380659507e+12 2153021440
    1.10810156237e+12 2157215744
    1.11669149696e+12 2165604352
    1.13387136614e+12 2182381568
    1.16823110451e+12 2215936000
    1.23695058125e+12 2283044864
    1.37438953472e+12 2417262592
    1.64926744166e+12 2685698048
    2.19902325555e+12 2148859904
    3.29853488333e+12 2685730816
    5.49755813888e+12 2417328128
    9.89560464998e+12 2283143168
    1.86916976722e+13 2216067072
-- and that should also fix your 64-bit woes "by magic".
As you can see it did not, but for another reason.
I read your original complaint as being that hash(double) yielded different results between two *64*-bit platforms (Linux64 vs Win64), but what you showed above appears to be a comparison between a 64-bit platform and a 32-bit platform, where presumably sizeof(long) is 8 on the former but 4 on the latter. If so, of *course* results may be different: hash returns a C long, and they're different sizes across these platforms. In any case, the results above aren't really different!
    >>> hex(-2141945856)   # 1st result from Linux32
    '0x80548000'
    >>> hex(2153021440L)   # 1st result from Linux64
    '0x80548000L'
That is, the bits are the same. How much more do you want from me <wink>?
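For the record, the same identity can be seen directly in C; a standalone illustration, assuming a typical LP64 box (32-bit int, 64-bit long), where the two printed values are the first results from the Linux32 and Linux64 sessions above:

    #include <stdio.h>

    int main(void)
    {
        unsigned long bits = 0x80548000UL;  /* the shared 32-bit pattern */

        /* Reinterpreted as a signed 32-bit value (implementation-defined,
         * but every mainstream compiler wraps): the Linux32 hash. */
        printf("%ld\n", (long)(int)bits);   /* -2141945856 */

        /* Kept as a 64-bit long: the Linux64 hash. */
        printf("%ld\n", (long)bits);        /* 2153021440 */
        return 0;
    }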
The summation of the parts overflows 'x'. Is this a problem? I.e., does it matter if a hash function returns an overflowed integral value (my hash function ignorance is showing)?
Overflow generally doesn't matter. In fact, it's usual <wink>; e.g., the hash for strings iterates over

    x = (1000003*x) ^ *p++;

and overflows madly. The saving grace is that C defines integer overflow in such a way that losing the high bits on every operation yields the same result as if the entire result were computed to infinite precision and the high bits tossed only at the end. So overflow doesn't stop this from being as reproducible as possible, given that Python's int size differs across platforms. Overflow can be avoided by using xor instead of addition, but addition is generally preferred because it helps to "scramble" the bits a little more.
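As a self-contained illustration of that loop (written from the description above, not copied from CPython's string hash):

    /* Hash len bytes the way the string hash does: the multiply overflows
     * a long on nearly every step, and the surviving low bits are the same
     * as if the whole product had been computed exactly and truncated at
     * the end.
     */
    static long
    string_style_hash(const unsigned char *p, long len)
    {
        long x;
        long i;

        if (len == 0)
            return 0;
        x = p[0] << 7;
        for (i = 0; i < len; i++)
            x = (1000003 * x) ^ p[i];
        x ^= len;
        if (x == -1)    /* -1 is reserved to signal an error */
            x = -2;
        return x;
    }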
And if this does not matter, does it matter that a hash returns different values on different platforms?
No, and it doesn't always stay the same from release to release on a single platform. For example, your patch above will change hash(double) on Win32!
a hash function should never ignore any bit in its input.
Which brings up a question regarding instance_hash(), func_hash(), meth_hash(), HKEY_hash() [or whatever it is called], and others which cast a pointer to a long (discarding the upper half of the pointer on Win64). Do these really need to be fixed? Am I nitpicking too much on this whole thing?
I have to apologize (although only semi-sincerely) for not being meaner about this when I did the first 64-bit port. I did that for my own use, and avoided the problem areas rather than fix them. But unless a language dies, you end up paying for every hole in the end, and the sooner they're plugged the less it costs. That is, no, you're not nitpicking too much! Everyone else probably thinks you are <wink>, *but*, they're not running on 64-bit platforms yet so these issues are still invisible to their gut radar. I'll bet your life that every hole remaining will trip up an end user eventually -- and they're the ones least able to deal with the "mysterious problems".
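One way such a hole could be plugged, sketched here with a made-up helper (not an existing CPython function): fold the whole pointer into the result instead of letting a bare (long) cast drop the top half on Win64.

    #include <stddef.h>

    /* Hash a pointer without ignoring its upper bits.  The right shift is
     * done in two half-width steps so it stays well-defined even where
     * sizeof(void *) == sizeof(long); a single full-width shift would not be.
     */
    static long
    pointer_hash(void *p)
    {
        size_t v = (size_t)p;
        long x = (long)v;    /* low bits (wraps where long is narrower) */

        x ^= (long)((v >> (4 * sizeof(long))) >> (4 * sizeof(long)));
        if (x == -1)
            x = -2;
        return x;
    }

On Win64 (sizeof(void *) == 8, sizeof(long) == 4) the xor folds in the upper 32 bits of the pointer; where pointer and long are the same size the folded term is 0 and the function degenerates to the plain cast.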

I have to admit I have no clue about the details of this debate any more, and I'm cowardly awaiting a patch submission that Tim approves of. (I'm hoping a day will come when Tim can check it in himself. :-)

In the mean time, I'd like to emphasize the key invariant here: we must ensure that (a==b) => (hash(a)==hash(b)).

One quick way to deal with this could be the following pseudo C:

    PyObject *double_hash(double x)
    {
        long l = (long)x;
        if ((double)l == x)
            return long_hash(l);
        ...double-specific code...
    }

This code makes one assumption: that if there exists a long l equal to a double x, the cast (long)x should yield l...

--Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido]
I have to admit I have no clue about the details of this debate any more,
Na, there's no debate here. I believe I confused things by misunderstanding what Trent's original claim was (sorry, Trent!), but we bumped into real flaws in the current hash anyway (even on 32-bit machines). I don't think there's any actual disagreement about anything here.
and I'm cowardly awaiting a patch submission that Tim approves of.
As am I <wink>.
(I'm hoping a day will come when Tim can check it in himself. :-)
Well, all you have to do to make that happen is get a real job and then hire me <wink>.
In the mean time, I'd like to emphasize the key invariant here: we must ensure that (a==b) => (hash(a)==hash(b)).
Absolutely. That's already true, and is so non-controversial that Trent elided ("...") the code for that in his last post.
One quick way to deal with this could be the following pseudo C:
    PyObject *double_hash(double x)
    {
        long l = (long)x;
        if ((double)l == x)
            return long_hash(l);
        ...double-specific code...
    }
This code makes one assumption: that if there exists a long l equal to a double x, the cast (long)x should yield l...
No, that fails on two counts:

1. If x is "too big" to fit in a long (and a great many doubles are), the cast to long is undefined. Don't know about all current platforms, but on the KSR platform such casts raised a fatal hardware exception. The current code already accomplishes this part in a safe way (which Trent's patch improves by using a symbol instead of the current hard-coded hex constant).

2. The key invariant needs to be preserved also when x is an exact integral value that happens to be (possibly very!) much bigger than a C long; e.g.,
    >>> long(1.23e300)   # 1.23e300 is an integer! albeit not the one you think
    12299999999999999456195024356787918820614965027709909500456844293279
    60298864608335541984218516600989160291306221939122973741400364055485
    57167627474369519296563706976894811817595986395177079943535811102573
    51951343133141138298152217970719263233891682157645730823560232757272
    73837119288529943287157489664L
    >>> hash(1.23e300) == hash(_)
    1
The current code already handles that correctly too. All the problems occur when the double has a non-zero fractional part, and Trent knows how to fix that now. hash(x) may differ across platforms because sizeof(long) differs across platforms, but that's just as true of strings as floats (i.e., Python has never computed platform-independent hashes -- if that bothers *you* (doesn't bother me), that's the part you should chime in on).
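For completeness, a sketch of the guard behind point 1 above: test the range in floating point before ever performing the cast. LONG_MIN/LONG_MAX stand in here for whatever symbol the actual patch uses.

    #include <limits.h>

    /* Nonzero if the integral double d can be cast to long without
     * invoking undefined behaviour.  The comparison happens in double,
     * so the dangerous cast itself is never executed.  (When long is
     * 64 bits, (double)LONG_MAX rounds up, so a real patch must treat
     * the upper boundary more carefully than this sketch does.)
     */
    static int
    double_fits_in_long(double d)
    {
        return d >= (double)LONG_MIN && d <= (double)LONG_MAX;
    }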
participants (3):
- Guido van Rossum
- Tim Peters
- Trent Mick