[Python-Dev] Nondeterministic long-to-float coercion

Thu Oct 19 23:28:27 CEST 2006

[Raymond Hettinger]
> My colleague got an odd result today that is reproducible on his build
> of Python (RedHat's distribution of Py2.4.2) but not any other builds
> I've checked (including an Ubuntu Py2.4.2 built with a later version of
> GCC).  I hypothesized that this was a bug in the underlying GCC
> libraries, but the magnitude of the error is so large that that seems
> implausible.
>
> Does anyone have a clue what is going-on?
>
> Python 2.4.2 (#1, Mar 29 2006, 11:22:09) [GCC 4.0.2 20051125 (Red Hat
> 4.0.2-8)] on linux2 Type "help", "copyright", "credits" or "license" for
> more information.
> >>> set(-19400000000 * (1/100.0) for i in range(10000))
> set([-194000000.0, -193995904.0, -193994880.0])

Note that the Hamming distance between -194000000.0 and -193995904.0
is 1, and ditto between -193995904.0 and -193994880.0, when viewed as
IEEE-754 doubles.  That is, -193995904.0 is "missing a bit" from
-194000000.0, and -193994880.0 is missing the same bit plus an
additional bit.  Maybe clearer, writing a function to show the hex
little-endian representation:

>>> import binascii, struct
>>> def ashex(d):
...     return binascii.hexlify(struct.pack("<d", float(d)))
...
>>> ashex(-194000000)
'000000006920a7c1'
>>> ashex(-193995904)   # "the 2 bit" from "6" is missing, leaving 4
'000000004920a7c1'
>>> ashex(-193994880)   # and "the 8 bit" from "9" is missing, leaving 1
'000000004120a7c1'
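
The Hamming distances claimed above can also be checked mechanically.
The following is only a sketch, written for Python 3 rather than the 2.4
session shown; the helper names double_bits and hamming are made up for
illustration and are not from any library.

```python
import struct

def double_bits(d):
    """Return the 64-bit IEEE-754 pattern of d as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", float(d)))[0]

def hamming(a, b):
    """Count the bit positions in which the doubles a and b differ."""
    return bin(double_bits(a) ^ double_bits(b)).count("1")

print(hamming(-194000000.0, -193995904.0))   # 1
print(hamming(-193995904.0, -193994880.0))   # 1
print(hamming(-194000000.0, -193994880.0))   # 2
```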

More than anything else that suggests flaky memory, or "weak bits" in
a HW register or CPU<->FPU path.  IOW, it looks like a hardware
problem to me.

Note that the missing bits here don't coincide with a "natural"
software boundary -- screwing up a bit "in the middle of" a byte isn't
something software is prone to do.
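
To make "in the middle of a byte" concrete, here is a rough Python 3
sketch that lists which bits differ between two doubles and where they
fall within a byte.  The names double_bits and differing_bits are
hypothetical helpers for this illustration only.

```python
import struct

def double_bits(d):
    """Return the 64-bit IEEE-754 pattern of d as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", float(d)))[0]

def differing_bits(a, b):
    """Return the bit indexes (0 = least significant) where a and b differ."""
    x = double_bits(a) ^ double_bits(b)
    return [i for i in range(64) if (x >> i) & 1]

for a, b in [(-194000000.0, -193995904.0), (-193995904.0, -193994880.0)]:
    for bit in differing_bits(a, b):
        # byte index counts from the least significant byte, which is
        # also the byte's position in the little-endian "<d" packing
        byte, within = divmod(bit, 8)
        print("bit %d = byte %d, bit %d of that byte" % (bit, byte, within))
```

Both flipped bits land in byte 4 (the 0x69 byte), at positions 5 and 3
within it, so neither sits at the edge of a byte.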

You could try different inputs and see whether the same bits "go
missing", e.g. starting with a double with a lot of 1 bits lit.  Might
also try using these as keys to a counting dict to see how often they
go missing.
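
A minimal sketch of that experiment, in Python 3, might look like the
following.  The input 2**53 - 1 is my own pick for "a lot of 1 bits"
(its significand is all ones), and the names pattern, tally, N and
EXPECTED are invented for this example.  The expected pattern is built
with integer arithmetic only, so it doesn't depend on the float path
under suspicion.

```python
import struct
from collections import Counter

def pattern(d):
    """Return the 64-bit IEEE-754 pattern of the float d as an integer."""
    return struct.unpack("<Q", struct.pack("<d", d))[0]

# Assumed test input: exactly representable, all 52 fraction bits set.
N = 2**53 - 1
# Biased exponent 1075 (= 1023 + 52) plus an all-ones fraction field.
EXPECTED = (1075 << 52) | (2**52 - 1)

def tally(trials=10000):
    """Repeat the int-to-float coercion and count what comes back."""
    seen = Counter()      # big-endian hex pattern -> times observed
    dropped = Counter()   # bit index -> times that bit went missing
    for _ in range(trials):
        p = pattern(N * 1.0)          # int * float forces the coercion
        seen["%016x" % p] += 1
        lost = EXPECTED & ~p          # bits expected to be 1 but read as 0
        for i in range(64):
            if (lost >> i) & 1:
                dropped[i] += 1
    return seen, dropped

seen, dropped = tally()
print(seen)
print(dropped)
```

On a healthy box, seen should have a single key and dropped should be
empty.  Note the keys here are ordinary big-endian hex, not the
little-endian byte order that ashex() prints.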