[Tutor] int(1.99...99) = 1 and can = 2
Steven D'Aprano
steve at pearwood.info
Sun May 1 06:43:38 EDT 2016
On Sun, May 01, 2016 at 01:02:50AM -0500, boB Stepp wrote:
> Life has kept me from Python studies since March, but now I resume.
> Playing around in the interpreter I tried:
>
> py3: 1.9999999999999999
> 2.0
> py3: 1.999999999999999
> 1.999999999999999
Correct. Python floats are 64-bit binary floats, which means in
practice that they can carry about 15-17 significant figures in
decimal.
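We can ask Python for those limits directly. A quick sketch using the standard sys module (not part of the original discussion):

```python
import sys

# A Python float is a C double: 53 bits of binary precision,
# which translates to 15 guaranteed decimal digits.
print(sys.float_info.mant_dig)  # 53 bits in the significand
print(sys.float_info.dig)       # 15 decimal digits always preserved
```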
Starting with Python 2.6, floats have "hex" and "fromhex" methods which
allow you to convert them to and from base 16, which is more compact
than the base 2 used internally but otherwise equivalent.
https://docs.python.org/2/library/stdtypes.html#float.hex
So here is your second example, shown in hex so we can get a better
idea of the internal details:
py> (1.999999999999999).hex()
'0x1.ffffffffffffbp+0'
The "p+0" at the end shows the exponent, as a power of 2. (It can't use
"e" or "E" like decimal, because that would be confused with the hex
digit "e".)
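As a small sketch, note that the hex form round-trips exactly, because base 16 is just a compact respelling of the float's internal base-2 value:

```python
# hex() -> fromhex() is always an exact round trip, unlike
# conversion through many decimal strings.
x = 1.999999999999999
s = x.hex()                    # '0x1.ffffffffffffbp+0'
assert float.fromhex(s) == x   # exact round trip
```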
You can see that the last hex digit is "b". If we add an extra digit to
the end of the decimal 1.999999999999999, that final digit increases
until we reach:
py> (1.9999999999999997).hex()
'0x1.fffffffffffffp+0'
1.9999999999999998 also gives us the same result. More on this later.
If we increase the final decimal digit one more, we get:
py> (1.9999999999999999).hex()
'0x1.0000000000000p+1'
which is equal to decimal 2: a mantissa of 1 in hex, an exponent of 1
in decimal, which gives 1*2**1 = 2.
Given that we only have 64 bits for a float, and some of them are used
for the exponent and the sign, it is inevitable that conversions to and
from decimal must sometimes be inexact. Remember that I mentioned that
both 1.9999999999999997 and 1.9999999999999998 are treated as the same
float? That is because a 64-bit binary float does not have enough bits
of precision to distinguish them. You would need more than 64 bits to
tell them apart. And so, following the IEEE-754 standard (the best
practice for floating point arithmetic), both numbers are rounded to the
nearest possible float.
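You can verify that the two decimal literals round to the identical float:

```python
# Both literals are rounded to the same nearest 64-bit float,
# so they compare equal even though the decimal strings differ.
a = 1.9999999999999997
b = 1.9999999999999998
print(a == b)              # True: the same float value
print(a.hex() == b.hex())  # True: identical bit patterns
```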
Why the nearest possible float? Because any other choice, such as
"always round down", or "always round up", or "round up on Tuesdays",
will have *larger* rounding errors. Rounding errors are inescapable, but
we can do what we can to keep them as small as possible. So, decimal
strings like 1.999...97 generate the binary float with the smallest
possible error.
(In fact, the IEEE-754 standard requires that the rounding mode be
user-configurable. Unfortunately, most C maths libraries do not provide
that functionality, or if they do, it is not reliable.)
A diagram might help make this more clear. This ASCII art is best viewed
using a fixed-width font like Courier.
Suppose we look at every single float between 1 and 2. Since they use
a finite number of bits, there are a finite number of equally spaced
floats between any two consecutive whole numbers. But because they are
in binary, not decimal, they won't match up with decimal floats except
for numbers like 0.5, 0.25 etc. So:
1 _____ | _____ | _____ | _____ | ... | _____ | _____ | _____ 2
---------------------------------------------------^----^---^
a b c
The first arrow ^ marked as "a" represents the true position of
1.999...97 and the second, "b", represents the true position of
1.999...98. Since they don't line up exactly with the binary float
0x1.ffff....ff, there is some rounding error, but it is the smallest
error possible.
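As a sketch of how fine that spacing is, we can measure the gap between 2.0 and the float just below it, using only fromhex and subtraction:

```python
# The largest float below 2.0 has a significand of all ones:
below_two = float.fromhex('0x1.fffffffffffffp+0')

# The gap between consecutive floats in [1, 2) is 2**-52,
# about 2.2e-16 -- hence roughly 16 significant decimal digits.
gap = 2.0 - below_two
print(gap)                # 2.220446049250313e-16
print(gap == 2**-52)      # True: the subtraction here is exact
```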
The third arrow, marked as "c", represents 1.999...99.
> py3: int(1.9999999999999999)
> 2
> py3: int(1.999999999999999)
> 1
The int() function always truncates towards zero. So in the first case,
your float starts off as 2.0 (as seen above), and then int() truncates
it to 2. The second case starts off with the float 1.999999999999999,
which is
'0x1.ffffffffffffbp+0'
and which int() then truncates to 1.
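A sketch showing that the rounding happens before int() ever sees the value:

```python
# The literal 1.9999999999999999 is rounded to 2.0 at parse time,
# so int() receives 2.0 and "truncates" it to 2.
assert int(1.9999999999999999) == 2

# The largest float strictly below 2.0 genuinely truncates to 1.
below_two = float.fromhex('0x1.fffffffffffffp+0')
assert int(below_two) == 1
```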
> It has been many years since I did problems in converting decimal to
> binary representation (Shades of two's-complement!), but I am under
> the (apparently mistaken!) impression that in these 0.999...999
> situations that the floating point representation should not go "up"
> in value to the next integer representation.
In ancient days, by which I mean before the 1980s, there was
no agreement on how floats should be rounded by computer manufacturers.
Consequently they all used their own rules, which contradicted the rules
used by other manufacturers, and sometimes even their own. But in the
early 80s, a consortium of companies including Apple, Intel and others
got together and agreed on best practices (give or take a few
compromises) for computer floating point maths. One of those is that the
default rounding mode should be round to nearest, so as to minimize the
errors. Otherwise, if you always round down, then errors accumulate
faster.
We can test this with the fractions and decimal modules:
py> from fractions import Fraction
py> f = Fraction(0)
py> for i in range(1, 100):
... f += Fraction(1)/i
...
py> f
Fraction(360968703235711654233892612988250163157207,
69720375229712477164533808935312303556800)
py> float(f)
5.17737751763962
So that tells us the exact result of adding the reciprocals of 1 through
99, and the nearest binary float. Now let's do it again, only this time
with limited precision:
py> from decimal import *
py> d = Decimal(0)
py> with localcontext() as ctx:
... ctx.prec = 5
... for i in range(1, 100):
... d += Decimal(1)/i
...
py> d
Decimal('5.1773')
That's not too bad: four out of the five significant figures are
correct, and the fifth is only off by one. (It should be 5.1774 if we
added exactly, and rounded only at the end.) But if we change to always
round down:
py> d = Decimal(0)
py> with localcontext() as ctx:
... ctx.prec = 5
... ctx.rounding = ROUND_DOWN
... for i in range(1, 100):
... d += Decimal(1)/i
...
py> d
Decimal('5.1734')
we're now way off: only three significant figures are correct, and the
fourth is off by 4.
Obviously this is an extreme case, for demonstration purposes only. But
the principle is the same for floats: the IEEE-754 promise that
simple arithmetic is correctly rounded ensures that errors are as small
as possible.
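We can see the same effect in pure float arithmetic. As an illustrative sketch: math.fsum tracks the exact intermediate sum and rounds only once at the end, while a plain running sum rounds at every step:

```python
import math

values = [0.1] * 10
print(sum(values))        # 0.9999999999999999 -- ten roundings accumulate
print(math.fsum(values))  # 1.0 -- exact sum, rounded once at the end
```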
--
Steve