Re: [Python-ideas] Python Numbers as Human Concept Decimal System

8 Mar 2014

[CC back to the list because you posted the same argument there but without
the numerical example, and my working through that might help others
understand your point]

On Fri, Mar 7, 2014 at 9:18 PM, Andrew Barnert wrote:
...
The main point I'm getting at is that by rounding 0.100000000000000012 to
0.1 instead of 0.10000000000000000555..., you're no longer rounding it to
the nearest binary float, but instead to the second nearest
Decimal(repr(binary float)) (since 0.10000000000000002 is closer than
0.1).

OK, let me walk through that carefully. Let's name the exact mathematical
values and assign them to strings:
...
...
...
a = '0.100000000000000012'
b = '0.1000000000000000055511151231257827021181583404541015625'
c = '0.10000000000000002'
Today, Decimal(float(a)) == Decimal(b). Under my proposal,
Decimal(float(a)) == Decimal('0.1'). The difference between float('0.1')
and float(c) is 1 ulp (2**-56), and a is between those, but closer to c;
but it is even closer to b (in the other direction). IOW for the
mathematical values, 0.1 < b < a < c, where a is closer to b than to c. So
if the choices for rounding a would be b or c, b is preferred. So far so
good. (And still good if we replace c with the slightly smaller exact value
of float(c).)
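
For anyone who wants to check this at the prompt, here is a minimal sketch. It
assumes CPython with IEEE 754 doubles, and it only approximates the proposal
by going through repr(); the names today, proposed, ulp and fa/fb/fc are
mine, not part of any API.

```python
from decimal import Decimal
from fractions import Fraction

a = '0.100000000000000012'
b = '0.1000000000000000055511151231257827021181583404541015625'
c = '0.10000000000000002'

# Today: Decimal(float) converts the binary value exactly.
today = Decimal(float(a))
print(today == Decimal(b))

# Assumption: the proposal behaves like Decimal(repr(float)).
proposed = Decimal(repr(float(a)))
print(proposed == Decimal('0.1'))

# Exact gap between the two neighbouring doubles, as a rational number.
ulp = Fraction(float(c)) - Fraction(float('0.1'))
print(ulp == Fraction(1, 2**56))

# Exact ordering of the mathematical values.
fa, fb, fc = Fraction(a), Fraction(b), Fraction(c)
print(Fraction('0.1') < fb < fa < fc)

# a is nearer to b than to c, and also nearer to b than to float(c).
print(fa - fb < fc - fa)
print(fa - fb < Fraction(float(c)) - fa)
```

Every line should print True. Fraction is used so that the comparisons are
done on exact values rather than on rounded floats.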

And your point is that if we change the allowable choices to '0.1' or c, we
find that float(b) == float('0.1'), but a is closer to c than to 0.1. The
distance from a to 0.1 is less than 1 ulp, but more than 0.5 ulp.
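
The same kind of check for the changed set of choices, again as a sketch with
exact rational arithmetic. The names tenth, d_tenth and d_c are mine, and ulp
is taken to be 2**-56, the spacing of doubles near 0.1.

```python
from fractions import Fraction

a = '0.100000000000000012'
b = '0.1000000000000000055511151231257827021181583404541015625'
c = '0.10000000000000002'

tenth = Fraction('0.1')        # the exact decimal value 0.1
ulp = Fraction(1, 2**56)       # spacing of doubles in [2**-4, 2**-3)

# b and '0.1' name the same double.
print(float(b) == float('0.1'))

d_tenth = Fraction(a) - tenth  # distance from a down to 0.1
d_c = Fraction(c) - Fraction(a)  # distance from a up to c

# With '0.1' and c as the only choices, c is the nearer one.
print(d_c < d_tenth)

# Picking 0.1 is off by more than half an ulp but less than a full ulp.
print(ulp / 2 < d_tenth < ulp)
```

All three lines should print True.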

I find the argument intriguing, but I blame it more on what happens in
float(a) than on what Decimal() does to the resulting value. If you
actually had the string a, and wanted to convert it to Decimal, you would
obviously write Decimal(a), not Decimal(float(a)), so this is really only a
problem when someone writes a as a float literal in a program and passes it
to Decimal, i.e. Decimal(0.100000000000000012).
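
A small sketch of the difference between the literal and the quoted form,
showing today's behavior (the variable names from_literal and from_string are
just for illustration):

```python
from decimal import Decimal

a = '0.100000000000000012'

# The float literal is rounded to the nearest double by the compiler,
# before Decimal ever sees it.
from_literal = Decimal(0.100000000000000012)

# The quoted constant reaches Decimal with every digit intact.
from_string = Decimal(a)

print(from_literal)
print(from_string)

# The literal form is the same thing as Decimal(float(a)).
print(from_literal == Decimal(float(a)))
```

The first print shows the full expansion of the double nearest to 0.1, the
second shows the constant exactly as written.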

That's slightly unfortunate, but easy to fix by adding quotes. The only
place where I think something like this might occur in real life is when
someone copies a numerical recipe involving some very precise constants,
and mindlessly applies Decimal() without string quotes to the constants.
But that's a "recipe" for failure anyway, since if the recipe really uses
more precision than IEEE double can handle, *with* the quotes the recipe
would be calculated more exactly anyway. Perhaps another scenario would be
if the constant was calculated (by the recipe-maker) within 0.5 ulp using
IEEE double and rendered with exactly the right number of digits.

But these scenarios sound like either they should use the quotes anyway, or
the calculation would be better off done in double rather than Decimal. So
I think it's still pretty much a phantom problem.
...
Of course that's not true for all reals (0.1 being the obvious
counterexample), but it's true for some with your proposal, while today
it's true for none. So the mean absolute error in Decimal(repr(f)) across
any range of reals is inherently higher than Decimal.from_float(f). Put
another way, you're adding additional rounding error. That additional
rounding error is still less than the rule-of-thumb cutoff that people use
when talking about going through float, but it's nonzero and not guaranteed
to cancel out.

On top of that, the distribution of binary floats is uniform (well, more
complicated than uniform because they have an exponent as well as a
mantissa, but you know what I mean); the distribution of closest-repr
values to binary floats is not.

I have no idea whether either of these are properties that users of
Decimal (or, rather, Decimal and float together) care about. But they are
properties that Decimal(float) has today that would be lost.
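
To put a number on the "additional rounding error" remark for the one value
discussed in this thread, here is a rough sketch. It is a single data point,
not a measurement of the mean error over any range, and it assumes the
proposal can be approximated by Decimal(repr(f)); err_exact and err_repr are
names made up for this example.

```python
from decimal import Decimal, getcontext

# Enough digits that the subtractions below are not rounded.
getcontext().prec = 60

a = Decimal('0.100000000000000012')   # the real value the user had in mind
f = float('0.100000000000000012')     # what survives the trip through float

# Error relative to a when the double is converted exactly (today).
err_exact = abs(Decimal.from_float(f) - a)

# Error relative to a when the double is converted via its repr
# (assumption: this stands in for the proposed behavior).
err_repr = abs(Decimal(repr(f)) - a)

print(err_exact)
print(err_repr)
print(err_repr > err_exact)
```

For this particular a the repr route lands farther from the original value
than the exact conversion does; for a value like 0.1 itself the comparison
goes the other way.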
-- 
--Guido van Rossum (python.org/~guido)