From: Guido van Rossum <firstname.lastname@example.org>
Sent: Friday, March 7, 2014 10:42 PM
[CC back to the list because you posted the same argument there but without the numerical example, and my working through that might help others understand your point]
Thank you for presenting my point better than I could; it's a lot clearer this way.
On Fri, Mar 7, 2014 at 9:18 PM, Andrew Barnert <email@example.com> wrote:
The main point I'm getting at is that by rounding 0.100000000000000012 to 0.1 instead of 0.10000000000000000555..., you're no longer rounding it to the nearest binary float, but instead to the second-nearest Decimal(repr(binary float)) (since 0.10000000000000002 is closer than 0.1).
OK, let me walk through that carefully. Let's name the exact mathematical values and assign them to strings:
a = '0.100000000000000012'
b = '0.1000000000000000055511151231257827021181583404541015625'
c = '0.10000000000000002'
Today, Decimal(float(a)) == Decimal(b). Under my proposal, Decimal(float(a)) == Decimal('0.1'). The difference between float('0.1') and float(c) is 1 ulp (2**-56), and a is between those, closer to c than to 0.1; but it is even closer to b (in the other direction). IOW, for the mathematical values, 0.1 < b < a < c, where a is closer to b than to c, so if the choices for rounding a were b or c, b would be preferred. So far so good. (And still good if we replace c with the slightly smaller exact value of float(c).)
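The walkthrough above is easy to check in an interpreter session (using the same strings a, b, c defined earlier):

```python
from decimal import Decimal

a = '0.100000000000000012'
b = '0.1000000000000000055511151231257827021181583404541015625'
c = '0.10000000000000002'

# float(a) rounds to the nearest double, which is exactly float('0.1'),
# and b is the exact mathematical value of that double.
assert float(a) == float('0.1')
assert Decimal(float(a)) == Decimal(b)   # today's behavior

# repr of that double is the shortest string that round-trips:
assert repr(float(a)) == '0.1'
```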
And your point is that if we change the allowable choices to '0.1' or c, we find that float(b) == float('0.1'), but a is closer to c than to 0.1. The resulting error from rounding a to '0.1' is less than 1 ulp, but more than 0.5 ulp.
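A quick sketch with exact rational arithmetic makes the "more than 0.5 ulp but less than 1 ulp" claim concrete:

```python
from fractions import Fraction

a = Fraction('0.100000000000000012')
c = Fraction('0.10000000000000002')
ulp = Fraction(1, 2**56)  # spacing of doubles in [1/16, 1/8)

# a is closer to c than to 0.1...
assert c - a < a - Fraction('0.1')

# ...so rounding a down to '0.1' gives an error between 0.5 and 1 ulp.
err = a - Fraction('0.1')
assert ulp / 2 < err < ulp
```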
Yes. It's the same two problems that inspired Clinger's correct rounding papers: it does not have the closest-match property, and it can lose almost twice as much accuracy. But the context is very different, so I'm not sure Clinger's arguments are relevant here.
http://citeseer.ist.psu.edu/william90how.html
ftp://ftp.ccs.neu.edu/pub/people/will/retrospective.pdf
I find the argument intriguing, but I blame it more on what happens in float(a) than in what Decimal() does to the resulting value. If you actually had the string a, and wanted to convert it to Decimal, you would obviously write Decimal(a), not Decimal(float(a)), so this is really only a problem when someone uses a as a literal in a program that is passed to Decimal, i.e. Decimal(0.100000000000000012).
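For example, the two spellings behave quite differently, and only the unquoted one loses information:

```python
from decimal import Decimal

a = '0.100000000000000012'

# Passing the string keeps every digit:
assert Decimal(a) == Decimal('0.100000000000000012')

# Going through float first loses information before Decimal ever sees it:
assert Decimal(float(a)) != Decimal(a)
```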
Agreed on both counts. However, the entire problem you're trying to solve here is caused by what happens in float(a). You're effectively using Decimal() to recover the information lost in float(), in a way that often does what people want and otherwise never does anything too bad.
So long as giving up the correct-rounding property, doubling the error (but still staying under 1 ulp), and skewing the distribution of Decimals created this way (by less than 0.5 ulp, but possibly in a way that can accumulate) are not "too bad", I believe your proposal succeeds completely.
That's slightly unfortunate, but easy to fix by adding quotes.
Yes, but the motivating example to this whole thread, Decimal(1.1), is just as easy to fix by using quotes.
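To spell that out, here is today's behavior for the motivating example, where the quotes are the real fix:

```python
from decimal import Decimal

# Without quotes, Decimal sees the double nearest to 1.1, not 1.1 itself:
assert Decimal(1.1) == Decimal('1.100000000000000088817841970012523233890533447265625')
assert Decimal(1.1) != Decimal('1.1')

# With quotes, you get exactly what you wrote:
assert Decimal('1.1') == Decimal(11) / 10
```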
I think I can see the distinction: Novices don't know to use quotes; people trying to implement numerical recipes in Python do (or at least really, really should); therefore a change that helps the former but hurts the latter, when they both leave off the quotes, is a net gain. Yes?