<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Mar 7, 2014 at 5:05 PM, Steven D'Aprano <span dir="ltr"><<a href="mailto:steve@pearwood.info" target="_blank">steve@pearwood.info</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[...] The way Python converts between the two is the right<br>
way to do the conversion.</blockquote><div><br></div><div>It's *exact*. I don't know that we all agree it is the *right* way.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Given a decimal d and a float f constructed<br>
from d, f is the closest possible float to d. And the same applies for<br>
conversions the other way around.<br></blockquote><div><br></div><div>It's actually stronger the other way around: when d is constructed from f, d is *equal* to the mathematical value of f.<br><br></div><div>The issue (as I see it) is that there are many different decimals d that all convert to the same float f (because of rounding). The d that is constructed by taking the exact value of f is gross overkill. If d were constructed from f by always rounding to (I think) 17 digits it would still convert back exactly to f; the new float repr() gives us an even "better" (i.e. shorter) string of digits that are still guaranteed to be converted into exactly f (because there's simply no other float that is closer to d than f).<br>
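<br>For concreteness, a small interactive sketch (assuming Python 2.7 or 3.x, where repr() produces the shortest round-tripping string):<br><br>>>> from decimal import Decimal<br>>>> f = 2.01<br>>>> Decimal(f) == Decimal(repr(f))   # the exact value is a different (much longer) Decimal...<br>False<br>>>> float(Decimal(repr(f))) == f     # ...but the short repr() string still converts back to exactly f<br>True<br>>>> repr(f)<br>'2.01'<br>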
</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
(This may only apply on platforms with IEEE-754 semantics, but I think<br>
that's nearly all platforms CPython runs on these days.)<br></blockquote><div><br></div><div>Sure.<br> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Second, that was a bad decision--really--IMHO. It was the wrong<br>
> decision for two reasons 1) it denies what is inherently poor about<br>
> binary floating point representation in the first place,<br>
<br>
What is inherently poor about binary floats is also poor for Decimal<br>
floats: limited precision, rounding errors, and violation of fundamental<br>
laws of the real number system. These problems, save one, apply equally<br>
to Decimal, just in different places with different numbers.<br>
<br>
The *sole* difference is that with Decimal there is no rounding between<br>
what the user writes as a decimal string "1.23456" and what the value<br>
generated is. (Decimal also includes a bunch of extra features, like<br>
control of the precision, which floats don't support, but they *could*<br>
be supported if there was enough interest to justify the work.)<br>
<br>
This benefit is mostly of use if you are dealing with (1) numerically<br>
naive users or (2) using Python as an interactive calculator.<br>
<br>
In the case of (1), numerically naive users are not the most important<br>
subset of users for Python. In fact, I would say that numerically<br>
sophisticated users (scientists, mostly) are far more important to the<br>
Python ecosystem, and they *do not want your suggested change*.</blockquote><div><br></div><div>What is the "suggested change" here? If it's "default float literals to Decimal" I agree they don't want it. But if the suggestion is my "Decimal(<float>) should be implemented as Decimal(repr(<float>))" I don't think most scientists care (few of them use Decimal; they stay entirely in the realm of binary floating point).<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">If we<br>
forced Decimals on them, they would be around with flaming torches and<br>
pitchforks. Seriously, the numeric community would likely abandon Python<br>
(with much wailing and gnashing of teeth) if we forced this on them.<br></blockquote><div><br></div><div>Right. In Mark's "post of the day" he already accepts that "decimal by default" is untenable -- and my claim is that even if it was desirable, it would be too big an undertaking. So it's dead.<br>
<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
In the case of (2), anyone who has used a common pocket calculator is<br>
used to seeing values calculated with a string of 9s or 0s at the end:<br>
<br>
2.01000000000001<br>
2.00999999999999<br>
<br>
instead of the expected 2.01.</blockquote><div><br></div><div>But the cause is calculations, like 1/3 ==> 0.333333333 and then multiplying by 3 ==> 0.999999999. If you enter 2.1, a pocket calculator displays 2.1.<br></div>
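<div><br></div><div>Decimal shows the same kind of artifact for calculations, just in different places -- a small sketch with the default 28-digit context:<br><br>>>> from decimal import Decimal<br>>>> Decimal(1) / Decimal(3) * 3<br>Decimal('0.9999999999999999999999999999')<br>>>> Decimal('2.1')   # an entered value is displayed exactly as typed<br>Decimal('2.1')<br></div>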
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">In fact, even on advanced scientific<br>
calculators which use decimal floats internally, such as the TI-89, we<br>
still *frequently* witness this problem.</blockquote><div><br></div><div>I suspect that if Python exhibited the problem in exactly those cases where a TI-89 exhibits it, few people would complain. The complaints would most likely come from numerically *sophisticated* users who object to using decimal (since binary has much better rounding behavior for arithmetic operations).<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">It seems to me that the average<br>
numerically naive user has simply learned that "calculators and<br>
computers do approximate calculations" (which is actually correct, when<br>
compared to the mathematically exact result), and doesn't particularly<br>
care too much about it.<br></blockquote><div><br></div><div>And yet I have seen plenty of Tweets, emails and other public discussions where people gleefully pointed out that "Python has a bug." I think I've also seen tracker items created for such issues.<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
But if *you* care, that's okay. Python is a programming language. You<br>
can *program* it to make an interactive calculator with whatever<br>
properties you desire, including using Decimal floats exclusively.<br></blockquote><div><br></div><div>That sounds a bit condescending. :-(<br><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Python's done 97% of the work for you: the Decimal module, and the cmd<br>
module which makes building interactive command-line oriented<br>
applications easy. You can even simulate the Python interactive<br>
interpreter.<br>
<br>
In other words, if you don't want to convert from floats to Decimal,<br>
there is *absolutely no reason* why you should.<br>
<br>
<br>
> and 2) further compounds the problem by denying what is the benefit of<br>
> using Decimal floating point in the first place.<br>
<br>
I must admit I'm not clear as to what you think is the benefit of<br>
Decimal which is being denied.<br></blockquote><div><br></div><div>If I may venture a guess, I think the assumed benefit of Decimal here is that it doesn't shatter naive users' expectations.<br></div><div> <br></div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Not to mention, it gives the wrong answer; from a decimal floating<br>
> point perspective it gives a VERY wrong answer.<br>
<br>
Now that's simply not true. The relative error between the "right"<br>
answer:<br>
<br>
py> from decimal import Decimal as D<br>
py> a = D("2.01").sqrt()**2<br>
<br>
and the "wrong" answer:<br>
<br>
py> b = D(2.01).sqrt()**2<br>
<br>
is *minuscule*, just one part in ten thousand million million:<br>
<br>
py> (a-b)/a<br>
Decimal('1.060511545905472636815920399E-16')<br></blockquote><div><br></div><div>This feels like unnecessary pedantry. (Not the first time in this thread. :-( )<br> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
[...]<br>
> What needs to happen is that binary floats need to be "correctly"<br>
> promoted to decimal floats as appropriate. This must not happen by<br>
> simple copying (that does not work).<br>
<br>
If I have given you the impression that float to Decimal conversion<br>
occurs by copying the bits from one into the other, I apologise, that<br>
was not my intention. That is not what happens.<br>
<br>
What actually happens is that any float, not just floats that are<br>
typed in by hand by the user, is converted to the closest possible<br>
Decimal.</blockquote><div><br></div><div>Well, actually, a Decimal that is mathematically *equal*. (I understand that this is technically the "closest possible" Decimal, but that phrase suggests it's not always *equal*, whereas the algorithm actually used always produces a value that is mathematically equal to the input float.)<br>
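<br>One way to check this (a sketch, Python 3; Fraction is exact, so the comparison involves no rounding):<br><br>>>> from decimal import Decimal<br>>>> from fractions import Fraction<br>>>> f = 2.01<br>>>> Fraction(Decimal(f)) == Fraction(f)   # the Decimal is mathematically equal to the float<br>True<br>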
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">What you appear to want is for Python to inspect the value of a<br>
float like 2.0099999999999997868 (the actual float you get from typing<br>
2.01) and intelligently decide that what you *actually* wanted is the<br>
decimal 2.01.<br>
<br>
Even if this intelligence was possible, it has one serious flaw that<br>
cripples the whole exercise. It cannot apply only to numbers typed in by<br>
the user, but would have to apply across the board. Python cannot<br>
distinguish between these two cases:<br>
<br>
Decimal(1.8703152) # Give me this precise decimal<br>
<br>
and<br>
<br>
x = 1519.63802016624 # result of some intermediate calculation<br>
y = 812.5037 # result of another intermediate calculation<br>
z = x/y # yet another intermediate calculation<br>
Decimal(z) # convert from float to nearest decimal<br>
<br>
But z is the binary float 1.8703152, and Decimal cannot do one thing in<br>
the first case and a different thing in the second, because it cannot<br>
see where the number came from, only what it is. It does not and can<br>
not see the literal string, only the float.<br>
<br>
In the first case, you want 1.8703152 because that's what the user<br>
typed, but in the second case, we *must* return the result:<br>
<br>
Decimal('1.8703152000000000665380639475188218057155609130859375')<br>
<br>
because that's the closest conversion, the one with the least error.</blockquote><div><br></div><div>Steven, that's actually not true. print(z) produces 1.8703152, and if I enter float(Decimal('1.8703152')) I get the exact same value as z. Try it. (In Python 2.7 or 3, please.) So I claim that if Decimal(z) produced the same value as Decimal('1.8703152') nothing would be lost.<br>
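<br>For example (a sketch; the literal here is just a stand-in for the intermediate result z in your example):<br><br>>>> from decimal import Decimal<br>>>> z = 1.8703152   # stand-in for x/y above<br>>>> print(z)<br>1.8703152<br>>>> float(Decimal('1.8703152')) == z<br>True<br>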
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Rounding it to 7 decimal places may be sufficient for some applications,<br>
but that's not Python's decision to make, any more than Python can<br>
decide that if you type 1.99999998 you must have meant exactly 2.</blockquote><div><br></div><div>The repr() function does not round to a fixed number of decimals. It produces (dynamically, at least in theory; I suspect the current algorithm does it more cleverly) the shortest decimal string that, when converted back to binary, equals *exactly* the input.<br>
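<br>So the number of digits varies per value; a quick sketch (Python 2.7 or 3.x):<br><br>>>> repr(0.1)<br>'0.1'<br>>>> repr(0.1 + 0.2)<br>'0.30000000000000004'<br>>>> float('0.30000000000000004') == 0.1 + 0.2<br>True<br>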
<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> There needs to be some policy in place that will correctly "set" the<br>
> decimal float value based on intelligence<br>
<br>
<a href="http://www.catb.org/jargon/html/D/DWIM.html" target="_blank">http://www.catb.org/jargon/html/D/DWIM.html</a></blockquote><div><br></div><div>Also condescending. :-(<br></div></div><br>-- <br>--Guido van Rossum (<a href="http://python.org/~guido">python.org/~guido</a>)
</div></div>