On Fri, Mar 7, 2014 at 5:05 PM, Steven D'Aprano
[...] The way Python converts between the two is the right way to do the conversion.
It's *exact*. I don't know that we all agree it is the *right* way.
Given a decimal d and a float f constructed from d, f is the closest possible float to d. And the same applies for conversions the other way around.
It's actually stronger the other way around: when d is constructed from f, d is *equal* to the mathematical value of f. The issue (as I see it) is that there are many different decimals d that all convert to the same float f (because of rounding). The d that is constructed by taking the exact value of f is gross overkill. If d was constructed from f by always rounding to (I think) 17 digits it would still back convert exactly to f; the new float repr() gives us an even "better" (i.e. shorter) string of digits that are still guaranteed to be converted into exactly f (because there's simply no other float that is closer to d than f). (This may only apply on platforms with IEEE-754 semantics, but I think
that's nearly all platforms CPython runs on these days.)
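For the record, that round-trip guarantee is easy to check; a quick sketch (Python 2.7+ or 3.1+, where the shortest-repr algorithm landed):

```python
# repr() gives the shortest decimal string that converts back
# to exactly the same binary float.
f = 2.01
s = repr(f)
assert s == '2.01'       # the shortest string, not 17 digits
assert float(s) == f     # the round-trip is exact
```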
Sure.
Second, that was a bad decision--really--IMHO. It was the wrong decision for two reasons: 1) it denies what is inherently poor about binary floating point representation in the first place,
What is inherently poor about binary floats is also poor for Decimal floats: limited precision, rounding errors, and violation of fundamental laws of the real number system. These problems, save one, apply equally to Decimal, just in different places with different numbers.
The *sole* difference is that with Decimal there is no rounding between what the user writes as a decimal string "1.23456" and what the value generated is. (Decimal also includes a bunch of extra features, like control of the precision, which floats don't support, but they *could* be supported if there was enough interest to justify the work.)
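A quick illustration of that sole difference (Python 3, where floats and Decimals compare exactly; the values are just for illustration):

```python
from decimal import Decimal

# The user's decimal string is stored exactly: 1.23456 == 123456/100000.
assert Decimal("1.23456") * 100000 == 123456

# The float literal is rounded to the nearest binary double, so it is
# *not* exactly equal to the decimal value the user wrote.
assert Decimal("1.23456") != 1.23456
```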
This benefit is mostly of use if you are dealing with (1) numerically naive users or (2) using Python as an interactive calculator.
In the case of (1), numerically naive users are not the most important subset of users for Python. In fact, I would say that numerically sophisticated users (scientists, mostly) are far more important to the Python ecosystem, and they *do not want your suggested change*.
What is the "suggested change" here? If it's "default float literals to Decimal" I agree. But if the suggestion is my "Decimal(<float>) should be implemented as Decimal(repr(<float>))" I don't think most scientists care (few of them use Decimal, they stay entirely in the realm of binary floating point).
If we forced Decimals on them, they would come after us with flaming torches and pitchforks. Seriously, the numeric community would likely abandon Python (with much wailing and gnashing of teeth) if we forced this on them.
Right. In Mark's "post of the day" he already accepts that "decimal by default" is untenable -- and my claim is that even if it was desirable, it would be too big an undertaking. So it's dead.
In the case of (2), anyone who has used a common pocket calculator is used to seeing values calculated with a string of 9s or 0s at the end:
2.01000000000001
2.00999999999999
instead of the expected 2.01.
But the cause is calculations, like 1/3 ==> 0.333333333 and then multiply by 3 ==> 0.999999999. If you enter 2.1, a pocket calculator displays 2.1.
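You can mimic that calculator behavior with the decimal module by cranking the precision down to nine digits:

```python
from decimal import Decimal, getcontext

getcontext().prec = 9              # mimic a 9-digit pocket calculator
third = Decimal(1) / Decimal(3)    # 0.333333333
assert third * 3 == Decimal("0.999999999")

# But a value entered directly is stored and displayed exactly.
assert str(Decimal("2.1")) == "2.1"
```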
In fact, even on advanced scientific calculators which use decimal floats internally, such as the TI-89, we still *frequently* witness this problem.
I suspect that if Python exhibited the problem in exactly those cases where a TI-89 exhibits it, few people would complain. The complaints would most likely come from numerically *sophisticated* users who object to using decimal (since binary has much better rounding behavior for arithmetic operations).
It seems to me that the average numerically naive user has simply learned that "calculators and computers do approximate calculations" (which is actually correct, when compared to the mathematically exact result), and doesn't particularly care about it.
And yet I have seen plenty of Tweets, emails and other public discussions where people gleefully pointed out that "Python has a bug." I think I've also seen tracker items created for such issues.
But if *you* care, that's okay. Python is a programming language. You can *program* it to make an interactive calculator with whatever properties you desire, including using Decimal floats exclusively.
That sounds a bit condescending. :-(

Python's done 97% of the work for you: the Decimal module, and the cmd module which makes building interactive command-line oriented applications easy. You can even simulate the Python interactive interpreter.
In other words, if you don't want to convert from floats to Decimal, there is *absolutely no reason* why you should.
and 2) further compounds the problem by denying what is the benefit of using Decimal floating point in the first place.
I must admit I'm not clear as to what you think is the benefit of Decimal which is being denied.
If I may venture a guess, I think the assumed benefit of Decimal here is that it doesn't shatter naive users' expectations.
Not to mention, it gives the wrong answer; from a decimal floating point perspective it gives a VERY wrong answer.
Now that's simply not true. The relative error between the "right" answer:
py> from decimal import Decimal as D
py> a = D("2.01").sqrt()**2
and the "wrong" answer:
py> b = D(2.01).sqrt()**2
is *minuscule*, just one part in ten thousand million million:
py> (a-b)/a
Decimal('1.060511545905472636815920399E-16')
This feels like unnecessary pedantry. (Not the first time in this thread. :-( )
[...]
What needs to happen is that binary floats need to be "correctly" promoted to decimal floats as appropriate. This must not happen by simple copying (that does not work).
If I have given you the impression that float to Decimal conversion occurs by copying the bits from one into the other, I apologise, that was not my intention. That is not what happens.
What actually happens is that any float, not just floats that are typed in by hand by the user, is converted to the closest possible Decimal.
Well, actually, a Decimal that is mathematically *equal*. (I understand that this is technically the "closest possible", but the use of the latter phrase suggests that it's not always *equal*, whereas the actual algorithm always produces a value that is mathematically equal to the input float.)
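A quick check of that equality (Python 3, where floats and Decimals compare exactly):

```python
from decimal import Decimal

f = 2.01
d = Decimal(f)    # the exact mathematical value of the binary float

assert d == f                  # mathematically equal, not merely "closest"
assert d != Decimal("2.01")    # ...but not equal to the decimal 2.01,
assert d < Decimal("2.01")     # since the nearest double is slightly below it
```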
What you appear to want is for Python to inspect the value of a float like 2.0099999999999997868 (the actual float you get from typing 2.01) and intelligently decide that what you *actually* wanted is the decimal 2.01.
Even if this intelligence was possible, it has one serious flaw that cripples the whole exercise. It cannot apply only to numbers typed in by the user, but would have to apply across the board. Python cannot distinguish between these two cases:
Decimal(1.8703152) # Give me this precise decimal
and
x = 1519.63802016624  # result of some intermediate calculation
y = 812.5037  # result of another intermediate calculation
z = x/y  # yet another intermediate calculation
Decimal(z)  # convert from float to nearest decimal
But z is the binary float 1.8703152, and Decimal cannot do one thing in the first case and a different thing in the second, because it cannot see where the number came from, only what it is. It does not and can not see the literal string, only the float.
In the first case, you want 1.8703152 because that's what the user typed, but in the second case, we *must* return the result:
Decimal('1.8703152000000000665380639475188218057155609130859375')
because that's the closest conversion, the one with the least error.
Steven, that's actually not true. print(z) produces 1.8703152, and if I enter float(Decimal('1.8703152')) I get the exact same value of z. Try it. (In Python 2.7 or 3, please.) So I claim that if Decimal(z) produced the same value as Decimal('1.8703152') nothing would be lost.
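Here's that check spelled out (Python 2.7 or 3, as noted above):

```python
from decimal import Decimal

x = 1519.63802016624   # result of some intermediate calculation
y = 812.5037           # result of another intermediate calculation
z = x / y

assert repr(z) == '1.8703152'            # the shortest repr of z
assert float(Decimal('1.8703152')) == z  # converts back to exactly z

# So Decimal(repr(z)) would lose nothing relative to Decimal(z).
assert float(Decimal(repr(z))) == z
```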
Rounding it to 7 decimal places may be sufficient for some applications, but that's not Python's decision to make, any more than Python can decide that if you type 1.99999998 that you must have meant exactly 2.
The repr() function does not round to a fixed number of decimals. It produces (in theory, dynamically, although I suspect that the current algorithm is better) the shortest decimal string that, when converted back to binary, equals *exactly* the input.
There needs to be some policy in place that will correctly "set" the decimal float value based on intelligence
Also condescending. :-(

--
--Guido van Rossum (python.org/~guido)