[Python-ideas] Python Numbers as Human Concept Decimal System
Steven D'Aprano
steve at pearwood.info
Sat Mar 8 02:05:26 CET 2014
On Fri, Mar 07, 2014 at 11:01:01AM -0800, Mark H. Harris wrote:
> Here is the main post for today, from me, having slept on this and
> putting some things back into historical timeframe. Sometime before
> Py2.7.x it was not possible to promote a binary float to a decimal
> float; the system threw an exception (convert to string).
>
> First, that was the correct way to handle this issue from the
> inception of decimal.Decimal. Throw an exception and let everyone know
> that there is a problem; then let the user figure out how to solve it.
No, it was the *conservative* way to handle the issue. By disallowing
the feature, it was possible to add it later without breaking backwards
compatibility. If it had been allowed from the beginning, it would have
been impossible to remove the feature if it had turned out to be a
mistake.
> It is not possible to correctly promote a binary float to a decimal
> float. The decision (in Py2.7?) was to NOT throw an exception and to
> copy the binary float "exactly" into the decimal float.
That's incorrect. The way Python converts between the two is the right
way to do the conversion. Given a decimal d and a float f constructed
from d, f is the closest possible float to d. And the same applies for
conversions the other way around.
(This may only apply on platforms with IEEE-754 semantics, but I think
that's nearly all platforms CPython runs on these days.)
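For instance, on a platform with 64-bit IEEE-754 doubles:
py> from decimal import Decimal
py> Decimal(0.1)           # the exact value of the closest float to 0.1
Decimal('0.1000000000000000055511151231257827021181583404541015625')
py> float(Decimal("0.1"))  # and the closest float to the decimal 0.1
0.1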
> Second, that was a bad decision--really--IMHO. It was the wrong
> decision for two reasons 1) it denies what is inherently poor about
> binary floating point representation in the first place,
What is inherently poor about binary floats is also poor for Decimal
floats: limited precision, rounding errors, and violation of fundamental
laws of the real number system. These problems, save one, apply equally
to Decimal, just in different places with different numbers.
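To see the same kinds of error with the decimal module's default
context of 28 significant digits:
py> from decimal import Decimal as D
py> D(1) / D(3) * 3                 # rounding error: the exact answer is 1
Decimal('0.9999999999999999999999999999')
py> D("1e28") + D(1) == D("1e28")   # the 1 is lost to limited precision
True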
The *sole* difference is that with Decimal there is no rounding between
what the user writes as a decimal string "1.23456" and what the value
generated is. (Decimal also includes a bunch of extra features, like
control of the precision, which floats don't support, but they *could*
be supported if there was enough interest to justify the work.)
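For example, with the decimal module as it stands:
py> from decimal import Decimal, getcontext
py> Decimal("1.23456")        # stored exactly as written
Decimal('1.23456')
py> getcontext().prec = 6     # user-controlled working precision
py> Decimal(1) / Decimal(7)
Decimal('0.142857')
py> getcontext().prec = 28    # restore the default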
This benefit is mostly of use if you are (1) dealing with numerically
naive users or (2) using Python as an interactive calculator.
In the case of (1), numerically naive users are not the most important
subset of users for Python. In fact, I would say that numerically
sophisticated users (scientists, mostly) are far more important to the
Python ecosystem, and they *do not want your suggested change*. If we
forced Decimals on them, they would come after us with flaming torches
and pitchforks. Seriously, the numeric community would likely abandon
Python (with much wailing and gnashing of teeth) if we did.
In the case of (2), anyone who has used a common pocket calculator is
used to seeing values calculated with a string of 9s or 0s at the end:
2.01000000000001
2.00999999999999
instead of the expected 2.01. In fact, even on advanced scientific
calculators which use decimal floats internally, such as the TI-89, we
still *frequently* witness this problem. It seems to me that the average
numerically naive user has simply learned that "calculators and
computers do approximate calculations" (which is actually correct, when
compared to the mathematically exact result), and doesn't particularly
care about it.
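You can see the same effect at any Python 2.7 or 3.x prompt, where the
repr of a float is the shortest string that round-trips:
py> 0.1 + 0.2
0.30000000000000004
py> 1.1 + 2.2
3.3000000000000003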
But if *you* care, that's okay. Python is a programming language. You
can *program* it to make an interactive calculator with whatever
properties you desire, including using Decimal floats exclusively.
Python's done 97% of the work for you: the Decimal module, and the cmd
module which makes building interactive command-line oriented
applications easy. You can even simulate the Python interactive
interpreter.
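Here is a rough, illustrative sketch of such a calculator built on cmd
(the class name and the regex for spotting numeric literals are just for
illustration; the regex is deliberately naive, and eval should never be
fed untrusted input):

import cmd
import re
from decimal import Decimal

# A toy read-evaluate-print calculator that does its arithmetic in
# decimal, not binary, floating point.
class DecimalCalc(cmd.Cmd):
    prompt = 'calc> '

    def default(self, line):
        # Wrap each numeric literal in Decimal('...') before evaluating,
        # so 2.01 becomes Decimal('2.01') rather than a binary float.
        expr = re.sub(r'\d+(?:\.\d+)?',
                      lambda m: "Decimal('%s')" % m.group(), line)
        try:
            print(eval(expr, {'Decimal': Decimal}))
        except Exception as err:
            print('error: %s' % err)

    def do_quit(self, line):
        # Return True to stop the cmdloop.
        return True

if __name__ == '__main__':
    DecimalCalc().cmdloop()

Run it and type 2.01 - 2 at the calc> prompt: you get 0.01, not the
string of 9s that binary floats would give you.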
In other words, if you don't want to convert from floats to Decimal,
there is *absolutely nothing* forcing you to.
> and 2) further compounds the problem by denying what is the benefit of
> using Decimal floating point in the first place.
I must admit I'm not clear as to what you think is the benefit of
Decimal which is being denied.
> Not to mention, it gives the wrong answer; from a decimal floating
> point perspective it gives a VERY wrong answer.
Now that's simply not true. The relative error between the "right"
answer:
py> from decimal import Decimal as D
py> a = D("2.01").sqrt()**2
and the "wrong" answer:
py> b = D(2.01).sqrt()**2
is *minuscule*, just one part in ten thousand million million:
py> (a-b)/a
Decimal('1.060511545905472636815920399E-16')
[...]
> What needs to happen is that binary floats need to be "correctly"
> promoted to decimal floats as appropriate. This must not happen by
> simple copying (that does not work).
If I have given you the impression that float to Decimal conversion
occurs by copying the bits from one into the other, I apologise, that
was not my intention. That is not what happens.
What actually happens is that any float, not just a float typed in by
hand by the user, is converted to the closest possible Decimal. What
you appear to want is for Python to inspect the value of a
float like 2.0099999999999997868 (the actual float you get from typing
2.01) and intelligently decide that what you *actually* wanted is the
decimal 2.01.
Even if this intelligence was possible, it has one serious flaw that
cripples the whole exercise. It cannot apply only to numbers typed in by
the user, but would have to apply across the board. Python cannot
distinguish between these two cases:
Decimal(1.8703152) # Give me this precise decimal
and
x = 1519.63802016624 # result of some intermediate calculation
y = 812.5037 # result of another intermediate calculation
z = x/y # yet another intermediate calculation
Decimal(z) # convert from float to nearest decimal
But z is the very same binary float as the literal 1.8703152, and
Decimal cannot do one thing in the first case and a different thing in
the second, because it cannot see where the number came from, only what
it is. It does not and cannot see the literal string, only the float.
In the first case, you want 1.8703152 because that's what the user
typed, but in the second case, we *must* return the result:
Decimal('1.8703152000000000665380639475188218057155609130859375')
because that's the closest conversion, the one with the least error.
Rounding it to 7 decimal places may be sufficient for some applications,
but that's not Python's decision to make, any more than Python can
decide that if you type 1.99999998 that you must have meant exactly 2.
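If seven decimal places is what you actually want, you can ask for it
explicitly. In Python 2.7 or later, where the repr of a float is the
shortest string that round-trips, either of these will do:
py> from decimal import Decimal
py> Decimal(repr(1.8703152))          # go via the float's short repr
Decimal('1.8703152')
py> Decimal(1.8703152).quantize(Decimal("0.0000001"))  # round explicitly
Decimal('1.8703152')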
> There needs to be some policy in place that will correctly "set" the
> decimal float value based on intelligence
http://www.catb.org/jargon/html/D/DWIM.html
--
Steven