[Python-ideas] Python Numbers as Human Concept Decimal System

Guido van Rossum guido at python.org
Sat Mar 8 03:02:02 CET 2014


On Fri, Mar 7, 2014 at 5:05 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> [...] The way Python converts between the two is the right
> way to do the conversion.


It's *exact*. I don't know that we all agree it is the *right* way.


> Given a decimal d and a float f constructed
> from d, f is the closest possible float to d. And the same applies for
> conversions the other way around.
>

It's actually stronger the other way around: when d is constructed from f,
d is *equal* to the mathematical value of f.

The issue (as I see it) is that there are many different decimals d that
all convert to the same float f (because of rounding). The d that is
constructed by taking the exact value of f is gross overkill. If d was
constructed from f by always rounding to (I think) 17 digits it would still
back convert exactly to f; the new float repr() gives us an even "better"
(i.e. shorter) string of digits that are still guaranteed to be
converted into exactly f (because there's simply no other float that is
closer to d than f).
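
To make that concrete, a quick interpreter session (the exact-value
digits are abbreviated here):

py> from decimal import Decimal
py> f = 2.01
py> Decimal(f)           # the exact value -- gross overkill
Decimal('2.0099999999999997868...')
py> '%.17g' % f          # 17 significant digits -- still round-trips
'2.0099999999999998'
py> repr(f)              # the shortest round-tripping string
'2.01'
py> float(Decimal(f)) == float('%.17g' % f) == float(repr(f)) == f
True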

> (This may only apply on platforms with IEEE-754 semantics, but I think
> that's nearly all platforms CPython runs on these days.)
>

Sure.


> > Second, that was a bad decision--really--IMHO.  It was the wrong
> > decision for two reasons 1) it denies what is inherently poor about
> > binary floating point representation in the first place,
>
> What is inherently poor about binary floats is also poor for Decimal
> floats: limited precision, rounding errors, and violation of fundamental
> laws of the real number system. These problems, save one, apply equally
> to Decimal, just in different places with different numbers.
>
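
(That point is easy to demonstrate; with a small working precision,
Decimal addition isn't even associative:

py> from decimal import Decimal, getcontext
py> getcontext().prec = 3
py> (Decimal('0.1') + Decimal('100')) - Decimal('100')
Decimal('0')
py> Decimal('0.1') + (Decimal('100') - Decimal('100'))
Decimal('0.1')

The same effect occurs at the default 28 digits, just further out.)
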
> The *sole* difference is that with Decimal there is no rounding between
> what the user writes as a decimal string "1.23456" and what the value
> generated is. (Decimal also includes a bunch of extra features, like
> control of the precision, which floats don't support, but they *could*
> be supported if there was enough interest to justify the work.)
>
> This benefit is mostly of use if you are dealing with (1) numerically
> naive users or (2) using Python as an interactive calculator.
>
> In the case of (1), numerically naive users are not the most important
> subset of users for Python. In fact, I would say that numerically
> sophisticated users (scientists, mostly) are far more important to the
> Python ecosystem, and they *do not want your suggested change*.


What is the "suggested change" here? If it's "default float literals to
Decimal" I agree. But if the suggestion is my "Decimal(<float>) should be
implemented as Decimal(repr(<float>))" I don't think most scientists care
(few of them use Decimal, they stay entirely in the realm of binary
floating point).
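
A sketch of what that suggestion amounts to (from_float_via_repr is my
name for it, not an existing API):

from decimal import Decimal

def from_float_via_repr(f):
    # Hypothetical spelling of "Decimal(<float>) implemented as
    # Decimal(repr(<float>))".
    return Decimal(repr(f))

print(from_float_via_repr(2.01))  # Decimal('2.01')
print(Decimal(2.01))              # today: the exact ~50-digit value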


> If we
> forced Decimals on them, they would be around with flaming torches and
> pitchforks. Seriously, the numeric community would likely abandon Python
> (with much wailing and gnashing of teeth) if we forced this on them.
>

Right. In Mark's "post of the day" he already accepts that "decimal by
default" is untenable -- and my claim is that even if it was desirable, it
would be too big an undertaking. So it's dead.


> In the case of (2), anyone who has used a common pocket calculator is
> used to seeing values calculated with a string of 9s or 0s at the end:
>
> 2.01000000000001
> 2.00999999999999
>
> instead of the expected 2.01.


But the cause is calculations, like 1/3 ==> 0.333333333 and then
multiplying by 3 ==> 0.999999999. If you enter 2.1, a pocket calculator
displays 2.1.
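
You can mimic that calculator behavior with the decimal module:

py> from decimal import Decimal, getcontext
py> getcontext().prec = 9      # think of a 9-digit pocket calculator
py> Decimal(1) / Decimal(3) * 3
Decimal('0.999999999')
py> Decimal('2.1')             # an entered value displays as entered
Decimal('2.1')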


> In fact, even on advanced scientific
> calculators which use decimal floats internally, such as the TI-89, we
> still *frequently* witness this problem.


I suspect that if Python exhibited the problem in exactly those cases where
a TI-89 exhibits it, few people would complain. The complaints would most
likely come from numerically *sophisticated* users who object to using
decimal (since binary has much better rounding behavior for arithmetic
operations).


> It seems to me that the average
> numerically naive user has simply learned that "calculators and
> computers do approximate calculations" (which is actually correct, when
> compared to the mathematically exact result), and don't particularly
> care too much about it.
>

And yet I have seen plenty of Tweets, emails and other public discussions
where people gleefully pointed out that "Python has a bug." I think I've
also seen tracker items created for such issues.


> But if *you* care, that's okay. Python is a programming language. You
> can *program* it to make an interactive calculator with whatever
> properties you desire, including using Decimal floats exclusively.
>

That sounds a bit condescending. :-(

> Python's done 97% of the work for you: the Decimal module, and the cmd
> module which makes building interactive command-line oriented
> applications easy. You can even simulate the Python interactive
> interpreter.
>
> In other words, if you don't want to convert from floats to Decimal,
> there is *absolutely no reason* why you should.
>
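
(Such a calculator is indeed little code. An untested sketch -- the
class and names are mine, not an existing recipe:

import cmd
from decimal import Decimal

class CalcShell(cmd.Cmd):
    """Toy interactive calculator that evaluates Decimal expressions."""
    prompt = 'calc> '

    def default(self, line):
        # Evaluate the typed expression; D is shorthand for Decimal.
        # A real calculator would parse numeric literals itself.
        try:
            print(eval(line, {'__builtins__': {}}, {'D': Decimal}))
        except Exception as exc:
            print('error:', exc)

    def do_quit(self, line):
        return True

if __name__ == '__main__':
    CalcShell().cmdloop()

Then calc> D('2.01').sqrt() ** 2 does decimal arithmetic throughout.)
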
>
> > and 2) further compounds the problem by denying what is the benefit of
> > using Decimal floating point in the first place.
>
> I must admit I'm not clear as to what you think is the benefit of
> Decimal which is being denied.
>

If I may venture a guess, I think the assumed benefit of Decimal here is
that it doesn't shatter naive users' expectations.


> > Not to mention, it gives the wrong answer; from a decimal floating
> > point perspective it gives a VERY wrong answer.
>
> Now that's simply not true. The relative error between the "right"
> answer:
>
> py> from decimal import Decimal as D
> py> a = D("2.01").sqrt()**2
>
> and the "wrong" answer:
>
> py> b = D(2.01).sqrt()**2
>
> is *minuscule*, just one part in ten thousand million million:
>
> py> (a-b)/a
> Decimal('1.060511545905472636815920399E-16')
>

This feels like unnecessary pedantry. (Not the first time in this thread.
:-( )


> [...]
> > What needs to happen is that binary floats need to be "correctly"
> > promoted to decimal floats as appropriate.  This must not happen by
> > simple copying (that does not work).
>
> If I have given you the impression that float to Decimal conversion
> occurs by copying the bits from one into the other, I apologise, that
> was not my intention. That is not what happens.
>
> What actually happens is that any float, not just floats that are
> typed in by hand by the user, is converted to the closest possible
> Decimal.


Well, actually, a Decimal that is mathematically *equal*. (I understand
that this is technically the "closest possible", but that phrase
suggests it's not always *equal*, whereas the actual algorithm always
produces a value that is mathematically equal to the input float.)


> What you appear to want is for Python to inspect the value of a
> float like 2.0099999999999997868 (the actual float you get from typing
> 2.01) and intelligently decide that what you *actually* wanted is the
> decimal 2.01.
>
> Even if this intelligence was possible, it has one serious flaw that
> cripples the whole exercise. It cannot apply only to numbers typed in by
> the user, but would have to apply across the board. Python cannot
> distinguish between these two cases:
>
>     Decimal(1.8703152)  # Give me this precise decimal
>
> and
>
>     x = 1519.63802016624    # result of some intermediate calculation
>     y = 812.5037  # result of another intermediate calculation
>     z = x/y  # yet another intermediate calculation
>     Decimal(z)  # convert from float to nearest decimal
>
> But z is the binary float 1.8703152, and Decimal cannot do one thing in
> the first case and a different thing in the second, because it cannot
> see where the number came from, only what it is. It does not and can
> not see the literal string, only the float.
>
> In the first case, you want 1.8703152 because that's what the user
> typed, but in the second case, we *must* return the result:
>
>     Decimal('1.8703152000000000665380639475188218057155609130859375')
>
> because that's the closest conversion, the one with the least error.
>

Steven, that's actually not true. print(z) produces 1.8703152, and if I
enter float(Decimal('1.8703152')) I get the exact same value of z. Try it.
(In Python 2.7 or 3, please.) So I claim that if Decimal(z) produced the
same value as Decimal('1.8703152') nothing would be lost.
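
Concretely:

py> from decimal import Decimal
py> x = 1519.63802016624
py> y = 812.5037
py> z = x / y
py> print(z)
1.8703152
py> float(Decimal('1.8703152')) == z
True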


> Rounding it to 7 decimal places may be sufficient for some applications,
> but that's not Python's decision to make, any more than Python can
> decide that if you type 1.99999998 that you must have meant exactly 2.
>

The repr() function does not round to a fixed number of decimals. It
produces (in theory dynamically, although I suspect the current
algorithm is cleverer) the shortest decimal string that, when converted
back to binary, equals the input *exactly*.
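
For example (Python 3):

py> for x in (0.1, 1/3, 2.0099999999999997868):
...     s = repr(x)
...     assert float(s) == x   # round-trips exactly
...     print(s)
...
0.1
0.3333333333333333
2.01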


>
> > There needs to be some policy in place that will correctly "set" the
> > decimal float value based on intelligence
>
> http://www.catb.org/jargon/html/D/DWIM.html


Also condescending. :-(

-- 
--Guido van Rossum (python.org/~guido)