[Python-ideas] Python Numbers as Human Concept Decimal System
Steven D'Aprano
steve at pearwood.info
Sat Mar 8 02:05:26 CET 2014
On Fri, Mar 07, 2014 at 11:01:01AM -0800, Mark H. Harris wrote:
> Here is the main post for today, from me, having slept on this and
> putting some things back into historical timeframe. Sometime before
> Py2.7.x it was not possible to promote a binary float to a decimal
> float; the system threw an exception (convert to string).
>
> First, that was the correct way to handle this issue from the
> inception of decimal.Decimal. Throw an exception and let everyone know
> that there is a problem; then let the user figure out how to solve it.
No, it was the *conservative* way to handle the issue. By disallowing
the feature, it was possible to add it later without breaking backwards
compatibility. If it had been allowed from the beginning, it would have
been impossible to remove the feature if it had turned out to be a
mistake.
> It is not possible to correctly promote a binary float to a decimal
> float. The decision (in Py2.7?) was to NOT throw an exception and to
> copy the binary float "exactly" into the decimal float.
That's incorrect. The way Python converts between the two is the right
way to do the conversion. Given a decimal d and a float f constructed
from d, f is the closest possible float to d. And the same applies for
conversions the other way around.
(This may only apply on platforms with IEEE-754 semantics, but I think
that's nearly all platforms CPython runs on these days.)
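For instance, on a platform with 64-bit IEEE-754 doubles:
py> from decimal import Decimal
py> Decimal(0.1)           # the exact value of the closest float to 0.1
Decimal('0.1000000000000000055511151231257827021181583404541015625')
py> float(Decimal("0.1"))  # and the closest float to the decimal 0.1
0.1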
> Second, that was a bad decision--really--IMHO. It was the wrong
> decision for two reasons 1) it denies what is inherently poor about
> binary floating point representation in the first place,
What is inherently poor about binary floats is also poor for Decimal
floats: limited precision, rounding errors, and violation of fundamental
laws of the real number system. These problems, save one, apply equally
to Decimal, just in different places with different numbers.
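To see the same kinds of error with the decimal module's default
context of 28 significant digits:
py> from decimal import Decimal as D
py> D(1) / D(3) * 3                 # rounding error: the exact answer is 1
Decimal('0.9999999999999999999999999999')
py> D("1e28") + D(1) == D("1e28")   # the 1 is lost to limited precision
True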
The *sole* difference is that with Decimal there is no rounding between
what the user writes as a decimal string "1.23456" and what the value
generated is. (Decimal also includes a bunch of extra features, like
control of the precision, which floats don't support, but they *could*
be supported if there was enough interest to justify the work.)
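For example, with the decimal module as it stands:
py> from decimal import Decimal, getcontext
py> Decimal("1.23456")        # stored exactly as written
Decimal('1.23456')
py> getcontext().prec = 6     # user-controlled working precision
py> Decimal(1) / Decimal(7)
Decimal('0.142857')
py> getcontext().prec = 28    # restore the default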
This benefit is mostly of use if you are (1) dealing with numerically
naive users or (2) using Python as an interactive calculator.
In the case of (1), numerically naive users are not the most important
subset of users for Python. In fact, I would say that numerically
sophisticated users (scientists, mostly) are far more important to the
Python ecosystem, and they *do not want your suggested change*. If we
forced Decimals on them, they would come after us with flaming torches
and pitchforks. Seriously, the numeric community would likely abandon
Python (with much wailing and gnashing of teeth) if we did.
In the case of (2), anyone who has used a common pocket calculator is
used to seeing values calculated with a string of 9s or 0s at the end:
2.01000000000001
2.00999999999999
instead of the expected 2.01. In fact, even on advanced scientific
calculators which use decimal floats internally, such as the TI-89, we
still *frequently* witness this problem. It seems to me that the average
numerically naive user has simply learned that "calculators and
computers do approximate calculations" (which is actually correct, when
compared to the mathematically exact result), and doesn't particularly
care about it.
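You can see the same effect at any Python 2.7 or 3.x prompt, where the
repr of a float is the shortest string that round-trips:
py> 0.1 + 0.2
0.30000000000000004
py> 1.1 + 2.2
3.3000000000000003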
But if *you* care, that's okay. Python is a programming language. You
can *program* it to make an interactive calculator with whatever
properties you desire, including using Decimal floats exclusively.
Python's done 97% of the work for you: the Decimal module, and the cmd
module which makes building interactive command-line oriented
applications easy. You can even simulate the Python interactive
interpreter.
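Here is a rough, illustrative sketch of such a calculator built on cmd
(the class name and the regex for spotting numeric literals are just for
illustration; the regex is deliberately naive, and eval should never be
fed untrusted input):

import cmd
import re
from decimal import Decimal

# A toy read-evaluate-print calculator that does its arithmetic in
# decimal, not binary, floating point.
class DecimalCalc(cmd.Cmd):
    prompt = 'calc> '

    def default(self, line):
        # Wrap each numeric literal in Decimal('...') before evaluating,
        # so 2.01 becomes Decimal('2.01') rather than a binary float.
        expr = re.sub(r'\d+(?:\.\d+)?',
                      lambda m: "Decimal('%s')" % m.group(), line)
        try:
            print(eval(expr, {'Decimal': Decimal}))
        except Exception as err:
            print('error: %s' % err)

    def do_quit(self, line):
        # Return True to stop the cmdloop.
        return True

if __name__ == '__main__':
    DecimalCalc().cmdloop()

Run it and type 2.01 - 2 at the calc> prompt: you get 0.01, not the
string of 9s that binary floats would give you.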
In other words, if you don't want to convert from floats to Decimal,
there is *absolutely nothing* forcing you to.
> and 2) further compounds the problem by denying what is the benefit of
> using Decimal floating point in the first place.
I must admit I'm not clear as to what you think is the benefit of
Decimal which is being denied.
> Not to mention, it gives the wrong answer; from a decimal floating
> point perspective it gives a VERY wrong answer.
Now that's simply not true. The relative error between the "right"
answer:
py> from decimal import Decimal as D
py> a = D("2.01").sqrt()**2
and the "wrong" answer:
py> b = D(2.01).sqrt()**2
is *minuscule*, just one part in ten thousand million million:
py> (a-b)/a
Decimal('1.060511545905472636815920399E-16')
[...]
> What needs to happen is that binary floats need to be "correctly"
> promoted to decimal floats as appropriate. This must not happen by
> simple copying (that does not work).
If I have given you the impression that float to Decimal conversion
occurs by copying the bits from one into the other, I apologise, that
was not my intention. That is not what happens.
What actually happens is that any float, not just a float typed in by
hand by the user, is converted to the closest possible Decimal. What
you appear to want is for Python to inspect the value of a
float like 2.0099999999999997868 (the actual float you get from typing
2.01) and intelligently decide that what you *actually* wanted is the
decimal 2.01.
Even if this intelligence was possible, it has one serious flaw that
cripples the whole exercise. It cannot apply only to numbers typed in by
the user, but would have to apply across the board. Python cannot
distinguish between these two cases:
Decimal(1.8703152) # Give me this precise decimal
and
x = 1519.63802016624 # result of some intermediate calculation
y = 812.5037 # result of another intermediate calculation
z = x/y # yet another intermediate calculation
Decimal(z) # convert from float to nearest decimal
But z is the very same binary float as the literal 1.8703152, and
Decimal cannot do one thing in the first case and a different thing in
the second, because it cannot see where the number came from, only what
it is. It does not and cannot see the literal string, only the float.
In the first case, you want 1.8703152 because that's what the user
typed, but in the second case, we *must* return the result:
Decimal('1.8703152000000000665380639475188218057155609130859375')
because that's the closest conversion, the one with the least error.
Rounding it to 7 decimal places may be sufficient for some applications,
but that's not Python's decision to make, any more than Python can
decide that if you type 1.99999998 that you must have meant exactly 2.
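If seven decimal places is what you actually want, you can ask for it
explicitly. In Python 2.7 or later, where the repr of a float is the
shortest string that round-trips, either of these will do:
py> from decimal import Decimal
py> Decimal(repr(1.8703152))          # go via the float's short repr
Decimal('1.8703152')
py> Decimal(1.8703152).quantize(Decimal("0.0000001"))  # round explicitly
Decimal('1.8703152')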
> There needs to be some policy in place that will correctly "set" the
> decimal float value based on intelligence
http://www.catb.org/jargon/html/D/DWIM.html
--
Steven