On Fri, Mar 7, 2014 at 5:05 PM, Steven D'Aprano <steve@pearwood.info> wrote:
[...] The way Python converts between the two is the right
way to do the conversion.

It's *exact*. I don't know that we all agree it is the *right* way.
 
Given a decimal d and a float f constructed
from d, f is the closest possible float to d. And the same applies for
conversions the other way around.

It's actually stronger the other way around: when d is constructed from f, d is *equal* to the mathematical value of f.

The issue (as I see it) is that there are many different decimals d that all convert to the same float f (because of rounding). The d that is constructed by taking the exact value of f is gross overkill. If d were constructed from f by always rounding to (I think) 17 digits it would still convert back exactly to f; the new float repr() gives us an even "better" (i.e. shorter) string of digits that is still guaranteed to be converted into exactly f (because there's simply no other float that is closer to d than f).
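
For example (a quick interactive check; this assumes CPython 2.7 or 3.x, where the short-repr algorithm is used):

py> z = 2.01                 # stored as the nearest binary float, not exactly 2.01
py> repr(z)                  # the shortest decimal string that maps back to z
'2.01'
py> float(repr(z)) == z      # and the round trip is exact
True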

(This may only apply on platforms with IEEE-754 semantics, but I think
that's nearly all platforms CPython runs on these days.)

Sure.
 
> Second, that was a bad decision--really--IMHO.  It was the wrong
> decision for two reasons 1) it denies what is inherently poor about
> binary floating point representation in the first place,

What is inherently poor about binary floats is also poor for Decimal
floats: limited precision, rounding errors, and violation of fundamental
laws of the real number system. These problems, save one, apply equally
to Decimal, just in different places with different numbers.

The *sole* difference is that with Decimal there is no rounding between
what the user writes as a decimal string "1.23456" and what the value
generated is. (Decimal also includes a bunch of extra features, like
control of the precision, which floats don't support, but they *could*
be supported if there was enough interest to justify the work.)

This benefit is mostly of use if you are dealing with (1) numerically
naive users or (2) using Python as an interactive calculator.

In the case of (1), numerically naive users are not the most important
subset of users for Python. In fact, I would say that numerically
sophisticated users (scientists, mostly) are far more important to the
Python ecosystem, and they *do not want your suggested change*.

What is the "suggested change" here? If it's "default float literals to Decimal" I agree. But if the suggestion is my "Decimal(<float>) should be implemented as Decimal(repr(<float>))" I don't think most scientists care (few of them use Decimal, they stay entirely in the realm of binary floating point).
 
If we
forced Decimals on them, they would come around with flaming torches and
pitchforks. Seriously, the numeric community would likely abandon Python
(with much wailing and gnashing of teeth) if we forced this on them.

Right. In Mark's "post of the day" he already accepts that "decimal by default" is untenable -- and my claim is that even if it was desirable, it would be too big an undertaking. So it's dead.
 
In the case of (2), anyone who has used a common pocket calculator is
used to seeing values calculated with a string of 9s or 0s at the end:

2.01000000000001
2.00999999999999

instead of the expected 2.01.

But the cause is calculations, like 1/3 ==> 0.333333333 and then multiplying by 3 ==> 0.999999999. If you enter 2.1, a pocket calculator displays 2.1.
 
In fact, even on advanced scientific
calculators which use decimal floats internally, such as the TI-89, we
still *frequently* witness this problem.

I suspect that if Python exhibited the problem in exactly those cases where a TI-89 exhibits it, few people would complain. The complaints would most likely come from numerically *sophisticated* users who object to using decimal (since binary has much better rounding behavior for arithmetic operations).
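
A small illustration of that rounding behavior, using the decimal module's default 28-digit context:

py> (1.0 / 3.0) * 3                  # binary: the product rounds back to exactly 1.0
1.0
py> from decimal import Decimal
py> (Decimal(1) / Decimal(3)) * 3    # decimal: the trailing 9s survive
Decimal('0.9999999999999999999999999999')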
 
It seems to me that the average
numerically naive user has simply learned that "calculators and
computers do approximate calculations" (which is actually correct, when
compared to the mathematically exact result), and doesn't particularly
care too much about it.

And yet I have seen plenty of Tweets, emails and other public discussions where people gleefully pointed out that "Python has a bug." I think I've also seen tracker items created for such issues.
 
But if *you* care, that's okay. Python is a programming language. You
can *program* it to make an interactive calculator with whatever
properties you desire, including using Decimal floats exclusively.

That sounds a bit condescending. :-(

Python's done 97% of the work for you: the Decimal module, and the cmd
module which makes building interactive command-line oriented
applications easy. You can even simulate the Python interactive
interpreter.

In other words, if you don't want to convert from floats to Decimal,
there is *absolutely no reason* why you should.


> and 2) further compounds the problem by denying what is the benefit of
> using Decimal floating point in the first place.

I must admit I'm not clear as to what you think is the benefit of
Decimal which is being denied.

If I may venture a guess, I think the assumed benefit of Decimal here is that it doesn't shatter naive users' expectations.
 
> Not to mention, it gives the wrong answer; from a decimal floating
> point perspective it gives a VERY wrong answer.

Now that's simply not true. The relative error between the "right"
answer:

py> from decimal import Decimal as D
py> a = D("2.01").sqrt()**2

and the "wrong" answer:

py> b = D(2.01).sqrt()**2

is *minuscule*, just one part in ten thousand million million:

py> (a-b)/a
Decimal('1.060511545905472636815920399E-16')

This feels like unnecessary pedantry. (Not the first time in this thread. :-( )
 
[...]
> What needs to happen is that binary floats need to be "correctly"
> promoted to decimal floats as appropriate.  This must not happen by
> simple copying (that does not work).

If I have given you the impression that float to Decimal conversion
occurs by copying the bits from one into the other, I apologise, that
was not my intention. That is not what happens.

What actually happens is that any float, not just floats that are
typed in by hand by the user, is converted to the closest possible
Decimal.

Well, actually, a Decimal that is mathematically *equal*. (I understand that this is technically the "closest possible", but that phrase suggests that it's not always *equal*, whereas the actual algorithm always produces a value that is mathematically equal to the input float.)
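
One way to convince yourself of that exactness (in Python 3.2+, Fraction accepts both floats and Decimals and compares them as exact rational numbers):

py> from decimal import Decimal
py> from fractions import Fraction
py> Fraction(2.01) == Fraction(Decimal(2.01))   # both sides are the float's exact mathematical value
True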
 
What you appear to want is for Python to inspect the value of a
float like 2.0099999999999997868 (the actual float you get from typing
2.01) and intelligently decide that what you *actually* wanted is the
decimal 2.01.

Even if this intelligence was possible, it has one serious flaw that
cripples the whole exercise. It cannot apply only to numbers typed in by
the user, but would have to apply across the board. Python cannot
distinguish between these two cases:

    Decimal(1.8703152)  # Give me this precise decimal

and

    x = 1519.63802016624    # result of some intermediate calculation
    y = 812.5037  # result of another intermediate calculation
    z = x/y  # yet another intermediate calculation
    Decimal(z)  # convert from float to nearest decimal

But z is the binary float 1.8703152, and Decimal cannot do one thing in
the first case and a different thing in the second, because it cannot
see where the number came from, only what it is. It does not and can
not see the literal string, only the float.

In the first case, you want 1.8703152 because that's what the user
typed, but in the second case, we *must* return the result:

    Decimal('1.8703152000000000665380639475188218057155609130859375')

because that's the closest conversion, the one with the least error.

Steven, that's actually not true. print(z) produces 1.8703152, and if I enter float(Decimal('1.8703152')) I get the exact same value as z. Try it. (In Python 2.7 or 3, please.) So I claim that if Decimal(z) produced the same value as Decimal('1.8703152') nothing would be lost.
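
Spelled out, the check I have in mind (run it yourself to confirm):

py> from decimal import Decimal
py> x = 1519.63802016624    # Steven's example values from above
py> y = 812.5037
py> z = x / y
py> repr(z)
'1.8703152'
py> float(Decimal('1.8703152')) == z
True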
 
Rounding it to 7 decimal places may be sufficient for some applications,
but that's not Python's decision to make, any more than Python can
decide that if you type 1.99999998 that you must have meant exactly 2.

The repr() function does not round to a fixed number of decimals. It produces (in principle by trying successively more digits, although I suspect the actual algorithm is smarter) the shortest decimal string that, when converted back to binary, equals *exactly* the input.
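
For instance, it uses exactly as many digits as each value needs, and no more:

py> repr(0.1)                # 1 significant digit suffices here
'0.1'
py> repr(0.1 + 0.2)          # this one needs all 17
'0.30000000000000004'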
 

> There needs to be some policy in place that will correctly "set" the
> decimal float value based on intelligence

http://www.catb.org/jargon/html/D/DWIM.html

Also condescending. :-(

--
--Guido van Rossum (python.org/~guido)