[Python-ideas] Python Float Update
Steven D'Aprano
steve at pearwood.info
Tue Jun 2 05:00:40 CEST 2015
Nicholas,
Your email client does not appear to be marking the text you quote. It is
conventional to use a leading > for quoting; perhaps you could configure
your mail program to do so? The good ones even have a "Paste As Quote"
command.
On with the substance of your post...
On Mon, Jun 01, 2015 at 01:24:32PM -0400, Nicholas Chammas wrote:
> I guess it’s a non-trivial tradeoff. But I would lean towards considering
> people likely to be affected by the performance hit as doing something “not
> common”. Like, if they are doing that many calculations that it matters,
> perhaps it makes sense to ask them to explicitly ask for floats vs.
> decimals, in exchange for giving the majority who wouldn’t notice a
> performance difference a better user experience.
Changing from binary floats to decimal floats by default is a big,
backwards incompatible change. Even if it's a good idea, we're
constrained by backwards compatibility: I would imagine we wouldn't want
to even introduce this feature until the majority of people are using
Python 3 rather than Python 2, and then we'd probably want to introduce
it using a "from __future__ import decimal_floats" directive.
So I would guess this couldn't happen until probably 2020 or so.
But we could introduce a decimal literal, say 1.1d for Decimal("1.1").
The first prerequisite is that we have a fast Decimal implementation,
which we now have. Next we would have to decide how the decimal literals
would interact with the decimal module. Do we include full support of
the entire range of decimal features, including globally configurable
precision and other modes? Or just a subset? How will these decimals
interact with other numeric types, like float and Fraction? At the
moment, Decimal isn't even part of the numeric tower.
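To make those interaction questions concrete, here is a small sketch using only the existing decimal, fractions and numbers modules. The 1.1d literal itself is hypothetical, so Decimal("1.1") stands in for it; can_add is just an illustrative helper, not an existing API.

```python
from decimal import Decimal
from fractions import Fraction
import numbers

# What a hypothetical 1.1d literal would presumably mean: the exact decimal 1.1.
exact = Decimal("1.1")
# Going through a binary float first bakes in the binary rounding error.
from_float = Decimal(1.1)

print(exact)
print(from_float)

# Decimal is registered as a Number, but not as a Real, so it sits
# outside the numeric tower proper.
is_number = isinstance(exact, numbers.Number)
is_real = isinstance(exact, numbers.Real)
print(is_number, is_real)


def can_add(a, b):
    """Return True if a + b is supported, False if it raises TypeError."""
    try:
        a + b
    except TypeError:
        return False
    return True


# Illustrative helper only: which mixed-type additions work today?
mixes_with_int = can_add(exact, 1)
mixes_with_float = can_add(exact, 1.1)
mixes_with_fraction = can_add(exact, Fraction(11, 10))
print(mixes_with_int, mixes_with_float, mixes_with_fraction)
```

As things stand, Decimal mixes with int but refuses both float and Fraction, which is one of the design questions a PEP would have to settle.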
There's a lot of ground to cover: it's not a trivial change, and it will
definitely need a PEP.
> How many of your examples are inherent limitations of decimals vs. problems
> that can be improved upon?
In one sense, they are inherent limitations of floating point numbers
regardless of base. Whether binary, decimal, hexadecimal as used in some
IBM computers, or something else, you're going to see the same problems.
Only the specific details will vary, e.g. 1/3 cannot be represented
exactly in base 2 or base 10, but if you constructed a base 3 float, it
would be exact.
In another sense, Decimal has a big advantage that it is much more
configurable than Python's floats. Decimal lets you configure the
precision, rounding mode, error handling and more. That's not inherent
to base 10 calculations, you can do exactly the same thing for binary
floats too, but Python doesn't offer that feature for floats, only for
Decimals.
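As a rough sketch of that configurability, the decimal module's context lets you set precision, rounding mode and which conditions raise exceptions. The divide helper below is made up for illustration; everything else is the standard decimal API.

```python
from decimal import (Decimal, localcontext, DivisionByZero, Inexact,
                     ROUND_DOWN, ROUND_HALF_EVEN)


def divide(a, b, prec, rounding):
    """Divide a by b under a temporary precision and rounding mode."""
    with localcontext() as ctx:
        ctx.prec = prec
        ctx.rounding = rounding
        return Decimal(a) / Decimal(b)


# Same division, same precision, different rounding modes.
half_even = divide(2, 3, 5, ROUND_HALF_EVEN)
down = divide(2, 3, 5, ROUND_DOWN)
print(half_even, down)

# Error handling: with the DivisionByZero trap switched off, dividing by
# zero returns a special value instead of raising.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False
    quiet = Decimal(1) / Decimal(0)
print(quiet)

# Conversely, trapping Inexact turns any rounded result into an exception.
with localcontext() as ctx:
    ctx.traps[Inexact] = True
    try:
        Decimal(1) / Decimal(3)
        inexact_raised = False
    except Inexact:
        inexact_raised = True
print(inexact_raised)
```

Nothing comparable exists for the built-in float type, which always uses the platform's binary double with its fixed precision.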
But no matter how you configure Decimal, all you can do is shift the
gotchas around. The issue really is inherent to the nature of the
problem, and you cannot defeat the universe. Regardless of what
base you use, binary or decimal or something else, or how many digits
of precision, you're still trying to simulate an uncountably infinite,
continuous, infinitely divisible number line using a finite,
discontinuous set of possible values. Something has to give.
(For the record, when I say "uncountably infinite", I don't just mean
"too many to count", it's a technical term. To oversimplify horribly, it
means "larger than infinity" in some sense. It's off-topic for here,
but if anyone is interested in learning more, you can email me off-list,
or google for "countable vs uncountable infinity".)
Basically, you're trying to squeeze an infinite number of real numbers
into a finite amount of memory. It can't be done. Consequently, there
will *always* be some calculations where the true value simply cannot be
calculated and the answer you get is slightly too big or slightly too
small. All the other floating point gotchas follow from that simple
fact.
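A quick way to see "slightly too big or slightly too small" is to compare the computed result against the exact rational answer. This is only a sketch: it assumes 28 digits of precision with round-half-even (the usual defaults), and uses Fraction purely as an exact yardstick.

```python
from decimal import Decimal, localcontext, ROUND_HALF_EVEN
from fractions import Fraction

with localcontext() as ctx:
    ctx.prec = 28
    ctx.rounding = ROUND_HALF_EVEN
    one_third = Decimal(1) / Decimal(3)
    two_thirds = Decimal(2) / Decimal(3)

# Fraction(...) converts a Decimal or float exactly, with no further rounding,
# so these comparisons show which way each result missed the true value.
decimal_too_small = Fraction(one_third) < Fraction(1, 3)
decimal_too_big = Fraction(two_thirds) > Fraction(2, 3)

# The classic binary example: the computed sum overshoots 3/10.
float_too_big = Fraction(0.1 + 0.2) > Fraction(3, 10)

print(decimal_too_small, decimal_too_big, float_too_big)
```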
> Admittedly, the only place where I’ve played with decimals extensively is
> on Microsoft’s SQL Server (where they are the default literal
> <https://msdn.microsoft.com/en-us/library/ms179899.aspx>). I’ve stumbled in
> the past on my own decimal gotchas
> <http://dba.stackexchange.com/q/18997/2660>, but looking at your examples
> and trying them on SQL Server I suspect that most of the problems you show
> are problems of precision and scale.
No. Change the precision and scale, and some *specific* problems go
away, but they reappear with other numbers.
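Here is a minimal sketch of that whack-a-mole effect. The absorbed helper is invented for this example; it reports whether adding a small number to a big one leaves the big one unchanged at a given precision.

```python
from decimal import Decimal, localcontext


def absorbed(big, small, prec):
    """Return True if big + small rounds back to big at this precision."""
    with localcontext() as ctx:
        ctx.prec = prec
        return Decimal(big) + Decimal(small) == Decimal(big)


# At 3 digits of precision, the 0.001 is silently lost.
lost_at_3 = absorbed("1.23", "0.001", 3)
# Bump the precision to 4 digits and that particular problem goes away...
lost_at_4 = absorbed("1.23", "0.001", 4)
# ...but the same thing happens again with slightly different numbers.
lost_again_at_4 = absorbed("1.231", "0.0001", 4)

print(lost_at_3, lost_at_4, lost_again_at_4)
```

However high you set the precision, you can always construct another pair of numbers that shows the same behaviour.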
Besides, at the point that you're talking about setting the precision,
we're really not talking about making things easy for beginners any
more.
And not all floating point issues are related to precision and scale in
decimal. You cannot divide a cake into exactly three equal pieces in
Decimal any more than you can divide a cake into exactly three equal
pieces in binary. All you can hope for is to choose a precision where the
rounding errors in one part of your calculation will be cancelled by the
rounding errors in another part of your calculation. And that precision
will be different for any two arbitrary calculations.
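The cake example can be tried directly. This is a sketch rather than a proof: cake_in_decimal is a made-up helper, and the three precisions are arbitrary choices.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def cake_in_decimal(prec):
    """Cut 1 into three pieces at the given precision, then reassemble it."""
    with localcontext() as ctx:
        ctx.prec = prec
        return Decimal(1) / Decimal(3) * 3


# In Decimal, the reassembled cake falls short of 1 at each precision tried.
decimal_results = {prec: cake_in_decimal(prec) for prec in (5, 28, 50)}
for prec, value in decimal_results.items():
    print(prec, value, value == 1)

# In binary floats the two rounding errors happen to cancel for thirds...
float_thirds = (1 / 3) * 3 == 1.0
# ...but not for 49ths, so the lucky cancellation is specific to the numbers.
float_49ths = (1 / 49) * 49 == 1.0
print(float_thirds, float_49ths)

# An exact rational type sidesteps the issue, at the cost of not being a
# fixed-size floating point number at all.
exact_thirds = Fraction(1, 3) * 3 == 1
print(exact_thirds)
```

Binary floats getting thirds "right" is an accident of rounding, not evidence that binary represents 1/3 any better than decimal does.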
--
Steve