Python -- floating point arithmetic

Steven D'Aprano steve-REMOVE-THIS at cybersource.com.au
Thu Jul 8 03:32:24 EDT 2010


On Thu, 08 Jul 2010 06:04:33 +0200, David Cournapeau wrote:

> On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zooko at zooko.com>
> wrote:
>> I'm starting to think that one should use Decimals by default and
>> reserve floats for special cases.
>>
>> This is somewhat analogous to the way that Python provides
>> arbitrarily-big integers by default and Python programmers only use
>> old-fashioned fixed-size integers for special cases, such as
>> interoperation with external systems or highly optimized pieces (in
>> numpy or in native extension modules, for example).
> 
> I don't think it is analogous at all. Arbitrary-bit integers have a
> simple tradeoff: you are willing to lose performance and memory for
> bigger integers. If you leave performance aside, there is no downside
> that I know of to using big ints instead of "machine ints".

Well, sure, but that's like saying that if you leave performance aside, 
there's no downside to using Bubblesort instead of Quicksort.

However, I believe that in Python at least, the performance cost of 
arbitrary-sized longs is quite small compared to the benefit, at least 
for "reasonable" sized ints, and so the actual real-world cost of 
unifying the int and long types is minimal.
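
For the record, the promotion is already seamless in current 2.x 
releases: once an operation overflows the machine-sized int, you simply 
get a long. From memory (untested), a session looks something like this:

>>> import sys
>>> type(sys.maxint)          # the largest value the old fixed-size int holds
<type 'int'>
>>> type(sys.maxint + 1)      # one past that is silently promoted to a long
<type 'long'>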

On the other hand, until Decimal is re-written in C, it will always be 
*significantly* slower than float.

$ python -m timeit "2.0/3.0"
1000000 loops, best of 3: 0.139 usec per loop
$ python -m timeit -s "from decimal import Decimal as D" "D(2)/D(3)"
1000 loops, best of 3: 549 usec per loop

That's a factor of roughly 4000, more than three orders of magnitude 
difference in speed. That's HUGE, and *alone* is enough to disqualify 
changing to Decimal as the default floating point data type.

Perhaps in the future, if and when Decimal has a fast C implementation, 
this can be re-thought.


> Since you are
> using python, you already bought this kind of tradeoff anyway.
> 
> Decimal vs float is a different matter altogether: decimal has downsides
> compared to float. First, there is this irreconcilable fact that no
> matter how small your range is, it is impossible to represent all (or
> even most) numbers exactly with finite memory - float and decimal
> are two different solutions to this issue, with different tradeoffs.

Yes, but how is this a downside *compared* to float? In what way does 
Decimal have downsides that float doesn't? Neither can represent 
arbitrary real numbers exactly, but if anything float is *worse* than 
Decimal, for two reasons:

* Python floats are fixed to a single number of bits, while the precision 
of Decimals can be configured by the user (see the example after this 
list);

* floats can represent sums of powers of two exactly, while Decimals can 
represent sums of powers of ten exactly. Since every power of two has a 
finite decimal expansion, any number exactly representable as a float can 
also be exactly represented as a Decimal; but Decimals can *additionally* 
represent exactly many numbers of human interest, like 0.1, that floats 
cannot.
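
Both points are easy to demonstrate with nothing but the standard decimal 
module. From memory (untested), a fresh session looks something like this:

>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 6            # the user picks the working precision
>>> Decimal(2) / Decimal(3)
Decimal('0.666667')
>>> Decimal('0.1') * 3               # 0.1 is a sum of (negative) powers of ten
Decimal('0.3')
>>> 0.1 * 3 == 0.3                   # ...but it is not a sum of powers of two
False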


> Decimals are more "intuitive" than float for numbers that can be
> represented as decimals - but most numbers cannot be represented as
> (finite) decimals.

True, but any number that can't be represented as a finite decimal can't 
be exactly represented as a float either, so how does using float help?
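
One third, say, gets rounded either way; Decimal just rounds it in base 
ten, to the precision of the current context (28 significant digits in a 
fresh session), rather than in base two. Something like:

>>> from decimal import Decimal
>>> Decimal(1) / Decimal(3)    # rounded, just as 1.0/3 is rounded in binary
Decimal('0.3333333333333333333333333333')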

[...] 
>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data conversion
>> before you've even done any computation. Decimal doesn't have that
>> problem.
> 
> That's not true anymore once you start doing any computation, if by
> decimal you mean finite decimal. And that will never be true once you
> start using non-trivial computations (i.e. transcendental functions like
> log, exp, etc.).

But none of those complications are *inputs*, and, again, floats suffer 
from exactly the same problem.
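
For instance (the literals here are just stand-ins for whatever decimal 
inputs your system actually receives), from memory:

>>> from decimal import Decimal
>>> Decimal('1.1') + Decimal('2.2')   # the inputs survive conversion exactly
Decimal('3.3')
>>> 1.1 + 2.2 == 3.3                  # each float literal was rounded on input
False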



-- 
Steven


