on floating-point numbers

Fri Sep 3 10:08:21 EDT 2021

On Thu, Sep 2, 2021 at 2:27 PM Chris Angelico <rosuav at gmail.com> wrote:

> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle at jevedi.com> wrote:
> >
> > Hope Rouselle <hrouselle at jevedi.com> writes:
> >
> > > Just sharing a case of floating-point numbers.  Nothing needed to be
> > > solved or to be figured out.  Just bringing up conversation.
> > >
> > > (*) An introduction to me
> > >
> > > I don't understand floating-point numbers from the inside out, but I do
> > > know how to work with base 2 and scientific notation.  So the idea of
> > > expressing a number as
> > >
> > >   mantissa * base^{power}
> > >
> > > is not foreign to me. (If that helps you to perhaps instruct me on
> > > what's going on here.)
> > >
> > > (*) A presentation of the behavior
> > >
> > >>>> import sys
> > >>>> sys.version
> > > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> > > bit (AMD64)]'
> > >
> > >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>>> sum(ls)
> > > 39.599999999999994
> > >
> > >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>>> sum(ls)
> > > 39.60000000000001
> > >
> > > All I did was to take the first number, 7.23, and move it to the last
> > > position in the list.  (So we have a violation of the commutativity of
> > > addition.)
> >
> > Suppose these numbers are prices in dollar, never going beyond cents.
> > Would it be safe to multiply each one of them by 100 and therefore work
> > with cents only?  For instance
>
> Yes and no. It absolutely *is* safe to always work with cents, but to
> do that, you have to be consistent: ALWAYS work with cents, never with
> floating point dollars.
>
> (Or whatever other unit you choose to use. Most currencies have a
> smallest-normally-used-unit, with other currency units (where present)
> being whole number multiples of that minimal unit. Only in forex do
> you need to concern yourself with fractional cents or fractional yen.)
>
> But multiplying a set of floats by 100 won't necessarily solve your
> problem; you may have already fallen victim to the flaw of assuming
> that the numbers are represented accurately.
>
> > --8<---------------cut here---------------start------------->8---
> > >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>> sum(map(lambda x: int(x*100), ls)) / 100
> > 39.6
> >
> > >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>> sum(map(lambda x: int(x*100), ls)) / 100
> > 39.6
> > --8<---------------cut here---------------end--------------->8---
> >
> > Or multiplication by 100 isn't quite ``safe'' to do with floating-point
> > numbers either?  (It worked in this case.)
>
> You're multiplying and then truncating, which risks a round-down
> error. Try adding a half onto them first:
>
> int(x * 100 + 0.5)
>
> But that's still not a perfect guarantee. Far safer would be to
> consider monetary values to be a different type of value, not just a
> raw number. For instance, the value $7.23 could be stored internally
> as the integer 723, but you also know that it's a value in USD, not a
> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
> sense to multiply USD*scalar, but it doesn't make sense to multiply
> USD*USD.
>
> > I suppose that if I multiply it by a power of two, that would be an
> > operation that I can be sure will not bring about any precision loss
> > with floating-point numbers.  Do you agree?
>
> Assuming you're nowhere near 2**53, yes, that would be safe. But so
> would multiplying by a power of five. The problem isn't precision loss
> from the multiplication - the problem is that your input numbers
> aren't what you think they are. That number 7.23, for instance, is
> really....
>
> >>> 7.23.as_integer_ratio()
> (2035064081618043, 281474976710656)
>
> ... the rational number 2035064081618043 / 281474976710656, which is
> very close to 7.23, but not exactly so. (The numerator would have to
> be ...8042.88 to be exactly correct.) There is nothing you can do at
> this point to regain the precision, although a bit of multiplication
> and rounding can cheat it and make it appear as if you did.
>
> Floating point is a very useful approximation to real numbers, but
> real numbers aren't the best way to represent financial data. Integers
> are.
>
>
Hmmmmmmm - - - I would suggest that you haven't looked into
taxation yet!
In taxation you get a rational number (the tax rate) that MUST be
multiplied by the amount in currency.
The potential for rounding error here is stupendous.
Some organizations track each transaction with its taxes rounded.
Others track the untaxed amounts and then calculate the taxes
on the whole. When 2 or 3 or 4 taxes apply (dunno about more, but
who knows, there are some seriously tax-loving jurisdictions out there),
adding the amounts and then calculating taxes, versus calculating
taxes on each amount and then adding all the items together, can
produce some 'interesting' differences.
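
To make that concrete, here is a minimal sketch using the prices from
earlier in the thread. The 8% rate, the half-up rounding rule, and the
names TAX_RATE and round_cents are assumptions made up for illustration;
real jurisdictions specify their own rates and rounding rules.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
TAX_RATE = Decimal("0.08")  # hypothetical rate, for illustration only


def round_cents(amount):
    """Round a Decimal amount to whole cents (half-up is an assumption)."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


# Prices from the thread, built from strings so they are exact decimals.
prices = [Decimal(s) for s in ("7.23", "8.41", "6.15", "2.31", "7.73", "7.77")]

# Method 1: round the tax on each transaction, then add the rounded taxes.
tax_per_item = sum(round_cents(p * TAX_RATE) for p in prices)

# Method 2: add the untaxed amounts, then compute and round the tax once.
tax_on_total = round_cents(sum(prices) * TAX_RATE)

print(tax_per_item)  # 3.16
print(tax_on_total)  # 3.17
```

Neither answer is "wrong"; the two methods simply round at different
points, and a cent of difference appears even with six small prices.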

So financial data MUST be able to handle rational numbers.
(I have been bitten by the differences described above!)
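
One way to carry rational numbers in Python is the standard library's
fractions module. The sketch below is only an illustration under assumed
values: the 8% rate is hypothetical, and amounts are kept as integer
cents. It suggests that the exact arithmetic agrees whichever order you
use, and that the discrepancy comes purely from where you round.

```python
from fractions import Fraction

RATE = Fraction(8, 100)  # hypothetical 8% tax rate, held exactly
cents = [723, 841, 615, 231, 773, 777]  # the thread's prices, in cents

# Exact tax, computed both ways, with no rounding anywhere.
exact_per_item = sum(Fraction(c) * RATE for c in cents)
exact_on_total = Fraction(sum(cents)) * RATE
print(exact_per_item == exact_on_total)  # True

# Rounding once at the end versus rounding each item first.
# round() on a Fraction returns an int (ties go to the even integer).
rounded_once = round(exact_on_total)
rounded_each = sum(round(Fraction(c) * RATE) for c in cents)
print(rounded_once, rounded_each)  # 317 316
```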

Regards