Precision Tail-off?
avi.e.gross at gmail.com
Fri Feb 17 12:46:52 EST 2023
Stephen,
What response do you expect from whoever at the IEEE you approach?
The specific IEEE standards were designed and agreed upon by groups working
in caveman times when the memory and CPU time were not so plentiful. The
design of many types, including floating point, had to work decently if not
perfectly so you stored data in ways the hardware of the time could process
it efficiently.
Note all kinds of related issues about what happens if you want an integer
larger than fits into 16 bits or 32 bits or even 64 bits. A Python integer
was designed to be effectively unlimited and uses as much storage as needed.
It can also get ever slower when doing things like looking for gigantic
primes. But it does not have overflow problems.
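As a quick illustration of that trade-off, here is a minimal Python sketch showing an integer growing well past any fixed-width hardware type (the sizes are arbitrary):

```python
# Python ints are arbitrary precision: no overflow, just more storage.
big = 2 ** 512              # far wider than any 64-bit hardware integer
bigger = big * big + 1      # arithmetic still works, only more slowly
print(big.bit_length())     # 513 bits needed
print(bigger.bit_length())  # 1025 bits needed
```

The price is exactly the one mentioned above: operations slow down as the number of digits grows.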
So could you design a floating point data type with similar features? It
would be some complex data structure that keeps track of the number of
bit/bytes/megabytes currently being used to store the mantissa or exponent
parts and then have some data structure that holds all the bits needed. When
doing any arithmetic like addition or division or more complex things, it
would need to compare the two objects being combined and calculate how to
perhaps expand/convert one to match the other and then do umpteen steps to
generate the result in as many pieces/steps as needed and create a data
structure that holds the result, optionally trimming off terminal parts not
needed or wanted. Then you would need all relevant functions that accept
regular floating point to handle these numbers and generate these numbers.
Can that be done? Well, sure, but not necessarily WELL. Some would
point you to the Decimal type, which takes a somewhat different tack on
the problem. But everything comes with a cost.
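For what it is worth, here is a minimal sketch of the Decimal approach, where you choose the working precision up front (the 50-digit setting is just an example):

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # carry 50 significant digits from here on
root = Decimal(123456789) ** (Decimal(1) / Decimal(3))
print(root)  # cube root to roughly 50 significant digits
```

Every extra digit of precision you ask for costs time and memory, which is the cost referred to above.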
Perhaps the response from the IEEE would be that what they published was
meant for some purposes but not yours. It may be that a group needs to
formulate a new standard but leave the old ones in place for people willing
to use them as their needs are more modest.
As an analogy, consider the lowly char that stored a single character in a
byte. I mean good old ASCII, but also EBCDIC and the ISO family like ISO
8859-1 and so on. Those standards focused on the needs of just a few
languages, and if you wanted to write something in a mix of languages, it
could be a headache. At times I had to shift within one document to, say,
ISO 8859-8 to include some Hebrew, and ISO 8859-3 for Esperanto, while
ISO 8859-1 was fine for English, French, German, Spanish and many
others. For some purposes I had to use encodings like Shift JIS for
Japanese, as many Asian languages were outside what ISO was doing.
The solutions since then vary but tend to allow or require multiple bytes
per character. But they retain limits and if we ever enter a Star Trek
Universe with infinite diversity and more languages and encodings, we might
need to again enlarge our viewpoint and perhaps be even more wasteful of our
computing resources to accommodate them all!
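A small Python sketch of that progression, using an arbitrary sample string that mixes scripts:

```python
mixed = "café שלום 日本語"  # Latin, Hebrew and Japanese in one string

# A single-byte legacy encoding covers only its own script family.
try:
    mixed.encode("iso-8859-1")
except UnicodeEncodeError:
    print("ISO 8859-1 cannot represent this mix")

# UTF-8 covers everything, at one to four bytes per character.
data = mixed.encode("utf-8")
print(len(mixed), "characters ->", len(data), "bytes")
```

The multi-byte solution works, but as the text says, it retains limits and spends more bytes per character to do so.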
Standards are often not made to solve ALL possible problems but to make
clear what is supported and what is not required. Mathematical arguments can
be helpful but practical considerations and the limited time available (as
these darn things can take YEARS to be agreed on) are often dominant.
Frankly, by the time many standards, such as those for a programming
language, are finalized, the reality in the field has often changed. The
language may already have been largely supplanted by others for new work,
or souped up with not-yet-standard features.
I am not against striving for ever better standards and realities. But I do
think a better way to approach this is not to reproach what was done but ask
if we can focus on the near-future and make it better.
Arguably, there are now multiple features out there such as Decimal and they
may be quite different. That often happens without a standard. But if you
now want everyone to get together and define a new standard that may break
some implementations, ...
As I see it, many computer courses teach the realities as well as the
mathematical fantasies that break down in the real world. One of those that
tends to be stressed is that floating point is not exact and that comparison
operators need to be used with caution. Often the suggestion is to subtract
one number from another and check whether the result is close enough to
zero, i.e. whether its absolute value is less than some small tolerance on
the order of machine epsilon. For more complex calculations where the errors
can accumulate, you may need to choose a somewhat larger tolerance.
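In Python that advice usually looks something like this (the tolerances are illustrative):

```python
import math

a = 0.1 + 0.2
b = 0.3
print(a == b)                             # False: binary floats are not exact
print(abs(a - b) < 1e-9)                  # True: absolute-tolerance check
print(math.isclose(a, b, rel_tol=1e-9))   # True: relative-tolerance check
```

The relative-tolerance form scales with the magnitude of the operands, which matters once the numbers involved are far from 1.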
Extended precision arithmetic is perhaps cheaper now and can be done for a
reasonable number of digits. It probably is not realistic to do most such
calculations for billions of digits, albeit some of the calculations for the
first googolplex digits of pi might indeed need such methods, as soon as we
find a way to keep that many digits in memory given the ten to the 80th or
so particles we think are in our observable universe. But knowing pi to that
precision may not be meaningful if an existing value is already so precise
that, given an exact number for the diameter of something the size of the
universe (yes, I know this is nonsense), you could calculate the
circumference (ditto) to less than the size (ditto) of a proton. Any errors
in such a measurement would be swamped by all kinds of things, such as
uncertainties in what we can measure, or niggling details about how space
expands irregularly in the area as we speak, and so on.
So if you want a new IEEE (or other such body) standard, would you be
satisfied with one for, say, a 16,384-byte monstrosity that holds
gigantic numbers with lots more precision, or would you hold out for a
flexible, effectively unlimited version that can be expanded until your
computer or planet runs out of storage room, and that provides answers after
a few billion years when used just to add two of them together?
-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
Behalf Of Stephen Tucker
Sent: Friday, February 17, 2023 5:27 AM
To: python-list at python.org
Subject: Re: Precision Tail-off?
Thanks, one and all, for your responses.
This is a hugely controversial claim, I know, but I would consider this
behaviour to be a serious deficiency in the IEEE standard.
Consider an integer N consisting of a finitely-long string of digits in base
10.
Consider the infinitely-precise cube root of N (yes I know that it could
never be computed unless N is the cube of an integer, but this is a
mathematical argument, not a computational one), also in base 10. Let's call
it RootN.
Now consider appending three zeroes to the right-hand end of N (let's call
it NZZZ) and NZZZ's infinitely-precise cube root (RootNZZZ).
The *only *difference between RootN and RootNZZZ is that the decimal point
in RootNZZZ is one place further to the right than the decimal point in
RootN.
None of the digits in RootNZZZ's string should be different from the
corresponding digits in RootN.
I rest my case.
Perhaps this observation should be brought to the attention of the IEEE. I
would like to know their response to it.
Stephen Tucker.
On Thu, Feb 16, 2023 at 6:49 PM Peter Pearson <pkpearson at nowhere.invalid>
wrote:
> On Tue, 14 Feb 2023 11:17:20 +0000, Oscar Benjamin wrote:
> > On Tue, 14 Feb 2023 at 07:12, Stephen Tucker
> > <stephen_tucker at sil.org>
> wrote:
> [snip]
> >> I have just produced the following log in IDLE (admittedly, in
> >> Python
> >> 2.7.10 and, yes I know that it has been superseded).
> >>
> >> It appears to show a precision tail-off as the supplied float gets
> bigger.
> [snip]
> >>
> >> For your information, the first 20 significant figures of the cube
> >> root
> in
> >> question are:
> >> 49793385921817447440
> >>
> >> Stephen Tucker.
> >> ----------------------------------------------
> >> >>> 123.456789 ** (1.0 / 3.0)
> >> 4.979338592181744
> >> >>> 123456789000000000000000000000000000000000. ** (1.0 / 3.0)
> >> 49793385921817.36
> >
> > You need to be aware that 1.0/3.0 is a float that is not exactly
> > equal to 1/3 ...
> [snip]
> > SymPy again:
> >
> > In [37]: a, x = symbols('a, x')
> >
> > In [38]: print(series(a**x, x, Rational(1, 3), 2))
> > a**(1/3) + a**(1/3)*(x - 1/3)*log(a) + O((x - 1/3)**2, (x, 1/3))
> >
> > You can see that the leading relative error term from x being not
> > quite equal to 1/3 is proportional to the log of the base. You
> > should expect this difference to grow approximately linearly as you
> > keep adding more zeros in the base.
>
> Marvelous. Thank you.
>
>
> --
> To email me, substitute nowhere->runbox, invalid->com.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
--
https://mail.python.org/mailman/listinfo/python-list