Why isn't Python king of the hill?

Fri Jun 1 19:35:03 EDT 2001

Martijn writes:
> So there's fixed-point, decimal floating-point and rational 
> arithmetic..
> And plain-old-confusing floats. :)
> 
> They all involve numbers with points in them.. and when you do math
> with them, the answers may be different. That's about the extent
> of my knowledge.

Here's a five minute explanation, and maybe I can dig up a link to a quick
tutorial on FP in general. (Ok, maybe it's more like a fifteen-minute
explanation. Somebody stop me!) Someone who knows better ought to correct me
if I'm talking out my butt.

=====
Floating Point systems
=====

FP numbers are stored as a mantissa times an exponent: 1.22 * (2 ** 15).

Let's look at a base-10 FP system real quick:

1.22 * (10 ** 4) = 12200

We are using 3 digits in the mantissa and 4 in the exponent.

If you want to express 1.2 or .012 in this system, no problem.

However, if you want to express 1.0002 or 100001, you have a problem. You
can't come up with a mantissa that accurately represents that. So you pick a
good representation that's close to what you're trying to describe.

You get three "zones" of inaccuracy in an FP system. The first is the one
you'd expect, at the outside edge of your number system. There's a point at
which your exponents can't get any bigger, so you can't get any more
positive or any more negative. This is usually represented as an overflow
error. In our sample number system, trying to represent 10 million would
cause an overflow error, because you really need a 2 digit exponent for
that.

You also get a zone "between" numbers, where you can't get enough precision
in your mantissa 
to express a certain value. We've seen this already, with 1.0002 or 100001.
Generally the solution is to round off, so you get 1.00 * (10 ** 1) = 1.0
and 1.00 * (10 ** 5) = 100000.
Making sense so far, right?

The other place you get a problem is in fractions close to zero. There's a
"hole at zero", where anything smaller than 1.0 times your smallest negative
exponent falls into the "hole at zero". This is generally a larger range of
values than the range of possible values between two mantissa values.

=====
Decimal FP vs. Binary FP
=====

In a decimal FP system, your 'hole at zero' and the precision you're working
at are values that seem to correspond with the standard base-10 arithmetic
you learned in school. You can look at the number of digits in your mantissa
and predict where rounding will occur. That rounding will make the same kind
of sense it would if you were working with a currency system, or the metric
system, or whatever. If the next digit out from your precision is 5 or more,
you round up, otherwise round down.

Most computers use a binary FP system, though. That means that things that
divide nicely by two are expressed quite naturally. On the other hand,
values that do not divide nicely by two need to be approximated.

The big problem with binary FP is that most people (myself included) don't
know when to expect approximation rather than an exact result. Without doing
the actual translation math, it's hard to know whether the simple decimal
value you're looking at will turn into something big and burly when
translated to binary. For example, .1 is an infinitely-recurring binary
fraction. If you've got python on hand, type .1 at the prompt, and see what
I mean. :) (I note that my version of 1.5.2 doesn't display the problem --
it does some rounding under the covers. I'm not sure why, or how.)

=====
Fixed-Point
=====

As I understand it, fixed-point arithmetic scales the fractions into
integers, and keeps track of the scaling. An example might be that you would
represent .12 as 12 with a scaling factor of 10 ** -2. This differs from FP
in that the mantissa is always an integer.

When you do fixed-point arithmetic, you convert both numbers to an identical
scale, then perform integer math on them, then apply scaling to the result.
It's accurate in a decimal way, but it doesn't do so well when the numbers
have vastly different scales -- you might get overflows with your integer
mantissas.

I need to be reliably decimal-based for financial transactions, which are
all likely to have an identical scale. Fixed point will work fine for me,
even though it's not as efficient as a decimal-based floating-point would
be.

=====
Unbounded rationals
=====

Unbounded rationals would basically give the language the power to express
any rational number accurately. I confess that I'm not sure how it would go
about doing that exactly. I THINK it would do so by maintaining an internal
representation of the operation that gives rise to an infinitely-repeating
decimal, and use that expression in conjunction with known mathematical
properties to join in expressions with other values.

This is pretty complicated to do, and, according to several people, has some
real problems when common repeating decimals get into your calculations --
like .333 for instance. Your performance is highly variable depending on how
hard it is to keep track of the numbers you're using.

Ok, so that's more of a tutorial than I intended to provide, but I guess I'm
in a wordy mood today.

If you're still on sloppy ground, try this:

http://www.cs.utah.edu/~zachary/isp/applets/FP/FP.html 

Then have a read of the FAQ at: 

http://www2.hursley.ibm.com/decimalj/decfaq.html 

They may not be any more articulate than I was, but if you get a couple
different people trying to explain it, it's sometimes easier to triangulate
the truth that lies between. :)

Thanks,
--G.

---
Geoff Gerrietts <geoff at homegain.com>
Software Engineer, HomeGain.com
510-655-0800 x4320