[Python-Dev] Expert floats

Tue Mar 30 14:40:19 EST 2004

[Tim]
>> The immediate motivation at the time was that marshal uses repr(float)
>> to store floats in code objects, so people who use floats seriously
>> found that results differed between running a module directly and
>> importing the same module via a .pyc/.pyo file.  That's flatly
>> intolerable for serious work.

[Ping]
> That doesn't make sense to me.  If the .py file says "1.1" and the
> .pyc file says "1.1", you're going to get the same results.

repr(float) used to round to 12 significant digits (same as str() does
now -- repr(float) and str(float) used to be identical).  So the problem was
real, and so was the fix.
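
A minimal sketch of that failure mode, assuming "%.12g" is a fair stand-in
for the old 12-digit repr() (it is an illustration, not the historical
implementation).  The names literal, old_style and new_style are made up for
the example:

```python
# A literal with more than 12 significant digits, as it might appear in a .py file.
literal = 3.141592653589793

# Stand-in for the old 12-significant-digit repr() that marshal stored in the .pyc.
old_style = "%.12g" % literal      # '3.14159265359'

# Stand-in for the 17-significant-digit repr() that replaced it.
new_style = "%.17g" % literal

# Reading the 12-digit string back gives a different double, so the imported
# module would compute with a different constant than the directly run one.
print(float(old_style) == literal)   # False
print(float(new_style) == literal)   # True
```

Typing "1.1" hides the problem because 12 digits happen to be enough for it;
any literal that needs more digits shows the difference.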

> In fact, you've just given a stronger reason for keeping "1.1".
> Currently, compiling a .py file containing "1.1" produces a .pyc file
> containing "1.1000000000000001".  .pyc files are supposed to be
> platform-independent.  If these files are then run on a platform with
> different floating-point precision, the .py and the .pyc will produce
> different results.

But you can't get away from that via any decimal rounding rule.  One of the
*objections* the 754 committee had to the Scheme rule is that moving rounded
shortest-possible decimal output to a platform with greater precision could
cause the latter platform to read in an unnecessarily poor approximation to
the actual number written on the source platform.  It's simply a fact that
decimal 1.1000000000000001 is a closer approximation to the number stored in
an IEEE double (given input "1.1" perfectly rounded to IEEE double format)
than decimal 1.1, and that has consequences too when moving to a wider
precision.
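
The "closer approximation" claim can be checked with exact rational
arithmetic.  This is only a sketch, and it assumes the platform's floats are
IEEE doubles and that "1.1" is converted with correct rounding; the variable
names are made up for the example:

```python
from decimal import Decimal
from fractions import Fraction

# Exact value of the double obtained from the literal 1.1 (no rounding here).
stored = Fraction(1.1)

# Exact distance from each decimal string to the stored double.
err_short = abs(Fraction("1.1") - stored)
err_long = abs(Fraction("1.1000000000000001") - stored)

print(Decimal(1.1))           # the stored double, written out in full
print(err_long < err_short)   # True: the 17-digit string is closer
```

A reader with wider precision that receives "1.1" therefore ends up farther
from the number the source platform actually held than one that receives
"1.1000000000000001".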

You have in mind *typing* "1.1" literally, so that storing "1.1" would give
a better approximation to decimal 1.1 on that box with wider precision, but
repr() doesn't know whether its input was typed by hand or computed.  Most
floats in real life are computed.

So if we were to change the marshal format, it would make much more sense to
reuse pickle's binary format for floats (which represents floats exactly, at
least those that don't exceed the precision or dynamic range of a 754
double).  The binary format also *is* portable.  Relying on decimal strings
(of any form) isn't really, so long as Python relies on the platform C to do
string<->float conversion.  Slinging shortest-possible output requires
perfect rounding on input, which is stronger than the 754 standard requires.
Slinging decimal strings rounded to 17 digits is less demanding, and is
portable across all boxes whose C string->float meets the 754 standard.
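
A rough sketch of the two options, using the struct module to show the 8-byte
big-endian IEEE double layout that pickle's binary protocols use for floats.
It assumes IEEE doubles on the platform, and the names payload, back and text
are made up for the example:

```python
import pickle
import struct

# A computed float rather than a hand-typed literal.
x = 0.1 + 0.2

# Binary route: 8 bytes, big-endian IEEE 754 double.  No decimal string and
# no platform string->float conversion is involved.
payload = struct.pack(">d", x)
(back,) = struct.unpack(">d", payload)
print(back == x)                                   # True

# The same 8 bytes appear inside a protocol-2 pickle of x.
print(payload in pickle.dumps(x, protocol=2))      # True

# Decimal route: 17 significant digits round-trip too, provided the
# platform's string->float conversion meets the 754 standard.
text = "%.17g" % x
print(float(text) == x)                            # True
```

The binary route depends only on the byte layout; the decimal route depends
on the quality of the reading platform's C library.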

>> But since we made the change anyway, it had a wonderful consequence:
>> ...

> This is terrible, not wonderful. ...

We've been through all this before, so I'm heartened to see that we'll still
never agree <wink>.