Precision issue

Tim Peters tim.one at comcast.net
Fri Oct 10 14:37:51 EDT 2003


[Duncan Booth]
> I know this is an FAQ, but the one thing I've never seen explained
> satisfactorily is why repr(3.4) has to be '3.3999999999999999' rather
> than '3.4'?

Python doesn't do float<->string conversion itself.  That's done by the
platform C library.

The IEEE-754 standard requires that if a 754 double is converted to a string
with 17 significant decimal digits, then converted back to a 754 double
again, you'll get back exactly the double you started with.

Python does not guarantee that, and it can't, because the C library does the
heavy lifting in both directions.  But because Python uses the C %.17g format,
eval(repr(x)) == x holds on any platform whose C library meets the
minimal relevant requirements of the 754 standard.  I believe all major C
libraries do meet this now.
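
Concretely, a minimal sketch of what that buys (assuming a 754-conforming
C library underneath):

    # 17 significant decimal digits are enough to reproduce the double exactly.
    for x in (3.4, 0.1, 1.0 / 3.0, 1e300):
        s = "%.17g" % x        # what repr(float) does here, via the C library
        assert float(s) == x   # the eval(repr(x)) == x guarantee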

The 754 standard does not require that string->double or double->string
round correctly in all cases.  That's a (much) stronger requirement than
that eval(repr(x)) == x.
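
To make the distinction concrete, a small sketch (the digit strings below
were chosen by hand for 3.4's nearest double):

    x = 3.4
    s_correct = "3.3999999999999999"   # correctly rounded to 17 digits
    s_other   = "3.4000000000000000"   # not the correctly rounded string, yet
    assert float(s_correct) == x
    assert float(s_other) == x         # it round-trips to the same double too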

> ...
> There's no reason why Python couldn't do the same:
>
> def float_repr(x):
> 	s = "%.15g" % x
> 	if float(s)==x: return s
> 	return "%.17g" % x

Sorry, but there is a reason:  if done on a platform whose C library
implements perfect-rounding double->string (e.g., I think gcc does now),
this can hit cases where the string may not reproduce x when eval'ed back on
a different platform whose C library isn't so conscientious but which
nevertheless meets the 754 standard's more forgiving (than perfect rounding)
requirements.
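
Concretely, on a writer with correctly rounding conversions the proposed
function shortens 3.4 (a sketch; the first assertion depends on the C
library at hand), and the short string is then only as good as the reading
platform's strtod:

    # On a correctly rounding writer, float_repr(3.4) returns the short form:
    assert "%.15g" % 3.4 == "3.4"
    # Whether "3.4" maps back to the identical double is up to the *reading*
    # platform's strtod, which 754 does not force to round correctly; on this
    # machine it happens to work:
    assert float("3.4") == 3.4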

This is acutely important because Python's marshal format (used for .pyc
files) represents floats as repr'ed strings.  By making repr() pump out 17
digits, we maximize the odds that .pyc files ported from one 754 platform to
another load back exactly the same 754 doubles.
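
A minimal sketch of the round trip a ported .pyc depends on (run on one
machine here, but the loads() half is what the destination platform does):

    import marshal

    x = 3.4
    blob = marshal.dumps(x)          # floats travel into .pyc files this way;
                                     # at the time, as the repr'ed string
    assert marshal.loads(blob) == x  # must come back as the identical double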

> This would be MUCH friendlier for newcomers to the language.

A decimal floating type would be truly friendlier for them.  In extensive
experience with Python using %.12g for repr(float) in the old days, the
primary effect of that was to delay the point at which newcomers bumped into
their first fatal fp "surprise", and *recognized* it as being fatal to them.
I've come to love seeing newcomers whine about the output for, e.g., 0.1:
they hit it early, and are immediately directed to the Appendix explaining
what's going on.  This spurs a healthy and necessary mental reset about how
binary floating-point arithmetic really works.  In return, what we see much
less often now are complaints about binary fp surprises in much subtler
contexts.  If 3.4 got displayed as exactly "3.4", newcomers would face the
much harder task of recognizing the subtle consequences of the fact that, no,
in reality it's not exactly 3.4 at all, and that assuming it is can have
catastrophic consequences.
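
For instance, compare the two display widths on 0.1 (output as produced by a
conforming C library):

    "%.12g" % 0.1   # -> '0.1'                 the old repr width: no surprise
    "%.17g" % 0.1   # -> '0.10000000000000001' today's repr: surprise up front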

All that said, there's an implementation of IBM's (Mike Cowlishaw's)
proposed standard decimal arithmetic in the Python CVS sandbox, begging for
use, docs and improvement.  That would match newcomer expectations much
better, without contorted hacks trying to make it appear that it's something
it isn't.  Effort put there would address a cause instead of a symptom.
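
As a taste of what such a type buys, a sketch using a Decimal type of the
kind that implementation provides (shown only to illustrate the behavior
decimal arithmetic gives):

    from decimal import Decimal

    assert (0.1 + 0.2 == 0.3) is False                 # binary fp surprise
    assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
    assert str(Decimal("3.4")) == "3.4"                # no 3.3999... in sight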
