[Python-bugs-list] [Bug #110624] float("1.0e-309") inconsistency on win32 (PR#245)

Fri, 25 Aug 2000 15:27:09 -0700

Bug #110624, was updated on 2000-Jul-31 14:08
Here is a current snapshot of the bug.

Project: Python
Category: Core
Status: Open
Resolution: Remind
Bug Group: None
Priority: 5
Summary: float("1.0e-309") inconsistency on win32 (PR#245)

Details: Jitterbug-Id: 245
Submitted-By: sde@recombinant.demon.co.uk
Date: Wed, 22 Mar 2000 16:13:26 -0500 (EST)
Version: 1.5.2
OS: win32

#! /usr/bin/python

# Inconsistent behaviour.

# Python 1.5.2 win32 fails the second print (why not both?)
# other versions print both expressions

#              Ok  Python 1.5.2 on SuSE Linux 6.3
#              Ok  JPython 1.1 on java1.1.7B
# Partial failure  Python 1.5.2 win32

print eval("float(1.0e-309)")
print float("1.0e-309") # ValueError: float() literal too large: 1.0e-309

====================================================================
Audit trail:
Fri Mar 24 16:42:36 2000	guido	changed notes
Fri Mar 24 16:42:36 2000	guido	moved from incoming to open

Follow-Ups:

Date: 2000-Jul-31 14:08
By: none

Comment:
From: "Tim Peters" <tim_one@email.msn.com>
Subject: RE: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Thu, 23 Mar 2000 01:43:30 -0500

> -----Original Message-----
> From: python-bugs-list-admin@python.org
> [mailto:python-bugs-list-admin@python.org]On Behalf Of
> sde@recombinant.demon.co.uk
> Sent: Wednesday, March 22, 2000 4:13 PM
> To: python-bugs-list@python.org
> Cc: bugs-py@python.org
> Subject: [Python-bugs-list] float("1.0e-309") inconsistency on win32
> (PR#245)
>
>
> Full_Name: Stephen D Evans
> Version: 1.5.2
> OS: win32
> Submission from: recombinant.demon.co.uk (212.229.73.7)
>
>
> #! /usr/bin/python
>
> # Inconsistent behaviour.
>
> # Python 1.5.2 win32 fails the second print (why not both?)
> # other versions print both expressions
>
> #              Ok  Python 1.5.2 on SuSE Linux 6.3
> #              Ok  JPython 1.1 on java1.1.7B
> # Partial failure  Python 1.5.2 win32
>
> print eval("float(1.0e-309)")
> print float("1.0e-309") # ValueError: float() literal too large: 1.0e-309

First note that these don't do the same thing:  the first passes a float to
"float", the second passes a string to "float".  Change the first to

    print eval("float('1.0e-309')")

and it gives the same bogus error as the second one.

Then it turns out the error is Microsoft's fault.  This tiny C program shows
the bug:

#include <errno.h>
#include <stdlib.h>
#include <stdio.h>

int
main(void)
{
    double x;
    char* dontcare;
    errno = 0;
    x = strtod("1.0e-309", &dontcare);
    printf("errno after = %d\n", errno);  /* 34 is ERANGE in MS's C runtime */
    printf("x after = %g\n", x);
    return 0;
}

This prints

    errno after = 34
    x after = 0

when compiled & linked with MS's VC5; don't know about VC6.  Same thing for
"1.0e-308".  Works fine for "1.0e-307".  Doubt this will get fixed before MS
fixes their library.

-------------------------------------------------------

Date: 2000-Jul-31 14:08
By: none

Comment:
From: Guido van Rossum <guido@python.org>
Subject: Re: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Fri, 24 Mar 2000 04:51:57 -0500

> > Full_Name: Stephen D Evans
> > Version: 1.5.2
> > OS: win32
> > Submission from: recombinant.demon.co.uk (212.229.73.7)
> >
> >
> > #! /usr/bin/python
> >
> > # Inconsistent behaviour.
> >
> > # Python 1.5.2 win32 fails the second print (why not both?)
> > # other versions print both expressions
> >
> > #              Ok  Python 1.5.2 on SuSE Linux 6.3
> > #              Ok  JPython 1.1 on java1.1.7B
> > # Partial failure  Python 1.5.2 win32
> >
> > print eval("float(1.0e-309)")
> > print float("1.0e-309") # ValueError: float() literal too large: 1.0e-309
> 
> First note that these don't do the same thing:  the first passes a float to
> "float", the second passes a string to "float".  Change the first to
> 
>     print eval("float('1.0e-309')")
> 
> and it gives the same bogus error as the second one.
> 
> Then it turns out the error is Microsoft's fault.  This tiny C program shows
> the bug:
> 
> #include <errno.h>
> #include <stdlib.h>
> #include <stdio.h>
> 
> int
> main(void)
> {
>     double x;
>     char* dontcare;
>     errno = 0;
>     x = strtod("1.0e-309", &dontcare);
>     printf("errno after = %d\n", errno);  /* 34 is ERANGE in MS's C runtime */
>     printf("x after = %g\n", x);
>     return 0;
> }
> 
> This prints
> 
>     errno after = 34
>     x after = 0
> 
> when compiled & linked with MS's VC5; don't know about VC6.  Same thing for
> "1.0e-308".  Works fine for "1.0e-307".  Doubt this will get fixed before MS
> fixes their library.

The bizarre thing is that this is broken the same way on Solaris:

>>> 1.0e-309
1.0000000000000019e-309
>>> float("1.0e-309")
Traceback (innermost last):
  File "<stdin>", line 1, in ?
ValueError: float() literal too large: 1.0e-309
>>>

I looked and it turns out that Python uses atof() in the first case
(numeric literal encountered in a Python expression) and strtod() in
the second case (string passed to float()).

Apparently strtod() and atof() differ in implementation, even though
the Solaris man page says "The atof(str) function call is equivalent
to strtod(str, (char **)NULL)."

We could fix this by changing float() to do its own syntax checking
and then use atof()...  Is it worth it?

--Guido van Rossum (home page: http://www.python.org/~guido/)

-------------------------------------------------------

Date: 2000-Jul-31 14:08
By: none

Comment:
From: "Tim Peters" <tim_one@email.msn.com>
Subject: RE: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Fri, 24 Mar 2000 22:48:50 -0500

[atof and strtod act differently given denormals on both Windows &
 Solaris]

[Guido]
> ...
> Apparently strtod() and atof() differ in implementation, even though
> the Solaris man page says "The atof(str) function call is equivalent
> to strtod(str, (char **)NULL)."

Ya, their man page is lying.  atof() existed in the mists of prehistory and
typically did no error checking at all.  IIRC, ANSI C introduced strtod(),
which generally got implemented as a layer of error-checking around atof.

I have to take it back that this is a bug in MS's strtod:  DBL_MIN in MS's
float.h is 2.2250738585072014e-308, so strtod() *should* gripe on non-zero
inputs with absolute value smaller than that.

> We could fix this by changing float() to do its own syntax checking
> and then use atof()...  Is it worth it?

Depends on your goal <wink>:  do you want more extreme cases, like 1e-500,
to blow up (strtod) or underflow to 0 (atof)?  The example in the original
test case is subtler because atof made it *appear* to be "a regular old
number"; in fact, it's not, it's small enough that it falls into 754's
"denormalized" range.  This means the conversion loses some extraordinary
amount of information, but not all of it (whereas 1e-500 is below even the
denorm range:  conversion loses all information).

Without a coherent strategy for dealing with 754 issues, it's hard to decide
which is better.  Since strtod() is more restrictive, if this is worth
bothering about at all now (for P3K I think 754 needs to be taken
seriously), I actually recommend changing current atof() calls to use
native strtod() instead.
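
A minimal sketch of the ranges described above, assuming a current CPython 3
whose float() does its own conversion rather than calling the platform
strtod() (so it shows the values involved, not the 1.5.2 error).  The names
dbl_min and tiny are illustrative only.

```python
import sys

dbl_min = sys.float_info.min              # smallest normalized double (DBL_MIN)
tiny = dbl_min * sys.float_info.epsilon   # smallest denormal, 2**-1074

normal = float("1.0e-307")   # still inside the normalized range
denorm = float("1.0e-309")   # below DBL_MIN: denormalized, but not zero
gone = float("1e-500")       # below even the denormal range

# Denormals are integer multiples of `tiny`, so the bit length of the
# multiplier is the number of significant bits that survived conversion.
bits_left = int(denorm / tiny).bit_length()

print(normal >= dbl_min)         # True
print(0.0 < denorm < dbl_min)    # True
print(gone == 0.0)               # True
print(bits_left, "of", sys.float_info.mant_dig, "bits")
```

So 1.0e-309 keeps part of its precision, while 1e-500 keeps none.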

-------------------------------------------------------

Date: 2000-Jul-31 14:08
By: none

Comment:
From: Guido van Rossum <guido@python.org>
Subject: Re: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Fri, 24 Mar 2000 23:14:29 -0500

> [atof and strtod act differently given denormals on both Windows &
>  Solaris]
> 
> [Guido]
> > ...
> > Apparently strtod() and atof() differ in implementation, even though
> > the Solaris man page says "The atof(str) function call is equivalent
> > to strtod(str, (char **)NULL)."

[Tim]
> Ya, their man page is lying.  atof() existed in the mists of prehistory and
> typically did no error checking at all.  IIRC, ANSI C introduced strtod(),
> which generally got implemented as a layer of error-checking around atof.
> 
> I have to take it back that this is a bug in MS's strtod:  DBL_MIN in MS's
> float.h is 2.2250738585072014e-308, so strtod() *should* gripe on non-zero
> inputs with absolute value smaller than that.
> 
> > We could fix this by changing float() to do its own syntax checking
> > and then use atof()...  Is it worth it?
> 
> Depends on your goal <wink>:  do you want more extreme cases, like 1e-500,
> to blow up (strtod) or underflow to 0 (atof)?  The example in the original
> test case is subtler because atof made it *appear* to be "a regular old
> number"; in fact, it's not, it's small enough that it falls into 754's
> "denormalized" range.  This means the conversion loses some extraordinary
> amount of information, but not all of it (whereas 1e-500 is below even the
> denorm range:  conversion loses all information).
> 
> Without a coherent strategy for dealing with 754 issues, it's hard to decide
> which is better.  Since strtod() is more restrictive, if this is worth
> bothering about at all now (for P3K I think 754 needs to be taken
> seriously), I actually recommend changing  current atof() calls to use
> native strtod() instead.

Hm, I'm not so sure.  Suppose I'm writing a program that reads data
files generated by some Fortran program.  The Fortran program is
giving me points to plot for example.  If Fortran manages to output
1e-500, wouldn't it make more sense if I rounded that to zero instead
of rejecting it?  After all, after converting to plot precision it's
going to be zero anyway.

This way I could almost defend using strtod() for literals in
Python source code (where it makes more sense to warn about underflow)
but atof() for input.  Except that of course input could conceivably
be using eval()...

Another argument for turning underflow into zero is that that also
happens in regular arithmetic:

>>> 0.1**2**8
1.0000000000000275e-256
>>> 0.1**2**9
0.0

I like this uniform behavior: overflow -> exception, underflow ->
zero.  My calculator does this too.

Am I hopelessly naive about this?  What else can we do?  What control
does C give?  What does sscanf() do?

--Guido van Rossum (home page: http://www.python.org/~guido/)

-------------------------------------------------------

Date: 2000-Jul-31 14:08
By: none

Comment:
From: "Tim Peters" <tim_one@email.msn.com>
Subject: RE: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Fri, 24 Mar 2000 23:53:10 -0500

"In the face of ambiguity, refuse the temptation to guess".

That's one of the Pythonic Theses you're tempted to ignore too often <wink>.

Part of taking 754 seriously is that 754 gives the user complete control
over what happens in case of exceptions (including underflow):  ignore them,
raise a fatal error, or simply set a flag saying it occurred.
Unfortunately, we have to wait for C9x until there's a portable way to get
at that stuff.  Before then, it requires wildly varying platform-specific
hair.

> Hm, I'm not so sure.  Suppose I'm writing a program that reads data
> files generated by some Fortran program.  The Fortran program is
> giving me points to plot for example.  If Fortran manages to output
> 1e-500, wouldn't it make more sense if I rounded that to zero instead
> of rejecting it?

This *may* make good sense if Python had certain knowledge that the program
is merely going to plot the points, but probably not even then.  That is,
"insignificantly small" is relative to the application, and e.g. for all we
know the Fortran program generated a million doubles *all* in the range
[1e-500, 10e-500]:  the intended plot of the data could very well be a
pointillistic version of the Mona Lisa rather than a straight line.

> After all, after converting to plot precision it's
> going to be zero anyway.

As above, this conclusion relies on the dubious assumption that 1e-500 is
very much smaller than the other values.

> This way I could almost defend using strtod() for literals in
> Python source code (where it makes more sense to warn about underflow)
> but atof() for input.  Except that of course input could conceivably
> be using eval()...
>
> Another argument for turning underflow into zero is that that also
> happens in regular arithmetic:
>
> >>> 0.1**2**8
> 1.0000000000000275e-256
> >>> 0.1**2**9
> 0.0

Which is often desired but sometimes a disaster -- the language simply can't
guess.  On whatever machine you ran this on, it almost certainly set the
"underflow happened" flag but continued on because the underflow exception
was masked out by default.

> I like this uniform behavior: overflow -> exception, underflow ->
> zero.  My calculator does this too.

Not mine <wink>.  Really, whether underflow gripes is controlled by a
user-settable flag on high end HP calculators.  Note too that neither does
float *overflow* raise an exception under most Pythons today:

D:\Python>python
Python 1.5.42 (#0, Jan 31 2000, 14:05:14) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> 1e500
1.#INF
>>> 1e200**2
1.#INF
>>>

I personally favor raising exceptions (by default) on 754's overflow, divide
by 0, and invalid operation conditions, while (again by default) letting
underflow and inexact pass without comment.  But it again requires fiddling
the HW's 754 control registers to make that happen.  P3K, not now.

> Am I hopelessly naive about this?

Not *entirely* hopeless, but close <wink>.  If we ever talk about it for an
hour, I'll convince you of the futility of fighting 754.  They beat all
resistance out of me in a mere decade <0.5 wink>.

> What else can we do?

Not much!  Switching uniformly to either atof() or strtod() would be OK by
me for now, although I don't think patching over the current inconsistency
buys enough bang for the buck to be worth the effort.

> What control does C give?

None, until C9X.

> What does sscanf() do?

I don't care -- ANSI C predated 754's absolute universal triumph, and ANSI
C's numerics fight the *right* thing to do now just about every step of the
way.  C9x is supposed to fix all that.

In the meantime, I think what JPython does is much more interesting (but
don't know what that is):  whatever we do here should be consistent with The
Other Python too, and Java has a much better 754 story than ANSI C.  754 is
here to stay, but the last iteration of ANSI C isn't.  Best guess is that
Java acts more like atof than strtod in this case.
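
For comparison, a small sketch of how these cases come out on a current
CPython 3 (an assumption about the reader's interpreter; the 1.5.x behaviour
shown above varied by platform).  Underflow passes silently, while overflow
raises only for some operations.

```python
# Underflow: silently becomes zero, in arithmetic and in string conversion.
small = 0.1
print(small ** 2 ** 9)      # 0.0
print(float("1e-500"))      # 0.0

# Overflow in string conversion and in multiplication yields infinity...
big = 1e200
print(float("1e500"))       # inf
print(big * big)            # inf

# ...but float ** int checks the result and raises instead.
try:
    big ** 2
    pow_raised = False
except OverflowError:
    pow_raised = True
print(pow_raised)           # True
```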

-------------------------------------------------------

Date: 2000-Jul-31 14:08
By: none

Comment:
From: Guido van Rossum <guido@python.org>
Subject: Re: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Sat, 25 Mar 2000 14:15:54 -0500

> "In the face of ambiguity, refuse the temptation to guess".
> 
> That's one of the Pythonic Theses you're tempted to ignore too often <wink>.
> 
> Part of taking 754 seriously is that 754 gives the user complete control
> over what happens in case of exceptions (including underflow):  ignore them,
> raise a fatal error, or simply set a flag saying it occurred.
> Unfortunately, we have to wait for C9x until there's a portable way to get
> at that stuff.  Before then, it requires wildly varying platform-specific
> hair.

OK, so let's stick with the defaults that 754 prescribes until we can
give the user control.  That's the purpose of defaults, right?

> > Hm, I'm not so sure.  Suppose I'm writing a program that reads data
> > files generated by some Fortran program.  The Fortran program is
> > giving me points to plot for example.  If Fortran manages to output
> > 1e-500, wouldn't it make more sense if I rounded that to zero instead
> > of rejecting it?
> 
> This *may* make good sense if Python had certain knowledge that the program
> is merely going to plot the points, but probably not even then.  That is,
> "insignificantly small" is relative to the application, and e.g. for all we
> know the Fortran program generated a million doubles *all* in the range
> [1e-500, 10e-500]:  the intended plot of the data could very well be a
> pointillistic version of the Mona Lisa rather than a straight line.

OK, forget the example.

> > After all, after converting to plot precision it's
> > going to be zero anyway.
> 
> As above, this conclusion relies on the dubious assumption that 1e-500 is
> very much smaller than the other values.

I think even 754 tells us that 1e-500 is smaller than what we normally
need to deal with.

> > This way I could almost defend using strtod() for literals in
> > Python source code (where it makes more sense to warn about underflow)
> > but atof() for input.  Except that of course input could conceivably
> > be using eval()...
> >
> > Another argument for turning underflow into zero is that that also
> > happens in regular arithmetic:
> >
> > >>> 0.1**2**8
> > 1.0000000000000275e-256
> > >>> 0.1**2**9
> > 0.0
> 
> Which is often desired but sometimes a disaster -- the language simply can't
> guess.  On whatever machine you ran this on, it almost certainly set the
> "underflow happened" flag but continued on because the underflow exception
> was masked out by default.

Again: 754 gives a default.  I want to conform to the default -- it's
better to provide control, but even when we provide control, there will
still be default behavior, and (if I understand 754 correctly) the
default is not to interrupt for underflow.

> > I like this uniform behavior: overflow -> exception, underflow ->
> > zero.  My calculator does this too.
> 
> Not mine <wink>.  Really, whether underflow gripes is controlled by a
> user-settable flag on high end HP calculators.  Note too that neither does
> float *overflow* raise an exception under most Pythons today:
> 
> D:\Python>python
> Python 1.5.42 (#0, Jan 31 2000, 14:05:14) [MSC 32 bit (Intel)] on win32
> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> >>> 1e500
> 1.#INF
> >>> 1e200**2
> 1.#INF
> >>>

Oops, you're right.  Must be another 754 default behavior!

> I personally favor raising exceptions (by default) on 754's overflow, divide
> by 0, and invalid operation conditions, while (again by default) letting
> underflow and inexact pass without comment.  But it again requires fiddling
> the HW's 754 control registers to make that happen.  P3K, not now.

I would personally also prefer an exception for overflow.  You don't
say what you want for underflow though.  I still like silent underflow
to zero by default.

> > Am I hopelessly naive about this?
> 
> Not *entirely* hopeless, but close <wink>.  If we ever talk about it for an
> hour, I'll convince you of the futility of fighting 754.  They beat all
> resistance out of me in a mere decade <0.5 wink>.

I'm not fighting it.  Say the ideal Python has full user control over
fp exceptions.  What should the defaults be?  Python 1.6 should have
the same defaults, even if it doesn't have the controls.

> > What else can we do?
> 
> Not much!  Switching uniformly to either atof() or strtod() would be OK by
> me for now, although I don't think patching over the current inconsistency
> buys enough bang for the buck to be worth the effort.
> 
> > What control does C give?
> 
> None, until C9X.
> 
> > What does sscanf() do?
> 
> I don't care -- ANSI C predated 754's absolute universal triumph, and ANSI
> C's numerics fight the *right* thing to do now just about every step of the
> way.  C9x is supposed to fix all that.
> 
> In the meantime, I think what JPython does is much more interesting (but
> don't know what that is):  whatever we do here should be consistent with The
> Other Python too, and Java has a much better 754 story than ANSI C.  754 is
> here to stay, but the last iteration of ANSI C isn't.  Best guess is that
> Java acts more like atof than strtod in this case.

Bingo.  Indeed it does.  1e500 prints as Infinity; 1e-500 is 0.0,
either as literal or when converted from a string.

I'll change float() to use atof().

--Guido van Rossum (home page: http://www.python.org/~guido/)

-------------------------------------------------------

Date: 2000-Jul-31 14:08
By: none

Comment:
From: "Tim Peters" <tim_one@email.msn.com>
Subject: RE: [Python-bugs-list] float("1.0e-309") inconsistency on win32 (PR#245)
Date: Tue, 4 Apr 2000 00:29:01 -0400

[Guido]
> OK, so let's stick with the defaults that 754 prescribes until we can
> give the user control.  That's the purpose of defaults, right?

In every world other than 754 (see below), but we really have no choice now
(because C doesn't give us any control now).

> I think even 754 tells us that 1e-500 is smaller than what we normally
> need to deal with.

Well, there is no 1e-500 under 754, which is why there's some reason to at
least warn about it (if the user wanted 0, why didn't they type 0?).

> Again: 754 gives a default.  I want to conform to the default -- it's
> better to provide control, but even when we provide control, there will
> still be default behavior, and (if I understand 754 correctly) the
> default is not to interrupt for underflow.

The 754 default is never to raise an exception no matter what, whether
overflow, underflow, invalid operation (like sqrt(-4)), or divide by 0.  So
Python's current behavior wrt these two is non-conforming:

    math.sqrt(-4)
    1. / 0.

However, 754 is primarily a HW std, and the defaults were prescribed by a
committee of HW geeks and math library authors.  They were caught totally
off guard by how long it took for languages to provide the control features
the std also mandates -- for "regular users" it's plainly insane to avoid
griping about the two above, and it was never 754's intent to let them pass
silently for regular users.

Note that Java has been skewered mercilessly by Kahan (Mr. 754 Himself) for
accepting the defaults but not providing the also-mandated control
functions.  The std is subtler than it appears, and all the fiddly bits are
really needed.

>> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>> >>> 1e500
>> 1.#INF
>> >>> 1e200**2
>> 1.#INF
>> >>>

> Oops, you're right.  Must be another 754 default behavior!

Right.

> I would personally also prefer an exception for overflow.  You don't
> say what you want for underflow though.  I still like silent underflow
> to zero by default.

754 defines (almost) everything:  underflow when the underflow exception is
masked out must deliver a zero with the same sign as the infinite-precision
result.

> I'm not fighting it.

You will <wink>; e.g., returning 1 for "x == x" is incorrect when x is a
NaN.

> Say the ideal Python has full user control over fp exceptions.  What
> should the defaults be?

IMO, exception on overflow, divide-by-0 and invalid operation.  Let
underflow and inexact pass silently.  That's what I implemented at KSR, and
customers were *very* much happier with that than with the 754 defaults.  Of
course I also implemented all the 754 control and status inquiry functions,
so the one 754-savvy programmer per site was happy too (they need the
control to write robust numerical libraries for everyone else to use -- the
*true* purpose of the 754 HW defaults).

> Python 1.6 should have the same defaults, even if it doesn't have the
> controls.

This is a big project, as there's no portable way even to detect fp overflow
now.  The math libraries should play along too.

> I'll change float() to use atof().

OK by me!
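
The points in this message can be checked from Python itself.  A minimal
sketch, assuming a current CPython 3: the two operations called
non-conforming above still raise, underflow keeps the sign of the exact
result, and a NaN compares unequal to itself.  The helper name raises is
made up for this sketch.

```python
import math

def raises(exc, func, *args):
    """Return True if func(*args) raises exc (helper for this sketch only)."""
    try:
        func(*args)
    except exc:
        return True
    return False

# Invalid operation and divide-by-zero raise instead of returning NaN or inf.
print(raises(ValueError, math.sqrt, -4))               # True
print(raises(ZeroDivisionError, lambda: 1.0 / 0.0))    # True

# Underflow delivers a zero that carries the sign of the exact result.
neg_zero = float("-1e-500")
print(neg_zero == 0.0, math.copysign(1.0, neg_zero))   # True -1.0

# 754 comparison semantics: a NaN is not equal to itself.
nan = float("nan")
print(nan == nan)                                      # False
```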

-------------------------------------------------------

Date: 2000-Aug-03 05:43
By: twouters

Comment:
I *think* this one is fixed and closed. It looks like Guido promises to fix this, in any case, and it looks done.

-------------------------------------------------------

Date: 2000-Aug-10 11:48
By: gvanrossum

Comment:
No, it's not fixed, but it is platform dependent how it behaves.
The conclusion was that we should use atof() everywhere, and write a separate syntax checker (since atof() stops at the first invalid character).
I made a start at a syntax checker but then got distracted. Here's my code:

static char *
floatsyntax(char *s)
{
	/* Check for valid floating point syntax:
	   space*
	   [sign]
	   (digit+ [period digit*] | period digit+)
	   [(e|E) [sign] digit+]
	   space*
	*/
	int digits, period;

	while (isspace(*s))
		s++;
	if (*s == '+' || *s == '-')
		s++;
	digits = period = 0;
	for (;;) {
		if (isdigit(*s))
			digits++;
		else if (*s == '.') {
			if (period)
				return NULL;
			period++;
		}
		else
			break;
	}
	if (!digits)
		return NULL;
	if (*s == 'e' || *s == 'E') {
		s++;
		if (*s == '+' || *s == '-')
			s++;
		digits = 0;
		while (isdigit(*s))
			digits++;
		if (!digits)
			return NULL;
	}
	return s;
}

-------------------------------------------------------

Date: 2000-Aug-10 11:51
By: gvanrossum

Comment:
Shit. SF removes leading whitespace. Oh well, mail me for a properly formatted version of that code.

-------------------------------------------------------

Date: 2000-Aug-10 21:57
By: tim_one

Comment:
It's curious that in the change mail SF generated, leading indentation was *not* lost.  This must be a browser display thing.
Anyway, by eyeball the syntax checker has two bugs:

1. Infinite loop when looking at an exponent.

x   while (isdigit(*s)) digits++; 

should be

x   while (isdigit(*s)) {digits++; s++;}

2. Like atof, stops at an invalid character.  Before the

x   return s;

it should have, e.g.,

x   while (isspace(*s)) ++s;
x   if (*s) return NULL;

although I'm not sure what the assumptions are about the input to this function.
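
For comparison, a Python sketch of the same grammar with both fixes applied.
Note that the mantissa loop in the C code never advances s either, so it
needs the same s++ as the exponent loop.  The name float_syntax_ok and the
use of a regular expression are choices made for this sketch, not part of the
patch; [0-9] stands in for isdigit() and [ \t\n\r\f\v] for isspace() as they
behave in the C locale.

```python
import re

# space* [sign] (digit+ [period digit*] | period digit+)
# [(e|E) [sign] digit+] space*
_FLOAT_RE = re.compile(
    r"[ \t\n\r\f\v]*"
    r"[+-]?"
    r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)"
    r"(?:[eE][+-]?[0-9]+)?"
    r"[ \t\n\r\f\v]*"
)

def float_syntax_ok(s):
    """Return True only if the whole string matches the grammar above."""
    return _FLOAT_RE.fullmatch(s) is not None

print(float_syntax_ok(" 1.0e-309 "))   # True
print(float_syntax_ok("1.0e"))         # False: exponent needs digits
print(float_syntax_ok("1.0x"))         # False: trailing junk is rejected
print(float_syntax_ok("."))            # False: at least one digit required
```

Matching the whole string is what covers the second bug: a checker that stops
at the first invalid character would accept "1.0x".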

-------------------------------------------------------

For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=110624&group_id=5470