[Python-Dev] RE: Possible bug (was Re: numpy, overflow, inf, ieee, and rich comparison)

Huaiyu Zhu huaiyu_zhu@yahoo.com
Wed, 11 Oct 2000 16:33:08 -0700 (PDT)


On the issue of whether Python should ignore over/underflow on IEEE-enabled
platforms: 

[Tim Peters]
> That would stop the exception on exp() underflow, which is what you're
> concerned about.  It would also stop exceptions on exp() overflow, and on
> underflow and overflow for all other math functions too.  I doubt Guido will
> ever let Python ignore overflow by default, #ifdef'ed or not.  A semantic
> change that jarring certainly won't happen for 2.0 (which is just a week
> away).

It can be argued that on IEEE-enabled systems the proper thing to do for
overflow is simply to return Inf.  Raising an exception is WRONG.  See below.

[Guido van Rossum]
> Incidentally, math.exp(800) returns inf in 1.5, and raises
> OverflowError in 2.0.  So at least it's consistent.

That is not consistent at all.  Suppose I'm plotting the curve f(x) where
the x values include some singular points of f.  In the first case the plot
works with some portion of the curve clipped.  In the second case it bombs.

[Tim Peters] 
> Nothing like that will happen without a PEP first.  I would like to see
> *serious* 754 support myself, but that depends too on many platform experts
> contributing real work (if everyone ran WinTel, I could do it myself
> <wink>).

[Guido van Rossum]
> Bingo!
> 
> 1.5.2 links with -lieee while 2.0 doesn't.  Removing -lieee from the
> 1.5.2 link line makes it raise OverflowError too.  Adding it to the
> 2.0 link line makes it return 0.0 for exp(-1000) and inf for
> exp(1000).

If using ieee is as simple as setting such a flag, there is no reason at all
not to use it.  Here are some more examples:

Suppose you have done hours of computation on a problem.  Just as you are
about to get the result, you get an exception.  Why?  Because the residual
error is too close to zero.

Suppose you want to plot the curve of a Gaussian distribution.  Oops, it
fails, because beyond a certain region the value is near zero.

With these kinds of problems, vectorized numerical calculation becomes
nearly impossible.  How do you work in such an environment?  You have to
wrap every calculation in a try/except structure, and whenever there is an
exception, you have to revert to elementwise operations.  In practice this
simply means Python would not be suitable for numerical work at all.
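
As a rough sketch of the workaround described above, assuming a Python
whose math.exp raises OverflowError on overflow: every element has to be
evaluated inside its own try/except.  The helper name exp_all is
hypothetical, chosen only for illustration.

```python
import math

def exp_all(values):
    """Apply exp to each element, substituting inf where exp overflows."""
    results = []
    for v in values:
        try:
            results.append(math.exp(v))
        except OverflowError:
            # The exception aborts only this element; fall back to the
            # value an IEEE-style exp would have returned.
            results.append(math.inf)
    return results
```

The loop exists only to catch the exception element by element, which is
the cost being objected to here.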

What about the other way round?  No problem.  It is easy to write functions
like isNaN, isInf, etc.  With these one can raise exceptions in any place
one wants.  It is even possible to raise exceptions if a matrix is singular
to a certain precision, etc.
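
A minimal sketch of that approach, assuming a Python whose math module
provides isnan and isinf.  The names isNaN and isInf come from the
paragraph above; require_finite is a hypothetical helper showing how a
caller could choose where to raise.

```python
import math

def isNaN(x):
    # True only for a not-a-number value.
    return math.isnan(x)

def isInf(x):
    # True for positive or negative infinity.
    return math.isinf(x)

def require_finite(values):
    """Raise ValueError if any element is Inf or NaN, else return values."""
    for i, v in enumerate(values):
        if isNaN(v) or isInf(v):
            raise ValueError("non-finite value %r at index %d" % (v, i))
    return values
```

The check runs only where the caller asks for it, after the whole
computation has finished, rather than on every intermediate operation.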

The key point to observe here is that most numerical work involves more than
one element.  Some of them may be out of machine bounds but the whole thing
could still be quite meaningful.  Vice versa, it is also quite possible that
all elements are within bounds while the whole thing is meaningless.  The
language should never take over or subvert decisions based on numerical
analysis.

[Tim Peters] 
> Ignoring ERANGE entirely is not at all the same behavior as 1.5.2, and
> current code certainly relies on detecting overflows in math functions. 

As Guido observed, ERANGE is not generated with ieee, even for overflow.  So
it is the same behavior as 1.5.2.  Besides, no correct numerical code should
depend on exceptions like this unless the machine is incapable of handling
Inf and NaN.

Even in the cases where you do want to detect overflow, it is still wrong to
use exceptions.  Here's an example: x*log(x) approaches 0 as x approaches 0.
If x==0 then log(x)==-Inf but 0*-Inf==NaN, not what one would want.  But an
exception is the wrong tool to handle this, because if x is an array, some
of its elements may be zero but others may not.  The right way to do it is
something like

from math import log

def entropy(probability):
  # Clamp to a tiny positive value so that log() never sees zero.
  p = max(probability, 1e-323)
  return p*log(p)
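
The function above takes a single number.  A hedged sketch of the same
clamping idea applied element by element to a plain Python list follows;
the name entropy_terms and the floor parameter are assumptions made for
illustration, not part of any existing library.

```python
import math

def entropy_terms(probabilities, floor=1e-323):
    """Return p*log(p) for each element, clamping p to `floor` first."""
    terms = []
    for prob in probabilities:
        # The clamp keeps log() away from zero, so no element can
        # produce 0 * -Inf == NaN.
        p = max(prob, floor)
        terms.append(p * math.log(p))
    return terms
```

An input element of 0.0 then contributes a negligibly small value instead
of NaN, and the other elements are unaffected.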

[Tim Peters]
> In no case can you expect to see overflow ignored in 2.0.

You are proposing a dramatic change from the behavior of 1.5.2.  This looks
to me like it needs a PEP and a big debate.  It would break a LOT of
numerical computations.

[Thomas Wouters]
> I remember the patch that did this, on SF. It was titled "don't link with
> -lieee if it isn't necessary" or something. Not sure what it would break,
> but mayhaps declaring -lieee necessary on glibc systems is the right fix ?
>
> (For the non-autoconf readers among us: the first snippet writes a test
> program to see if the function '__fpu_control' exists when linking with
> -lieee in addition to $LIBS, and if so, adds -lieee to $LIBS. The second
> snippet writes a test program to see if the function '__fpu_control'
> exists with the current collection of $LIBS. If it doesn't, it tries it
> again with -lieee,
>
> Personally, I think the patch should just be reversed... The comment above
> the check certainly could be read as 'Linux requires -lieee for correct
> f.p. operations', and perhaps that's how it was meant.

The patch as described seems to be based on flawed thinking.  The numbers
Inf and NaN are always necessary.  The -lieee flag could only be unnecessary
if the behavior without it were the same as IEEE.  Obviously it isn't.  So I
second Thomas's
suggestion. 

[Tim Peters]
> If no progress is made on determining the true cause over the next few days,
> I'll hack mathmodule.c to ignore ERANGE in the specific case the result
> returned is a zero (which would "fix" your exp underflow problem without
> stopping overflow detection).  Since this will break code on any platform
> where errno was set to ERANGE on underflow in 1.5.2, I'll need to have a
> long discussion w/ Guido first.  I *believe* that much is actually sellable
> for 2.0, because it moves Python in a direction I know he likes regardless
> of whether he ever becomes a 754 True Believer.

[Guido van Rossum]
> No, the configure patch is right.  Tim will check in a change that
> treats ERANGE with a return value of 0.0 as underflow (returning 0.0,
> not raising OverflowError).

What is the reason to do this?  It looks like intentionally subverting ieee
even when it is available.  I thought Tim meant that only logistical
problems prevented using ieee.

If you do choose this route, please please please ignore ERANGE entirely,
whether the return value is zero or not.

The only possible way that ERANGE could be useful at all is if it could be
set independently for each element of an array, and if it behaved as a
warning instead of an exception, i.e. the calculation would continue if it
is not caught.  Well, then, Inf and NaN serve this purpose perfectly.

It is very reasonable to set errno in glibc for this; it is completely
unreasonable to raise an exception in Python, because exceptions cannot be
attached to individual elements and they cannot be ignored.


Huaiyu
-- 
Huaiyu Zhu                       hzhu@users.sourceforge.net
Matrix for Python Project        http://MatPy.sourceforge.net