[Python-Dev] RE: [Numpy-discussion] RE: Possible bug (was Re: numpy, overflow, inf, ieee, and rich comparison)
Tim Peters
tim_one@email.msn.com
Sun, 15 Oct 2000 16:35:27 -0400
[cbarker@jps.net]
>> I've come to Python from MATLAB for numerics, and I really appreciated
>> MATLAB's way of handling all this. I don't think MATLAB has true 754 ...
[Konrad Hinsen]
> MATLAB is fine for simple interactive data processing, and its
> behaviour is adapted to this task. I don't think anyone would use
> MATLAB for developing complex algorithms, its programming language
> isn't strong enough for that. Python is, so complex algorithms have to
> be considered as well. And for that kind of application, continuing a
> calculation with Infs and NaNs is a debugging nightmare.
A non-constructive (because futile) speculation: the first time I saw what
754 was intending to do, my immediate reaction was "hmm -- the exponent
field is too narrow!". With a max (double) val in the ballpark of 1e300,
you get an infinity as soon as you square something as small as 1e150, and
once you get a few infinities, NaNs are sure to follow (due to Inf-Inf and
Inf/Inf). The dynamic range for 754 singles is much smaller still.
There's no doubt about Cray arithmetic being hard to live with, but while
Mr. Cray didn't worry about proper rounding, he did devote 15 bits to Cray's
exponent field (vs. 11 for 754). As a result, overflows were generally a
non-issue on Cray boxes, and *nobody* complained (in my decade there) about
Cray HW raising a fatal exception if one occurred.
In return, you got only 48 bits of precision (vs. 53 for 754). But, for
most physical problems, how accurate are the inputs? 10 bits on a good day,
20 bits on a great day? Crays worked despite their sloppy numerics because,
for most problems to which they were applied, they carried more than twice
the precision *and* dynamic range than the final results needed.
>> I have not yet heard a decent response to the question of what to do
>> when a single value in a large array is bad, and causes an exception.
I'd usually trace back and try to figure out how it got "bad" to begin with
...
> I'd like to have at least the option of raising an exception in that
> case. Note that this is not what NumPy does today.
Does NumPy use the fpectl module? I suppose not. LLNL contribued that
code, but we hear very little feedback on it. It arranges to (in
platform-dependent ways) convert the 754 overflow, divide-by-0 and
invalid-operation signals into Python exceptions. In core Python, this is
accomplished at the extraordinary expense of doing a setjmp before, and a
function call + double->int conversion after, every single fp operation. A
chief problem is that SIGFPE is the only handle C gives us on fp "errors",
and the C std does not allow returning from a SIGFPE handler (the result of
trying to is undefined, and indeed varies wildly across platforms); so if
you want to regain control, you have to longjmp out of the handler.
The NumPy implementation could use the PyFPE_START_PROTECT and
PyFPE_END_PROTECT macros to brackete entire array operations, though, and so
pay for the setjmp etc only once per array op. This is difficult stuff, but
doable.
>> >> sqrt(-1)
>> ans =
>> 0 + 1.0000i
>>
>> Hey! I like that! Python is dynamic, why can't we just get the actual
>> answer to sqrt(-1), without using cmath, that always returns a complex ?
> For the same reason that makes 2/3 return zero instead of a float
> division result. Like C or Fortran, Python treats integers, floats,
> and complex numbers as different data types.
You know I'm in general agreement with you on this one, but I have to draw a
distinction here: Guido thinks that 2/3 returning 0 was a design mistake,
but not that math.sqrt(-1) raising an exception is a mistake. Most Python
users won't know what to do with a complex number, so it's "an error" to
them. I would like to view this in P3K (if not earlier) as being akin to
754 exceptions: some people are delighted to have 1e300**2 return +Inf,
while others want to see OverflowError instead, and still others want to see
+Inf *and* have a sticky flag set saying "an overflow occurred". We could
treat f(x) (for f == sqrt and everything else) the same way wrt to a new
ComplexResultFromNonComplexInputsError: define "non-stop" complex results,
let the user choose whether to do nonstop or raise an exception, and a
supply a sticky flag saying whether or not any complex results were
generated from non-complex inputs.
> ...
> The "right" solution, in my opinion, would be to have a single
> "number" type of which integers, float, and complex numbers are simply
> different internal representations. But such a change cannot be
> introduced without breaking a lot of code.
The current distinction between ints and longs (unbounded ints) should also
get swallowed by this. A way to get from here to there is especially
irksome at the C API level, since, e.g., many C API functions pass Python
ints as C longs directly. A smaller number pass Python floats as C doubles
directly.
it's-wonderful-that-p3k-will-solve-everything<wink>-ly y'rs - tim