RE: [Numpy-discussion] PEP 209: Multi-dimensional Arrays

Paul Barrett writes:
> Rob W. W. Hooft writes:
> > Being a scientist, I have learned that when you multiply a very accurate number with a very approximate number, your result is going to be very approximate, not very accurate! It would thus be more logical to have Float32*Float64 return a Float32!
>
> If numeric precision was all that mattered, then you would be correct. But numeric range is also important. I would hate to take the chance of overflowing the above multiplication because I stored the result as a Float32, instead of a Float64, even though the Float64 is overkill in terms of precision. FORTRAN has made an attempt to address this issue in FORTRAN 9X by allowing the user to indicate the range and precision of the calculation.
A number in a floating point representation is not necessarily represented inexactly. The discussion of Barrett and Hooft is confusing the distinct concepts of precision and accuracy.

Well worth reading is Kahan's scathing criticism of Java's floating-point model, at least some of which relates directly to that of Python or proposals in PEPs 209 and 228.

http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

See p18 for "definitions" of precision and accuracy. There's a lot more material in the literature and on Kahan's web-site, and the following is an excellent discussion of floating point arithmetic and the IEEE standards.

http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf

With regard to the treatment of errors: Correct and detailed handling of floating-point exceptions need not impact speed, provided that a mechanism is provided to (en/dis)able each exception. Users not interested in exceptions can simply mask them. I recall relevant prior discussion including constructive comments from Tim Peters.

Many modern and efficient numerical algorithms, and also effective debugging of numerical programs that use large datasets, *require* accurate and prompt identification of exceptions. Accurate meaning that the arrays, their indices, the operation, traceback and type of exception must be reported. Delayed reporting of errors is not satisfactory since operations performed in the interim may destroy valuable data, or take a very long time (esp. if many exceptions are being generated).

It is probably unreasonable to ask for more than the capabilities provided by some subset of the still platform-dependent optimizing compilers used to implement Python/Numpy, but I don't see why we should have much less.

I would encourage the developers of PEPs 209 and 228 to submit their designs for review by a panel of professional numerical analysts (not just numerically literate programmers or scientists).
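As a rough illustration of per-exception (en/dis)abling and prompt reporting, here is a minimal sketch built on the standard-library `decimal` module. That module is only a stand-in: it is decimal rather than binary arithmetic, and it is not the mechanism proposed in PEP 209 or PEP 228, but it does expose a separate trap and a sticky flag for each signal. The function `divide_all` and its `trap` parameter are hypothetical names used for this example.

```python
from decimal import Decimal, DivisionByZero, localcontext


def divide_all(numerators, denominators, trap):
    """Divide element-wise, with the DivisionByZero trap enabled or masked.

    Returns (results, flagged), where flagged reports whether the sticky
    DivisionByZero flag was raised at any point during the loop.
    """
    results = []
    with localcontext() as ctx:
        ctx.traps[DivisionByZero] = trap  # enable or mask this one exception
        ctx.clear_flags()
        for i, (n, d) in enumerate(zip(numerators, denominators)):
            try:
                results.append(Decimal(n) / Decimal(d))
            except DivisionByZero as exc:
                # Prompt report: name the index and the operands immediately.
                raise ZeroDivisionError(
                    f"division by zero at index {i}: {n} / {d}"
                ) from exc
        flagged = bool(ctx.flags[DivisionByZero])
    return results, flagged


# Masked: the computation continues and yields Infinity; only the flag records it.
values, flagged = divide_all([1, 2], [4, 0], trap=False)
print(values, flagged)

# Enabled: the failure is reported at once, together with the offending index.
try:
    divide_all([1, 2], [4, 0], trap=True)
except ZeroDivisionError as err:
    print(err)
```

With the trap masked, the caller pays only for a flag check after the loop; with it enabled, the error surfaces before any later operation can overwrite data.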
While full IEEE 754 within Python or NumPy may still be just a pipe-dream (for some at least), we can at least take a step closer.

Robert

Robert Harrison
Pacific Northwest National Laboratory
Richland, Washington 99352
(509) 375-2037
robert.harrison@pnl.gov