
On Wed, May 31, 2023 at 3:51 PM Benjamin Root <ben.v.root@gmail.com> wrote:
I think it is the special values aspect that is most concerning. Math is just littered with all sorts of identities, especially with trig functions. While I know that floating point calculations are imprecise, there are certain properties of these functions that should still hold, such as the outputs staying within [-1, 1].
As a reference point on an M1 Mac using conda-forge:

```
>>> import numpy as np
>>> np.__version__
'1.24.3'
>>> np.sin(0.0)
0.0
>>> np.cos(0.0)
1.0
>>> np.sin(np.pi)
1.2246467991473532e-16
>>> np.cos(np.pi)
-1.0
>>> np.sin(2*np.pi)
-2.4492935982947064e-16
>>> np.cos(2*np.pi)
1.0
```
Not perfect, but still right in most places.
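A property-style check along these lines (just a sketch; the input range and sample size are arbitrary) might look like:

```
import numpy as np

# Sample a wide range of inputs and check the properties Ben mentions.
x = np.random.default_rng(0).uniform(-1e6, 1e6, size=100_000)
s, c = np.sin(x), np.cos(x)
print(np.abs(s).max() <= 1.0, np.abs(c).max() <= 1.0)  # outputs stay in [-1, 1]
print(np.abs(s**2 + c**2 - 1.0).max())                 # Pythagorean identity, ideally off by only a few ULP
```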
I would say these are all correct. The true value of sin(np.pi) *is* 1.2246467991473532e-16 (to full double precision). Remember np.pi is not π, but just a rational approximation to it. You can see this with mpmath (note the use of np.pi, which is a fixed float with 15 digits of precision):
```
>>> import mpmath
>>> mpmath.mp.dps = 100
>>> np.pi
3.141592653589793
>>> mpmath.sin(np.pi)
mpf('0.0000000000000001224646799147353177226065932274997997083053901299791949488257716260869609973258103775093255275690136556')
>>> float(_)
1.2246467991473532e-16
```
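That tiny value is exactly what the representation error predicts: since sin(π − δ) = sin(δ) ≈ δ for small δ, sin(np.pi) comes out to roughly π − np.pi. Continuing the same session:

```
>>> float(mpmath.pi - np.pi)   # representation error of np.pi
1.2246467991473532e-16
```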
On the other hand, the "correct" floating-point value of sin(np.pi/2) is exactly 1.0, because the true value rounds to 1.0 in double precision:
```
>>> mpmath.sin(np.pi/2)
mpf('0.9999999999999999999999999999999981253002716726780066919054431429030069600365417183834514572043398874406')
>>> float(mpmath.sin(np.pi/2))
1.0
```
That's 32 9's after the decimal point. (Things work out more nicely at the 1s than at the 0s because the derivatives of the trig functions are zero there, meaning the Taylor series has a quadratic error term rather than a linear one.)

Aaron Meurer
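To make the Taylor-series point above concrete, a quick check (continuing with mpmath): the deviation of sin(np.pi/2) from 1 matches ε²/2, where ε is the error of np.pi/2 as an approximation of π/2:

```
>>> eps = mpmath.pi/2 - mpmath.mpf(np.pi)/2  # ~6.123e-17
>>> eps**2 / 2                               # ~1.8747e-33
>>> 1 - mpmath.sin(mpmath.mpf(np.pi)/2)      # same ~1.8747e-33, hence the 32 9's
```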
I'm ambivalent about reverting. I know I would love speed improvements because transformation calculations in GIS are slow using numpy, but also some coordinate transformations might break because of these changes.
Ben Root
On Wed, May 31, 2023 at 11:40 AM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Wed, May 31, 2023 at 9:12 AM Robert Kern <robert.kern@gmail.com> wrote:
On Wed, May 31, 2023 at 10:40 AM Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Wed, May 31, 2023 at 4:19 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Wed, May 31, 2023 at 8:05 AM Robert Kern <robert.kern@gmail.com> wrote:
I would much, much rather have the special functions in the `np.*` namespace be more accurate than fast on all platforms. These would not have been on my list for general purpose speed optimization. How much time is actually spent inside sin/cos even in a trig-heavy numpy program? And most numpy programs aren't trig-heavy, but the precision cost would be paid and noticeable even for those programs. I would want fast-and-inaccurate functions to be strictly opt-in for those times that they really paid off. Probably by providing them in their own module or package rather than a runtime switch, because it's probably only a part of my program that needs that kind of speed and can afford that precision loss while there will be other parts that need the precision.
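For illustration, such a strictly opt-in arrangement could look something like the sketch below; `numpy.fastmath` is purely hypothetical and does not exist in NumPy:

```
import numpy as np
# from numpy import fastmath  # hypothetical opt-in namespace, not a real module

def hot_inner_loop(angles):
    # The one part of a program that can trade a few ULP for speed opts in
    # explicitly (it would call the hypothetical fastmath.sin here).
    return np.sin(angles)  # stand-in for the hypothetical fastmath.sin

def everything_else(angles):
    # All other code keeps the default, accurate np.sin, with no global
    # runtime switch changing its behavior.
    return np.sin(angles)
```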
I think that would be a good policy going forward.
There's a little more to it than "precise and slow good", "fast == less accurate == bad". We've touched on this when SVML got merged (e.g., [1]) and with other SIMD code, e.g. in the "Floating point precision expectations in NumPy" thread [2]. Even libm doesn't guarantee the best possible result of <0.5 ULP max error, and there are also considerations like whether any numerical errors are normally distributed around the exact mathematical answer or not (see, e.g., [3]).
If we had a portable, low-maintenance, high-accuracy library that we could vendorize (and its performance cost wasn't horrid), I might even advocate that we do that. Reliance on platform libms is more about our maintenance burden than a principled accuracy/performance tradeoff. My preference is definitely firmly on the "precise and slow good" side for these ufuncs because of the role these ufuncs play in real numpy programs; performance has a limited, situational effect while accuracy can have substantial effects across the board.
It seems fairly clear that with this recent change, the feeling is that the tradeoff is bad and that too much accuracy was lost for not enough real-world gain. However, we have now had several years' worth of performance work with few complaints about accuracy issues.
Except that we get a flurry of complaints now that they actually affect popular platforms. I'm not sure I'd read much into a lack of complaints before that.
So I wouldn't throw out the baby with the bath water now and say that we always want the best accuracy only. It seems to me like we need a better methodology for evaluating changes. Contributors have been pretty careful, but looking back at SIMD PRs, there were usually detailed benchmarks but not always detailed accuracy impact evaluations.
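For what it's worth, an accuracy-impact evaluation could be as simple as the following sketch (reference values via mpmath; the input range and sample size are arbitrary):

```
import numpy as np
import mpmath

mpmath.mp.dps = 50
xs = np.random.default_rng(42).uniform(-1e3, 1e3, size=1_000)

def ulp_error(computed, exact):
    # |computed - exact| measured in units of the last place near the exact value
    return float(abs(mpmath.mpf(float(computed)) - exact)) / abs(np.spacing(float(exact)))

worst = max(ulp_error(np.sin(x), mpmath.sin(mpmath.mpf(x))) for x in xs)
print(f"max observed np.sin error: {worst:.2f} ULP")
```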
I've only seen micro-benchmarks testing the runtime of individual functions, but maybe I haven't paid close enough attention. Have there been any benchmarks on real(ish) programs that demonstrate what utility these provide in even optimistic scenarios? I care precisely <1ULP about the absolute performance of `np.sin()` on its own. There are definitely programs that would care about that; I'm not sure any of them are (or should be) written in Python, though.
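To be concrete about what a real(ish) benchmark could look like, something like this sketch (the spherical-to-Cartesian workload and sizes are made up) would at least show what fraction of a trig-heavy pipeline is actually spent in trig:

```
import numpy as np
import timeit

rng = np.random.default_rng(0)
lon = rng.uniform(-np.pi, np.pi, size=1_000_000)
lat = rng.uniform(-np.pi/2, np.pi/2, size=1_000_000)

def to_unit_vectors(lon, lat):
    # Spherical -> Cartesian coordinates: trig mixed with the multiplies
    # and stacking that surround it in realistic pipelines.
    clat = np.cos(lat)
    return np.stack([clat * np.cos(lon), clat * np.sin(lon), np.sin(lat)])

t_full = timeit.timeit(lambda: to_unit_vectors(lon, lat), number=10)
t_trig = timeit.timeit(
    lambda: (np.sin(lon), np.cos(lon), np.sin(lat), np.cos(lat)), number=10
)
print(f"full workload: {t_full:.2f}s, trig calls alone: {t_trig:.2f}s")
```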
One of my takeaways is that there are special values where more care should be taken. Given the inherent inaccuracy of floating point computation, it can be argued that there should be no such expectation, but here we are. Some inaccuracies are more visible than others.
I think it is less intrusive to have the option to lessen precision when more speed is needed than the other way around. Our experience is that most users are unsophisticated when it comes to floating point, so we should minimise the consequences for those users.
Chuck