I am a user, not a numerics type, so this is undoubtedly a naive question.
Might the sin function be written to give greater accuracy for large real numbers? It seems that significant digits are in some sense being discarded needlessly. I.e., compare these:
sin(linspace(0,10*pi,11))
array([ 0.00000000e+00, 1.22460635e-16, -2.44921271e-16, 3.67381906e-16, -4.89842542e-16, 6.12303177e-16, -7.34763812e-16, 8.57224448e-16, -9.79685083e-16, 1.10214572e-15, -1.22460635e-15])
sin(linspace(0,10*pi,11)%(2*pi))
array([ 0.00000000e+00, 1.22460635e-16, 0.00000000e+00, 1.22460635e-16, 0.00000000e+00, 1.22460635e-16, 0.00000000e+00, 1.22460635e-16, 0.00000000e+00, 1.22460635e-16, 0.00000000e+00])
Just wondering.
Cheers, Alan Isaac
Alan G Isaac wrote:
I am a user, not a numerics type, so this is undoubtedly a naive question.
Might the sin function be written to give greater accuracy for large real numbers? It seems that significant digits are in some sense being discarded needlessly.
Not really. The floating point representation of pi is not exact. The problem only gets worse when you multiply it with something. The method you showed of using % (2*pi) is only accurate when the values are created by multiplying the same pi by another value. Otherwise, it just introduces another source of error, I think.
This is one of the few places where versions of the trig functions that operate directly on degrees are preferred. 360.0*n is exactly representable by floating point arithmetic until n~=12509998964918 (give or take a power of two), and % 360 on such values can be done exactly.
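The exactness of degree-based reduction is easy to verify; a small sketch in plain Python (the particular numbers are mine, chosen for illustration):

```python
import math

# 360.0*n stays exactly representable even for very large integer n,
# and the % 360 reduction on such values is then exact.
n = 10**12
x = 360.0 * n + 45.0       # exact in double precision
assert x % 360.0 == 45.0   # the reduction loses nothing

# The radian analogue cannot be exact: 2*pi*n already carries the error
# of the pi constant, magnified by n.
y = 2 * math.pi * n + math.pi / 4
print(math.sin(math.radians(x % 360.0)))  # sine of exactly 45 degrees
print(math.sin(y))                        # polluted by the error in 2*pi*n
```

The degree path stays exact because both 360.0*n and 45.0 fit in the mantissa and % is computed without rounding on such operands.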
On Thu, 25 May 2006, Robert Kern apparently wrote:
The method you showed of using % (2*pi) is only accurate when the values are created by multiplying the same pi by another value. Otherwise, it just introduces another source of error, I think.
Just to be clear, I meant not (!) to presumptuously propose a method for improving things, but just to illustrate the issue: both the loss of accuracy, and the obvious conceptual point that there is (in an abstract sense, at least) no need for sin(x) and sin(x + 2*pi) to differ.
Thanks, Alan
Alan G Isaac wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
The method you showed of using % (2*pi) is only accurate when the values are created by multiplying the same pi by another value. Otherwise, it just introduces another source of error, I think.
Just to be clear, I meant not (!) to presumptuously propose a method for improving things, but just to illustrate the issue: both the loss of accuracy, and the obvious conceptual point that there is (in an abstract sense, at least) no need for sin(x) and sin(x + 2*pi) to differ.
But numpy doesn't deal with abstract senses. It deals with concrete floating point arithmetic. The best value you can *use* for pi in that expression is not the real irrational π. And the best floating-point algorithm you can use for sin() won't (and shouldn't!) assume that sin(x) will equal sin(x + 2*pi).
That your demonstration results in the desired exact 0.0 for multiples of 2*pi is an accident. The results for values other than integer multiples of pi will be as wrong or more wrong. It does not demonstrate that floating-point sin(x) and sin(x + 2*pi) need not differ.
On Thu, 25 May 2006, Robert Kern apparently wrote:
But numpy doesn't deal with abstract senses. It deals with concrete floating point arithmetic.
Of course.
That your demonstration results in the desired exact 0.0 for multiples of 2*pi is an accident. The results for values other than integer multiples of pi will be as wrong or more wrong.
It seems that a continuity argument should undermine that as a general claim. Right?
But like I said: I was just wondering if there was anything exploitable here.
Thanks, Alan
Alan G Isaac wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
That your demonstration results in the desired exact 0.0 for multiples of 2*pi is an accident. The results for values other than integer multiples of pi will be as wrong or more wrong.
It seems that a continuity argument should undermine that as a general claim. Right?
What continuity? This is floating-point arithmetic.
[~]$ bc -l
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
s(1000000)
-.34999350217129295211765248678077146906140660532871
[~]$ python
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
from numpy import *
sin(1000000.0)
-0.34999350217129299
sin(1000000.0 % (2*pi))
-0.34999350213477698
But like I said: I was just wondering if there was anything exploitable here.
Like I said: not really.
On Thu, 25 May 2006, Robert Kern apparently wrote:
What continuity? This is floating-point arithmetic.
Sure, but a continuity argument suggests (in the absence of specific floating point reasons to doubt it) that a better approximation at one point will mean better approximations nearby. E.g.,
epsilon = 0.00001
sin(100*pi+epsilon)
9.999999976550551e-006
sin((100*pi+epsilon)%(2*pi))
9.9999999887966145e-006
Compare to the bc result of 9.9999999998333333e-006
bc 1.05
Copyright 1991, 1992, 1993, 1994, 1997, 1998 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
epsilon = 0.00001
s(100*pi + epsilon)
.00000999999999983333333333416666666666468253968254
Cheers, Alan
Alan G Isaac wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
What continuity? This is floating-point arithmetic.
Sure, but a continuity argument suggests (in the absence of specific floating point reasons to doubt it) that a better approximation at one point will mean better approximations nearby. E.g.,
epsilon = 0.00001
sin(100*pi+epsilon)
9.999999976550551e-006
sin((100*pi+epsilon)%(2*pi))
9.9999999887966145e-006
Compare to the bc result of 9.9999999998333333e-006
bc 1.05
Copyright 1991, 1992, 1993, 1994, 1997, 1998 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
epsilon = 0.00001
s(100*pi + epsilon)
.00000999999999983333333333416666666666468253968254
You aren't using bc correctly.
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
100*pi
0
If you know that you are epsilon from n*2*π (the real number, not the floating point one), you should just be calculating sin(epsilon). Usually, you do not know this, and % (2*pi) will not tell you this. (100*pi + epsilon) is not the same thing as (100*π + epsilon).
FWIW, for the calculation that you did in bc, numpy.sin() gives the same results (up to the last digit):
from numpy import *
sin(0.00001)
9.9999999998333335e-06
You wanted to know if there is something exploitable to improve the accuracy of numpy.sin(). In general, there is not. If you know the difference between your value and an integer multiple of the real number 2*π, then you can do your floating-point calculation on that difference. But you will not, in general, get an improvement by using % (2*pi) to calculate that difference.
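Robert's point can be made concrete with the thread's own example; a quick sketch (assuming, by construction, that eps is the known offset from the real 100*π):

```python
import math

eps = 1e-5  # known offset from the *real* 100*pi, by construction

# If you know the argument is n*2*pi + eps (true pi), evaluate sin at eps directly:
best = math.sin(eps)                   # full precision, matches bc's 50-digit result
# Going through the floating-point 100*pi bakes in the error of the pi constant:
worse = math.sin(100 * math.pi + eps)  # off in the 9th significant digit
# And % (2*pi) cannot recover information that was never there:
also_worse = math.sin((100 * math.pi + eps) % (2 * math.pi))

print(best, worse, also_worse)
```

The knowledge that the argument sits eps away from a true multiple of 2π exists only in the programmer's head; neither sin() nor % can reconstruct it from the rounded double.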
On Thu, 25 May 2006, Robert Kern apparently wrote:
You aren't using bc correctly.
Oops. I am not a bc user and was just following your post without reading the manual. I hope the below fixes pi; and (I think) it makes the same point I tried to make before: a continuity argument renders the general claim you made suspect. (Of course it's looking like a pretty narrow range of possible benefit as well.)
If you know that you are epsilon from n*2*π (the real number, not the floating point one), you should just be calculating sin(epsilon). Usually, you do not know this, and % (2*pi) will not tell you this. (100*pi + epsilon) is not the same thing as (100*π + epsilon).
Yes, I understand all this. Of course, it is not quite an answer to the question: can '%(2*pi)' offer an advantage in the right circumstances? And the original question was again different: can we learn from such calculations that **some** method might offer an improvement?
Anyway, you have already been more than generous with your time.
Thanks! Alan
bc 1.05
Copyright 1991, 1992, 1993, 1994, 1997, 1998 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
pi = 4*a(1)
epsilon = 0.00001
s(100*pi + epsilon)
.00000999999999983333333333416666666666468253967996
or 9.999999999833333e-006
compared to:
epsilon = 0.00001
sin(100*pi+epsilon)
9.999999976550551e-006
sin((100*pi+epsilon)%(2*pi))
9.9999999887966145e-006
Alan G Isaac wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
You aren't using bc correctly.
Oops. I am not a bc user and was just following your post without reading the manual. I hope the below fixes pi; and (I think) it makes the same point I tried to make before: a continuity argument renders the general claim you made suspect. (Of course it's looking like a pretty narrow range of possible benefit as well.)
Yes, you probably can construct cases where the % (2*pi) step will ultimately yield an answer closer to what you want. You cannot expect that step to give *reliable* improvements.
If you know that you are epsilon from n*2*π (the real number, not the floating point one), you should just be calculating sin(epsilon). Usually, you do not know this, and % (2*pi) will not tell you this. (100*pi + epsilon) is not the same thing as (100*π + epsilon).
Yes, I understand all this. Of course, it is not quite an answer to the question: can '%(2*pi)' offer an advantage in the right circumstances?
Not in any that aren't contrived. Not in any situations where you don't already have enough knowledge to do a better calculation (e.g. calculating sin(epsilon) rather than sin(2*n*pi + epsilon)).
And the original question was again different: can we learn from such calculations that **some** method might offer an improvement?
No, those calculations make no such revelation. Good implementations of sin() already reduce the argument into a small range around 0 just to make the calculation feasible. They do so much more accurately than doing % (2*pi) but they can only work with the information given to the function. It cannot know that, for example, you generated the inputs by multiplying the double-precision approximation of π by an integer.
You can look at the implementation in fdlibm:
http://www.netlib.org/fdlibm/s_sin.c
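To make the reduction idea concrete, here is a toy two-term ("Cody-Waite style") reduction sketched in Python. The constants and the function name are mine; real libm code uses head constants with trailing zero bits (so the multiply by n is exact) and falls back to a Payne-Hanek style reduction for huge arguments, so this is only an illustration, not a drop-in replacement:

```python
import math

# Head and tail of 2*pi: TWO_PI_HI is float(2*pi); TWO_PI_LO holds the next
# ~53 bits of the true 2*pi that the double representation had to drop.
TWO_PI_HI = 6.283185307179586
TWO_PI_LO = 2.4492935982947064e-16

def reduced_sin(x):
    """Toy two-term argument reduction before calling sin()."""
    n = round(x / TWO_PI_HI)
    # Subtract n*2*pi in two pieces so some of the dropped bits of 2*pi survive.
    r = (x - n * TWO_PI_HI) - n * TWO_PI_LO
    return math.sin(r)

print(reduced_sin(1000000.0))  # close to bc's -0.34999350217129295...
print(math.sin(1000000.0))     # the C library already reduces accurately
```

Note that n * TWO_PI_HI itself rounds here, which limits the achievable accuracy; zeroing the low bits of the head constant is exactly what fdlibm does to avoid that rounding.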
bc 1.05
Copyright 1991, 1992, 1993, 1994, 1997, 1998 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
pi = 4*a(1)
epsilon = 0.00001
s(100*pi + epsilon)
.00000999999999983333333333416666666666468253967996
or 9.999999999833333e-006
compared to:
epsilon = 0.00001
sin(100*pi+epsilon)
9.999999976550551e-006
sin((100*pi+epsilon)%(2*pi))
9.9999999887966145e-006
As Sasha noted, that is an artifact of bc's use of decimal rather than binary, and Python's conversion of the literal "0.00001" into binary.
[scipy]$ bc -l
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
pi = 4*a(1)
epsilon = 1./2.^16
s(100*pi + epsilon)
.00001525878906190788105354014301687863346141309981
s(epsilon)
.00001525878906190788105354014301687863346141310239
[scipy]$ python
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
from numpy import *
epsilon = 1./2.**16
sin(100*pi + epsilon)
1.5258789063872268e-05
sin((100*pi + epsilon) % (2*pi))
1.5258789076118735e-05
sin(epsilon)
1.5258789061907882e-05
I do recommend reading up more on floating point arithmetic. A good paper is Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic":
http://www.physics.ohio-state.edu/~dws/grouplinks/floating_point_math.pdf
This example looks like an artifact of decimal to binary conversion.
Consider this:
epsilon = 1./2**16
epsilon
1.52587890625e-05
sin(100*pi+epsilon)
1.5258789063872671e-05
sin((100*pi+epsilon)%(2*pi))
1.5258789076118735e-05
and in bc:
scale=50
epsilon = 1./2.^16
s(100*pi + epsilon)
.00001525878906190788105354014301687863346141310239
On 5/25/06, Alan G Isaac aisaac@american.edu wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
What continuity? This is floating-point arithmetic.
Sure, but a continuity argument suggests (in the absence of specific floating point reasons to doubt it) that a better approximation at one point will mean better approximations nearby. E.g.,
epsilon = 0.00001
sin(100*pi+epsilon)
9.999999976550551e-006
sin((100*pi+epsilon)%(2*pi))
9.9999999887966145e-006
Compare to the bc result of 9.9999999998333333e-006
bc 1.05
Copyright 1991, 1992, 1993, 1994, 1997, 1998 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
scale = 50
epsilon = 0.00001
s(100*pi + epsilon)
.00000999999999983333333333416666666666468253968254
Cheers, Alan
Alan G Isaac wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
That your demonstration results in the desired exact 0.0 for multiples of 2*pi is an accident. The results for values other than integer multiples of pi will be as wrong or more wrong.
It seems that a continuity argument should undermine that as a general claim. Right?
Let me clarify. You created your values by multiplying the floating-point approximation pi by an integer value. When you perform the operation % (2*pi) on those values, the result happens to be exact or nearly so, but only because you used the same approximation of pi. Doing that operation on an arbitrary value (like 1000000) only introduces more error to the calculation. Floating-point sin(1000000.0) should return a value within eps (~2**-52) of the true, real-valued function sin(1000000). Calculating (1000000 % (2*pi)) introduces error in two places: the approximation pi and the operation %. A floating-point implementation of sin(.) will return a value within eps of the real sin(.) of the value that is the result of the floating-point operation (1000000 % (2*pi)), which already has some error accumulated.
I agree with Robert. In fact, on FPUs such as the x87, where floating-point registers have extended precision, sin(x % (2*pi)) will give you a less precise answer than sin(x). The improved precision that you see is illusory: given that the 64-bit pi is not the precise mathematical pi, sin(2*pi) is not 0.
On 5/25/06, Robert Kern robert.kern@gmail.com wrote:
Alan G Isaac wrote:
On Thu, 25 May 2006, Robert Kern apparently wrote:
That your demonstration results in the desired exact 0.0 for multiples of 2*pi is an accident. The results for values other than integer multiples of pi will be as wrong or more wrong.
It seems that a continuity argument should undermine that as a general claim. Right?
Let me clarify. You created your values by multiplying the floating-point approximation pi by an integer value. When you perform the operation % (2*pi) on those values, the result happens to be exact or nearly so, but only because you used the same approximation of pi. Doing that operation on an arbitrary value (like 1000000) only introduces more error to the calculation. Floating-point sin(1000000.0) should return a value within eps (~2**-52) of the true, real-valued function sin(1000000). Calculating (1000000 % (2*pi)) introduces error in two places: the approximation pi and the operation %. A floating-point implementation of sin(.) will return a value within eps of the real sin(.) of the value that is the result of the floating-point operation (1000000 % (2*pi)), which already has some error accumulated.
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Thu, 25 May 2006, Robert Kern apparently wrote:
Let me clarify. Since you created your values by multiplying the floating-point approximation pi by an integer value. When you perform the operation % (2*pi) on those values, the result happens to be exact or nearly so but only because you used the same approximation of pi. Doing that operation on an arbitrary value (like 1000000) only introduces more error to the calculation. Floating-point sin(1000000.0) should return a value within eps (~2**-52) of the true, real-valued function sin(1000000). Calculating (1000000 % (2*pi)) introduces error in two places: the approximation pi and the operation %. A floating-point implementation of sin(.) will return a value within eps of the real sin(.) of the value that is the result of the floating-point operation (1000000 % (2*pi)), which already has some error accumulated.
I do not think that we have any disagreement here, except possibly over eps, which is not constant for different argument sizes. So I wondered if there was a tradeoff: smaller eps (from a smaller argument) at the cost of computational error in an additional operation.
Anyway, thanks for the feedback on this.
Cheers, Alan
Robert Kern wrote:
| Alan G Isaac wrote:
|
|> I am a user, not a numerics type, so this is undoubtedly a naive question.
|>
|> Might the sin function be written to give greater accuracy for large real numbers? It seems that significant digits are in some sense being discarded needlessly.
|
| Not really. The floating point representation of pi is not exact. The problem only gets worse when you multiply it with something. The method you showed of using % (2*pi) is only accurate when the values are created by multiplying the same pi by another value. Otherwise, it just introduces another source of error, I think.
|
| This is one of the few places where a version of trig functions that directly operate on degrees are preferred. 360.0*n is exactly representable by floating point arithmetic until n~=12509998964918 (give or take a power of two). Doing % 360 can be done exactly.
This reminds me of a story Richard Feynman tells in his autobiography. He used to say: "if you can pose a mathematical question in 10 seconds, I can solve it with 10% accuracy in one minute just calculating in my head". This worked for a long time, until someone told him "please calculate the sine of a million".
Actual mantissa bits are used up by the multiple of two-pi, and those bits are lost from the tail of the calculated value. Calculating the sine of a million with the same precision as the sine of zero requires 20 more bits of accuracy.
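Rob's bit-counting can be checked directly in Python (math.ulp needs Python 3.9 or later):

```python
import math

x = 1_000_000.0
# Near x, adjacent doubles are spaced 2**19 times farther apart than near 1.0,
# so roughly 19-20 bits of positional resolution are gone before sin() starts.
print(math.ulp(x))                             # 2**-33, about 1.16e-10
print(math.log2(math.ulp(x) / math.ulp(1.0)))  # 19.0
```

This is why accurate argument reduction must carry extra guard bits internally: the input itself only locates x within its period to about 1e-10.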
Rob
--
Rob W.W. Hooft || rob@hooft.net || http://www.hooft.net/people/rob/
This is not really a numpy issue, but a general floating-point problem.
Consider this:
x=linspace(0,10*pi,11)
all(array(map(math.sin, x))==sin(x))
True
If anything can be improved, that would be the C math library.
On 5/25/06, Rob Hooft rob@hooft.net wrote:
Robert Kern wrote:
| Alan G Isaac wrote:
|
|> I am a user, not a numerics type, so this is undoubtedly a naive question.
|>
|> Might the sin function be written to give greater accuracy for large real numbers? It seems that significant digits are in some sense being discarded needlessly.
|
| Not really. The floating point representation of pi is not exact. The problem only gets worse when you multiply it with something. The method you showed of using % (2*pi) is only accurate when the values are created by multiplying the same pi by another value. Otherwise, it just introduces another source of error, I think.
|
| This is one of the few places where a version of trig functions that directly operate on degrees are preferred. 360.0*n is exactly representable by floating point arithmetic until n~=12509998964918 (give or take a power of two). Doing % 360 can be done exactly.
This reminds me of a story Richard Feynman tells in his autobiography. He used to say: "if you can pose a mathematical question in 10 seconds, I can solve it with 10% accuracy in one minute just calculating in my head". This worked for a long time, until someone told him "please calculate the sine of a million".
Actual mantissa bits are used up by the multiple of two-pi, and those bits are lost from the tail of the calculated value. Calculating the sine of a million with the same precision as the sine of zero requires 20 more bits of accuracy.
Rob
Rob W.W. Hooft || rob@hooft.net || http://www.hooft.net/people/rob/
On Thu, 25 May 2006, Alexander Belopolsky apparently wrote:
This is not really a numpy issue, but a general floating-point problem. Consider this:
x=linspace(0,10*pi,11)
all(array(map(math.sin, x))==sin(x))
True
I think this misses the point. I was not suggesting numpy results differ from the C math library results.
x1=sin(linspace(0,10*pi,21))
x2=sin(linspace(0,10*pi,21)%(2*pi))
all(x1==x2)
False
x1
array([ 0.00000000e+00, 1.00000000e+00, 1.22460635e-16, -1.00000000e+00, -2.44921271e-16, 1.00000000e+00, 3.67381906e-16, -1.00000000e+00, -4.89842542e-16, 1.00000000e+00, 6.12303177e-16, -1.00000000e+00, -7.34763812e-16, 1.00000000e+00, 8.57224448e-16, -1.00000000e+00, -9.79685083e-16, 1.00000000e+00, 1.10214572e-15, -1.00000000e+00, -1.22460635e-15])
x2
array([ 0.00000000e+00, 1.00000000e+00, 1.22460635e-16, -1.00000000e+00, 0.00000000e+00, 1.00000000e+00, 1.22460635e-16, -1.00000000e+00, 0.00000000e+00, 1.00000000e+00, 1.22460635e-16, -1.00000000e+00, 0.00000000e+00, 1.00000000e+00, 1.22460635e-16, -1.00000000e+00, 0.00000000e+00, 1.00000000e+00, 1.22460635e-16, -1.00000000e+00, 0.00000000e+00])
I'd rather have x2: I'm just asking if there is anything exploitable here. Robert suggests not.
Cheers, Alan Isaac