Different results from repeated calculation, part 2

I get slightly different results when I repeat a calculation. I've seen this problem before (it went away but has returned):
http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025724.htm...

A unit test is attached. It contains three tests:

In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.

In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).

test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.

I get:

======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
----------------------------------------------------------------------

Should a unit test like this be added to numpy?
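[The attachment itself is not preserved in the archive. The following is a hypothetical sketch of the kind of repeatability test described above; calc(), the matrix shapes, and the fixed seed are illustrative assumptions, not the actual attached code.]

```python
import numpy as np

def calc(x, y):
    # stand-in computation; the real calc() from the attachment is not shown
    return np.dot(x, y.T)

def max_repeat_difference(n=50, repeats=10):
    """Rebuild x and y before every call (as in test2) and report the
    largest element-wise deviation between repeated results."""
    results = []
    for _ in range(repeats):
        rng = np.random.RandomState(0)  # identical x and y each iteration
        x = rng.rand(n, n)
        y = rng.rand(n, n)
        results.append(calc(x, y))
    first = results[0]
    return max(float(np.abs(r - first).max()) for r in results[1:])

print("Max difference =", max_repeat_difference())
# Ideally 0.0; the thread reports differences around 2e-16 on some
# ATLAS/SSE2 builds.
```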

On 14/08/08: 10:20, Keith Goodman wrote:
A unit test is attached. It contains three tests:
In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.
In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).
test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.
I get:
======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
Could this be because of how the calculations are done? If the floating-point numbers are stored in the CPU registers, in this case (Intel Core Duo), they are 80-bit values, whereas 'double' precision is 64 bits. Depending upon gcc's optimization settings, the number of automatic variables, etc., it is entirely possible that the numbers are stored in registers only in some cases, and are in RAM in other cases. Thus, in your tests, sometimes some numbers get stored in the CPU registers, making the calculations with those values different from the case where they are not stored in the registers.

See "The pitfalls of verifying floating-point computations" at http://portal.acm.org/citation.cfm?doid=1353445.1353446 (or, if that needs a subscription, you can download the PDF from http://arxiv.org/abs/cs/0701192). The paper has a lot of examples of surprises like this. Quote:

We shall discuss the following myths, among others:
...
- "Arithmetic operations are deterministic; that is, if I do z=x+y in two places in the same program and my program never touches x and y in the meantime, then the results should be the same."
- A variant: "If x < 1 tests true at one point, then x < 1 stays true later if I never modify x."
...

-Alok
--
Alok Singhal
http://www.astro.virginia.edu/~as8ca/
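[One concrete reason the "arithmetic is deterministic" myth breaks down: floating-point addition is not associative, so anything that reorders operations (compiler optimization, x87 80-bit registers vs SSE, a different BLAS kernel) can round differently. A minimal illustration:]

```python
# Two mathematically identical sums, evaluated in different orders.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)      # False: the two orderings round differently
print(abs(left - right))  # a difference on the order of 1e-16
```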

Hi, Am 14.08.2008 um 19:48 schrieb Alok Singhal:
On 14/08/08: 10:20, Keith Goodman wrote:
A unit test is attached. It contains three tests:
In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.
In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).
test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.
I get:
======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
Could this be because of how the calculations are done? If the floating point numbers are stored in the cpu registers, in this case (intel core duo), they are 80-bit values, whereas 'double' precision is 64-bits. Depending upon gcc's optimization settings, the amount of automatic variables, etc., it is entirely possible that the numbers are stored in registers only in some cases, and are in the RAM in other cases. Thus, in your tests, sometimes some numbers get stored in the cpu registers, making the calculations with those values different from the case if they were not stored in the registers.

The tests never fail on my Core 2 Duo on Mac OS X, just for the record ;)
Holger

Alok Singhal wrote:
On 14/08/08: 10:20, Keith Goodman wrote:
A unit test is attached. It contains three tests:
In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.
In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).
test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.
I get:
======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
Could this be because of how the calculations are done? If the floating point numbers are stored in the cpu registers, in this case (intel core duo), they are 80-bit values, whereas 'double' precision is 64-bits. Depending upon gcc's optimization settings, the amount of automatic variables, etc., it is entirely possible that the numbers are stored in registers only in some cases, and are in the RAM in other cases. Thus, in your tests, sometimes some numbers get stored in the cpu registers, making the calculations with those values different from the case if they were not stored in the registers.
See "The pitfalls of verifying floating-point computations" at http://portal.acm.org/citation.cfm?doid=1353445.1353446 (or if that needs subscription, you can download the PDF from http://arxiv.org/abs/cs/0701192). The paper has a lot of examples of surprises like this. Quote:
We shall discuss the following myths, among others:
...
- "Arithmetic operations are deterministic; that is, if I do z=x+y in two places in the same program and my program never touches x and y in the meantime, then the results should be the same."
- A variant: "If x < 1 tests true at one point, then x < 1 stays true later if I never modify x."
...
-Alok
yep! The code is buggy. Please **never** use == to test whether two floating-point numbers are equal. **Never**. As explained by Alok, floating-point computations are *not* the same as computations over the field of real numbers. Intel 80-bit registers versus 64-bit representation, IEEE 754, SSE or no SSE or MMX: fun with floating-point arithmetic :) The same code with and without SSE can give you "not equal" results.

Never use == on floats?!? Well, you might have to use it in some corner cases, but please remember that computers only work on a small, finite subset of the real numbers. http://docs.python.org/tut/node16.html is a simple part of the story. 80-bit registers are the ugly part (but the fun one :)).

abs(a-b)<epsilon is the correct way to test that a and b are "equal".

Xavier
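[Xavier's rule in a few lines of Python; the epsilon value here is an arbitrary illustrative choice:]

```python
# == fails on values that are mathematically equal but rounded differently;
# an epsilon test absorbs the rounding error.
a = 0.1 + 0.2
b = 0.3
eps = 1e-12
print(a == b)            # False: 0.1 + 0.2 rounds to 0.30000000000000004
print(abs(a - b) < eps)  # True: the difference is about 5.6e-17
```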

On Aug 16, 2008, at 3:02 AM, Xavier Gnata wrote:
abs(a-b)<epsilon is the correct way to test that a and b are "equal".
But 1) the value of epsilon is algorithm dependent, 2) it may be that -epsilon1 < a-b < epsilon2 where the epsilons are not the same value, 3) the more valid test may be "<=" instead of "<", as when the maximum permissible difference is 0.5 ulp, and 4) the more important test (as mentioned in the Monniaux paper you referenced) may be the relative error and not the absolute error, which is how you wrote it.

So saying that this is the correct way isn't that helpful. It requires proper numeric analysis, and that isn't always available. An advantage to checking if a==b is that if they are in fact equal then there's no need to do any analysis. Whereas if you choose some epsilon (or epsilon-relative), then how do you pick that number?

Andrew
dalke@dalkescientific.com
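[Andrew's point 4 in code: a combined relative/absolute tolerance, in the spirit of what later became math.isclose() and numpy.allclose(). The tolerance values are illustrative assumptions, not a universal choice — which is exactly his objection.]

```python
def close(a, b, rel_tol=1e-9, abs_tol=0.0):
    # Relative error scales with the magnitudes of a and b;
    # abs_tol is needed for comparisons against values near zero.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(close(1e10, 1e10 + 1.0))            # True: tiny relative error
print(close(1e-10, 2e-10))                # False: relative error is 50%
print(close(1e-10, 2e-10, abs_tol=1e-9))  # True: absorbed by abs_tol
```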

Keith Goodman wrote:
I get slightly different results when I repeat a calculation.
I've seen this problem before (it went away but has returned):
http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025724.htm...
A unit test is attached. It contains three tests:
In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.
In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).
test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.
I get:
======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
----------------------------------------------------------------------
Should a unit test like this be added to numpy?
------------------------------------------------------------------------
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Hi,
In the function 'test_repeat_2' you are redefining variables 'x' and 'y' that were first defined using the setup function. (Also, you are not using the __init__ function.) I vaguely recall there are some quirks to Python classes with this, so does the problem go away if you use 'a, b' instead of 'x, y'? (I suspect the answer is yes, given test_repeat_3.)

Note that you should also test that 'x' and 'y' are the same here as well (but these have been redefined...).

Otherwise, can you please provide your OS (version), computer processor, Python version, numpy version, version of atlas (or similar) and compiler used? I went back and reread the thread but I could not see this information.

Bruce

On Thu, Aug 14, 2008 at 11:29 AM, Bruce Southey <bsouthey@gmail.com> wrote:
I get slightly different results when I repeat a calculation.
I've seen this problem before (it went away but has returned):
http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025724.htm...
A unit test is attached. It contains three tests:
In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.
In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).
test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.
I get:
======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
----------------------------------------------------------------------
Should a unit test like this be added to numpy?
------------------------------------------------------------------------
Hi,
In the function 'test_repeat_2' you are redefining variables 'x' and 'y' that were first defined using the setup function. (Also, you are not using the __init__ function.) I vaguely recall there are some quirks to Python classes with this, so does the problem go away if you use 'a, b' instead of 'x, y'? (I suspect the answer is yes, given test_repeat_3.)
Note that you should also test that 'x' and 'y' are same here as well (but these have been redefined...).
Otherwise, can you please provide your OS (version), computer processor, Python version, numpy version, version of atlas (or similar) and compiler used?
I went back and reread the thread but I could not see this information.
Here's a test that doesn't use classes and checks that x and y do not change:
http://projects.scipy.org/pipermail/numpy-discussion/attachments/20070127/52...

I'm using binaries from Debian Lenny:

$ uname -a
Linux jan 2.6.25-2-686 #1 SMP Fri Jul 18 17:46:56 UTC 2008 i686 GNU/Linux

$ python -V
Python 2.5.2

>>> numpy.__version__
'1.1.0'

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 2402.004
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4807.45
clflush size    : 64

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 2402.004
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4750.69
clflush size    : 64

Keith Goodman wrote:
On Thu, Aug 14, 2008 at 11:29 AM, Bruce Southey <bsouthey@gmail.com> wrote:
Keith Goodman wrote:
I get slightly different results when I repeat a calculation.
I've seen this problem before (it went away but has returned):
http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025724.htm...
A unit test is attached. It contains three tests:
In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes.
In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse).
test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes.
I get:
======================================================================
FAIL: repeatability #2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
----------------------------------------------------------------------
Should a unit test like this be added to numpy?
------------------------------------------------------------------------
Hi,
In the function 'test_repeat_2' you are redefining variables 'x' and 'y' that were first defined using the setup function. (Also, you are not using the __init__ function.) I vaguely recall there are some quirks to Python classes with this, so does the problem go away if you use 'a, b' instead of 'x, y'? (I suspect the answer is yes, given test_repeat_3.)
Note that you should also test that 'x' and 'y' are same here as well (but these have been redefined...).
Otherwise, can you please provide your OS (version), computer processor, Python version, numpy version, version of atlas (or similar) and compiler used?
I went back and reread the thread but I could not see this information.
Here's a test that doesn't use classes and checks that x and y do not change:
http://projects.scipy.org/pipermail/numpy-discussion/attachments/20070127/52...
I'm using binaries from Debian Lenny:
$ uname -a Linux jan 2.6.25-2-686 #1 SMP Fri Jul 18 17:46:56 UTC 2008 i686 GNU/Linux
$ python -V Python 2.5.2
numpy.__version__
'1.1.0'
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 2402.004
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4807.45
clflush size    : 64

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 2402.004
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4750.69
clflush size    : 64
I do not get this on my Intel quad Core2 Linux x86_64 system running Fedora 10's supplied Python. I compile my own versions of NumPy and currently don't use, or really plan to use, atlas. But I know that you previously indicated that this was atlas related (http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025750.htm...).

From Intel's website, the Intel Core2 Duo E6600 (http://processorfinder.intel.com/details.aspx?sSpec=SL9S8) supports EM64T, so it is an x86 64-bit processor. I do not know Debian, but i686 generally refers to a 32-bit kernel, as x86_64 refers to 64-bit. If so, then you are running a 32-bit kernel on a 64-bit processor.

So I would suggest you start by compiling your own NumPy without any extras and see if the problem goes away. If not, then it is NumPy; otherwise add the extras until you get the same system back.

Bruce
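[Not part of the original thread: before recompiling as Bruce suggests, numpy can report which BLAS/LAPACK libraries (ATLAS, SSE variant, etc.) it was built against:]

```python
import numpy as np

# Prints the BLAS/LAPACK build configuration numpy was linked with,
# which is the first thing to check when results differ across machines.
np.show_config()
```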
participants (6)
- Alok Singhal
- Andrew Dalke
- Bruce Southey
- Holger Rapp
- Keith Goodman
- Xavier Gnata