![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
I get slightly different results when I repeat a calculation. In a long simulation the differences snowball and swamp the effects I am trying to measure. In the attached script there are three tests. In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes. In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different---of the order of 1e-21 to 1e-18. But the x's test to be equal and so do the y's. This test fails. (It doesn't fail on my friend's Windows box; I'm running Linux.) test3 is the same as test2 but I calculate z like this: z = calc(100*x, y) / (100 * 100). This test passes. What is going on? Here is some sample output:
test1: 0 differences test2: 38 differences test3: 0 differences Repeated runs tend to give me the same number of differences in test2 for several runs. Then I get a new number of differences which lasts for several runs...
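The attached script did not survive in the archive, so here is a minimal sketch of the three tests as described above. Everything beyond the test structure and the 100*100 rescaling is an assumption: make_xy() and calc() are hypothetical stand-ins (calc() is written as a chain of matrix products that is quadratic in x, so the test3 scaling cancels mathematically); the real script may differ.

    import numpy as np

    def make_xy():
        # Hypothetical stand-in: the thread says the rebuilt x's and y's compare
        # equal, so a fixed seed models "same values, freshly constructed arrays".
        rng = np.random.RandomState(0)
        return rng.rand(50, 50), rng.rand(50, 1)

    def calc(x, y):
        # Placeholder computation dominated by ATLAS matrix products and quadratic
        # in x, so that calc(100*x, y) / (100*100) should equal calc(x, y).
        return np.dot(np.dot(x.T, x), y)

    def ndiff(n=100):
        x1, y1 = make_xy()
        z0 = calc(x1, y1)
        z0s = calc(100 * x1, y1) / (100 * 100)
        d1 = d2 = d3 = 0
        for _ in range(n):
            # test1: reuse the very same arrays
            d1 += abs(calc(x1, y1) - z0).max() > 0
            # test2: construct fresh (equal-valued) arrays before each calculation
            x2, y2 = make_xy()
            d2 += abs(calc(x2, y2) - z0).max() > 0
            # test3: like test2, but scale x up and divide the result back down
            d3 += abs(calc(100 * x2, y2) / (100 * 100) - z0s).max() > 0
        return d1, d2, d3

    print("test1: %d differences  test2: %d differences  test3: %d differences" % ndiff())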
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/27/07, Stefan <stefan@sun.ac.za> wrote:
Yes, test1: 0 differences test2: 51 differences test3: 0 differences Oddly, the relative error is always the same: 98 z different 2.0494565872e-16 99 z different 2.0494565872e-16 which is nearly the same as the double precision epsilon, 2.2204460492503131e-16; the difference is due to the fact that the precision is defined relative to 1, while the errors in the computation are in a number somewhat larger than 1 (more bits set, but not yet 2). So this looks like an error in the LSB of the floating point number. Could be rounding, could be something not reset quite right. I'm thinking possibly hardware at this time, maybe compiler. Linux fedora 2.6.19-1.2895.fc6 #1 SMP Wed Jan 10 19:28:18 EST 2007 i686 athlon i386 GNU/Linux processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 12 model name : AMD Athlon(tm) 64 Processor 2800+ stepping : 0 cpu MHz : 1808.786 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow up ts fid vid ttp bogomips : 3618.83 Athlon 64 running 32-bit Linux. Chuck
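To make the comparison above concrete: flipping the last mantissa bit of a double that lies between 1 and 2 gives a relative error of eps divided by the value, i.e. slightly below the machine epsilon, which is exactly the pattern described. A small illustration (the value used here is arbitrary, not the z from the test):

    import numpy as np

    eps = np.finfo(np.float64).eps          # 2.220446049250313e-16, the spacing at 1.0
    z = 1.08                                # any double in [1, 2)
    z_flipped = np.nextafter(z, 2.0)        # same value with the last mantissa bit bumped
    print(eps, abs(z_flipped - z) / z)      # relative error is eps / z, a bit below eps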
![](https://secure.gravatar.com/avatar/af6c39d6943bd4b0e1fde23161e7bb8c.jpg?s=120&d=mm&r=g)
On Sat, Jan 27, 2007 at 03:11:58PM -0700, Charles R Harris wrote:
Interesting! I don't see it on Linux alpha 2.6.17-10-386 #2 Fri Oct 13 18:41:40 UTC 2006 i686 GNU/Linux vendor_id : AuthenticAMD model name : AMD Athlon(tm) XP 2400+ flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up ts but I do see it on Linux voyager 2.6.17-10-generic #2 SMP Fri Oct 13 18:45:35 UTC 2006 i686 GNU/Linux processor : 0 vendor_id : GenuineIntel model name : Genuine Intel(R) CPU T2300 @ 1.66GHz processor : 1 vendor_id : GenuineIntel model name : Genuine Intel(R) CPU T2300 @ 1.66GHz flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc pni monitor vmx est tm2 xtpr Both machines are running Ubuntu Edgy, exact same software versions. Cheers Stéfan
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/27/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
OK, this is weird. I modified the repeat code a little to ease collecting of results, and all of a sudden the differences went away. If you look at the attached code, here's what happens for me: a) If I have line 77 like this (commented out): #print '-'*75 I get: [...] 94 z different 8.47032947254e-22 95 z different 8.47032947254e-22 96 z different 8.47032947254e-22 98 z different 8.47032947254e-22 99 z different 8.47032947254e-22 Numpy version: 1.0.2.dev3521 test1: 0 differences test2: 75 differences test3: 0 differences b) If I remove the comment char from that line, I get: tlon[~/Desktop]> python repeat.py --------------------------------------------------------------------------- Numpy version: 1.0.2.dev3521 test1: 0 differences test2: 0 differences test3: 0 differences That's it. One comment char removed, and something that's done /after/ the tests are actually executed. That kind of 'I add a printf() call and the bug disappears' is unpleasantly reminiscent of lurking pointer errors in C code... Cheers, f
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/27/07, Fernando Perez <fperez.net@gmail.com> wrote:
Sorry, I forgot to add: tlon[~/Desktop]> uname -a Linux tlon 2.6.17-10-generic #2 SMP Tue Dec 5 22:28:26 UTC 2006 i686 GNU/Linux tlon[~/Desktop]> python -V Python 2.4.4c1 tlon[~/Desktop]> cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 35 model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ This box runs up to date Ubuntu Edgy. Cheers, f
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Fernando Perez wrote:
Heh. Fantastic. It might be worthwhile porting this code to C to see what happens. If we can definitively point the finger at the kernel, that would be great (less work for me!). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/27/07, Robert Kern <robert.kern@gmail.com> wrote:
It's definitely looking like something SMP related: on my laptop, with everything other than the hardware being identical (Linux distro, kernel, numpy build, etc), I can't make it fail no matter how I muck with it. I always get '0 differences'. The desktop is a dual-core AMD Athlon as indicated before, the laptop is an oldie Pentium III. They both run the same SMP-aware Ubuntu i686 kernel, since Ubuntu now ships a unified kernel, though obviously on the laptop the SMP code isn't active. Cheers, f
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/27/07, Fernando Perez <fperez.net@gmail.com> wrote:
After installing a kernel that is not smp aware, I still have the same problem. --------------------------------------------------------------------------- Numpy version: 1.0.1 test1: 0 differences test2: 55 differences test3: 0 differences $ uname -a Linux kel 2.6.18-3-486 #1 Mon Dec 4 15:59:52 UTC 2006 i686 GNU/Linux $ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Pentium(R) 4 CPU 2.80GHz stepping : 9 cpu MHz : 2793.143 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr bogomips : 5589.65
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
Interesting, I wonder if ATLAS is resetting the FPU flags and changing the rounding mode? It is just the LSB of the mantissa that looks to be changing. Before reporting the problem it might be good to pin it down a bit more if possible. How come your computation is so sensitive to these small effects? Chuck
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/28/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
Well, the fact that I don't see the problem on a PentiumIII (with atlas-sse) but I see it on my desktop (atlas-sse2) should tell us something. The test code uses double arrays, and SSE2 has double precision support but it's purely 64-bit doubles. SSE is single-precision only, which means that for a double computation, ATLAS isn't used and the Intel FPU does the computation instead. Intel FPUs use 80 bits internally for intermediate operations (even though they only return a normal 64-bit double result), so it's fairly common to see this kind of thing. You can test things by writing a little program in C that does the same operations, and use this little trick:

    #include <fpu_control.h>

    // Define DOUBLE to set the FPU in regular double-precision mode, disabling
    // the internal 80-bit mode which Intel FPUs have.
    //#define DOUBLE

    // ... later in the code's main():
    // set FPU control word for double precision
    int cword = 4722;
    _FPU_SETCW(cword);

This can show you if the problem is indeed caused by rounding differences between 64-bit and 80-bit mode. Cheers, f
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
It is strange, isn't it? I'm still thinking race condition, but where? I suppose even python could be involved someplace. BTW, your algorithm sounds like a great way to expose small discrepancies. There's a great test for floating point errors lurking in there. Chuck
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/28/07, Fernando Perez <fperez.net@gmail.com> wrote:
But how come it isn't consistent and seems to depend on timing? That is what makes me think there is a race somewhere in doing something, like setting flags. I googled yesterday for floating point errors and didn't find anything that looked relevant. Maybe I should try again with the combination of atlas and sse2. Chuck
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/28/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
There could be more than one thing at work here, I honestly don't know. I'm just trying to throw familiar-sounding data bits at the wall, perhaps somebody will see a pattern in the blobs. It worked for Pollock :) Cheers, f
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
Well, the SSE part won't, but you're still better off with ATLAS than with the default reference BLAS implementation. I think even an ATLAS SSE has special code for double (not using any SSE-type engine) that's faster than the reference BLAS which is pure generic Fortran. Someone who knows the ATLAS internals please correct me if that's not the case. Cheers, f
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
Hmmm, I wonder if stuff could be done in different orders. That could affect rounding. Even optimization settings could if someone wasn't careful to use parentheses to force the order of evaluation. This is all very interesting. Chuck
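A two-line illustration of the point about evaluation order (the values are chosen only to make the effect visible; a BLAS kernel that changes its blocking or summation order typically moves only the last bits of the result):

    # Floating-point addition is not associative, so reordering a sum can change
    # the rounded result.
    a, b, c = 1e16, -1e16, 1.0
    print((a + b) + c)   # 1.0
    print(a + (b + c))   # 0.0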
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/28/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
How come your computation is so sensitive to these small effects?
I don't know. The differences I am seeing are larger than in the test script---but still of the order of eps. Each time step of my simulation selects a maximum value and then uses that for the next time step. Depending on which of several items that are very close in value gets chosen, the simulation can head in a new direction. I guess, as always, I need to randomly perturb my parameters and look at the distribution of test results to see if the effect I am trying to measure is significant. I had no idea it was this sensitive.
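A toy example of the sensitivity described here, assuming the selection step is something like an argmax over candidate values: when two candidates agree to within one ULP, a last-bit perturbation changes which index is picked, and everything downstream diverges.

    import numpy as np

    vals = np.array([0.7, 0.7, 0.3])
    bumped = vals.copy()
    bumped[1] = np.nextafter(bumped[1], 1.0)      # perturb item 1 by one ULP
    print(np.argmax(vals), np.argmax(bumped))     # picks index 0 vs index 1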
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
This problem may be related to this bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=279294 Chuck
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 2/1/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
This problem may be related to this bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=279294
It says it is fixed in libc6 2.3.5. I'm on 2.3.6. But do you think it is something similar? A port to Octave of the test script works fine on the same system.
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Keith Goodman wrote:
A port to Octave of the test script works fine on the same system.
Are you sure that your Octave port uses ATLAS to do the matrix product? Could you post your port? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
![](https://secure.gravatar.com/avatar/38d5ac232150013cbf1a4639538204c0.jpg?s=120&d=mm&r=g)
Hi, I am curious why I do not see any mention of the compilers and versions that were used in this thread. Having just finally managed to get SciPy installed from scratch (but not with atlas), I could see that using different compilers, versions, or options, especially builds done at different times, could be a factor. Bruce On 2/1/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 2/2/07, Bruce Southey <bsouthey@gmail.com> wrote:
Yeah, good point. I installed 1.0.1 from a binary package from Debian sid. Maybe a chart of which configurations have the problem and which don't would help. If the problem is ATLAS I don't understand why test1 passes. Could the loading of the values be the problem and not the multiplication itself?
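One piece of such a configuration chart can come from numpy itself: show_config() (exact output varies with the NumPy version) reports the BLAS/LAPACK libraries, e.g. ATLAS, that the build was linked against. The compiler and flags used for the build generally still have to come from the build log or the distribution package, as Bruce notes.

    import numpy as np

    print(np.__version__)   # e.g. 1.0.1
    np.show_config()        # lists the ATLAS/BLAS/LAPACK information detected at build time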
![](https://secure.gravatar.com/avatar/af6c39d6943bd4b0e1fde23161e7bb8c.jpg?s=120&d=mm&r=g)
On Sat, Jan 27, 2007 at 04:00:33PM -0700, Charles R Harris wrote:
It runs fine on this Ubuntu/Edgy machine, though: Linux genugtig 2.6.17-10-generic #2 SMP Tue Dec 5 21:16:35 UTC 2006 x86_64 GNU/Linux processor : 0 vendor_id : AuthenticAMD model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ processor : 1 vendor_id : AuthenticAMD model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm cmp_legacy Cheers Stéfan
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Charles R Harris wrote:
[today]$ python repeat.py --------------------------------------------------------------------------- Numpy version: 1.0.2.dev3520 test1: 0 differences test2: 0 differences test3: 0 differences [today]$ uname -a Darwin Sacrilege.local 8.8.2 Darwin Kernel Version 8.8.2: Thu Sep 28 20:43:26 PDT 2006; root:xnu-792.14.14.obj~1/RELEASE_I386 i386 i386 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
![](https://secure.gravatar.com/avatar/af6c39d6943bd4b0e1fde23161e7bb8c.jpg?s=120&d=mm&r=g)
On Sat, Jan 27, 2007 at 03:11:58PM -0700, Charles R Harris wrote:
And just for the hell of it, with 4 CPUs :) Linux dirac 2.6.17-10-generic #2 SMP Tue Dec 5 21:16:35 UTC 2006 x86_64 GNU/Linux processor : 0 vendor_id : AuthenticAMD model name : Dual Core AMD Opteron(tm) Processor 275 processor : 1 vendor_id : AuthenticAMD model name : Dual Core AMD Opteron(tm) Processor 275 processor : 2 vendor_id : AuthenticAMD model name : Dual Core AMD Opteron(tm) Processor 275 processor : 3 vendor_id : AuthenticAMD model name : Dual Core AMD Opteron(tm) Processor 275 flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm cmp_legacy Works fine. Cheers Stéfan
![](https://secure.gravatar.com/avatar/96dd777e397ab128fedab46af97a3a4a.jpg?s=120&d=mm&r=g)
On 1/29/07, Keith Goodman <kwgoodman@gmail.com> wrote:
That's odd, the LSB of the double precision mantissa is only about 2.2e-16 relative to 1, so you can't *get* differences as small as 8.4e-22 without about 70-bit mantissas. Hmmm, and extended double precision only has a 63-bit mantissa. Are you sure you are computing the error correctly? Chuck
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/29/07, Keith Goodman <kwgoodman@gmail.com> wrote:
Here is a setting for x and y that gives me a difference (using the unit test in this thread) of 4.54747e-13! That is huge---and a serious problem. I am sure I can get bigger.

    # x data
    x = M.zeros((3,3))
    x[0,0] = 9.0030140479499
    x[0,1] = 9.0026474226671
    x[0,2] = -9.0011270502873
    x[1,0] = 9.0228605377994
    x[1,1] = 9.0033715311274
    x[1,2] = -9.0082367491299
    x[2,0] = 9.0044783987583
    x[2,1] = 0.0027488028057
    x[2,2] = -9.0036113393360

    # y data
    y = M.zeros((3,1))
    y[0,0] = 10.00088539878978
    y[1,0] = 0.00667193234012
    y[2,0] = 0.00032472712345
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Keith Goodman wrote:

> On 1/29/07, Keith Goodman <kwgoodman@gmail.com> wrote:
>> On 1/29/07, Keith Goodman <kwgoodman@gmail.com> wrote:
>>> On 1/29/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
>>>> That's odd, the LSB of the double precision mantissa is only about 2.2e-16, so you can't *get* differences as small as 8.4e-22 without about 70-bit mantissas. Hmmm, and extended double precision only has a 63-bit mantissa. Are you sure you are computing the error correctly?
>>> That is odd. 8.4e-22 is just the output of the test script: abs(z - z0).max(). That abs is from python.
>> By playing around with x and y I can get all sorts of values for abs(z - z0).max(). I can get down to the e-23 range and to 2.2e-16. I've also seen e-18 and e-22.
> Here is a setting for x and y that gives me a difference (using the unit test in this thread) of 4.54747e-13! That is huge---and a serious problem. I am sure I can get bigger.

Check the size of z0. Only the relative difference abs((z-z0)/z0) is going to be about 1e-16. If you adjust the size of z0, the absolute difference will also change in size. In the original unittest that you wrote, z0 is about 1e-6, so 1e-22 corresponds to 1 ULP. With the data you give here, z0 is about 1e3, so 1e-13 also corresponds to 1 ULP. There is no (additional) problem here. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
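Robert's point in numbers, using np.spacing(x), which returns the gap between x and the next representable double: one ULP is a relative quantity, so its absolute size tracks the magnitude of z0.

    import numpy as np

    print(np.spacing(1e-6))      # ~2.1e-22, the scale of the 8.47e-22 differences seen earlier
    print(np.spacing(1e3))       # ~1.1e-13, the scale of the 4.55e-13 difference reported here
    print(np.spacing(1e-6) / 1e-6,
          np.spacing(1e3) / 1e3) # both about 1e-16, i.e. about 1 ULP of relative error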
![](https://secure.gravatar.com/avatar/ccb440c822567bba3d49d0ea2894b8a1.jpg?s=120&d=mm&r=g)
On a PPC MacOS X box I don't see an error. If I append if __name__ == "__main__": run() to your test code and then run it I get: repeatability #1 ... ok repeatability #2 ... ok repeatability #3 ... ok ---------------------------------------------------------------------- Ran 3 tests in 0.156s OK
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/29/07, Russell E. Owen <rowen@cesmail.net> wrote:
So far no one has duplicated the problem on windows or mac. The problem has only been seen on linux with atlas3-sse2. (I get a similar problem with other versions of atlas.) Are you running atlas on your PPC mac? Perhaps atlas3-altivec?
![](https://secure.gravatar.com/avatar/fd8e71405bcd3efac5cb6aea94b07c0d.jpg?s=120&d=mm&r=g)
On a 64-bit Intel Core2 Duo running Debian unstable with atlas3 (there is no specific atlas3-sse2 for AMD64 Debian, although I think that it is included) everything checks out fine: eiger:~$ uname -a Linux eiger 2.6.18-3-amd64 #1 SMP Sun Dec 10 19:57:44 CET 2006 x86_64 GNU/Linux eiger:~$ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz stepping : 6 cpu MHz : 2660.009 cache size : 4096 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm bogomips : 5324.65 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: (same for 2nd CPU) Scott On Monday 29 January 2007 16:10, Keith Goodman wrote:
-- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom@nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989
![](https://secure.gravatar.com/avatar/764323a14e554c97ab74177e0bce51d4.jpg?s=120&d=mm&r=g)
Keith Goodman wrote:
Another datapoint using atlas3-base on Ubuntu AMD-64. Looking at the source package, I think it sets ISAEXT="sse2" for AMD-64 when building. rkern@rkernx2:~$ python repeat_test.py repeatability #1 ... ok repeatability #2 ... ok repeatability #3 ... ok ---------------------------------------------------------------------- Ran 3 tests in 0.043s OK rkern@rkernx2:~$ uname -a Linux rkernx2 2.6.17-10-generic #2 SMP Fri Oct 13 15:34:39 UTC 2006 x86_64 GNU/Linux rkern@rkernx2:~$ cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 43 model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ stepping : 1 cpu MHz : 2211.346 cache size : 512 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm cmp_legacy bogomips : 4426.03 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp processor : 1 vendor_id : AuthenticAMD cpu family : 15 model : 43 model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ stepping : 1 cpu MHz : 2211.346 cache size : 512 KB physical id : 0 siblings : 2 core id : 1 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm cmp_legacy bogomips : 4423.03 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/29/07, Robert Kern <robert.kern@gmail.com> wrote:
I ported the test to Octave, which, like numpy, uses ATLAS. On my machine (Debian etch, atlas3-sse2) I get the problem in numpy but not in Octave. Plus test1 always passes. So it is only when you reload x and y that the problem occurs. If you load x and y once (test1) and repeat the calculation, there is no problem. Do these two results point, however weakly, away from ATLAS?
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/27/07, Keith Goodman <kwgoodman@gmail.com> wrote:
I built a new computer: Core 2 Duo 32-bit Debian etch with numpy 1.0.2.dev3546. The repeatability test still fails. In order to make my calculations repeatable I'll have to remove ATLAS. That really slows things down. Does anyone with Debian not have this problem?
![](https://secure.gravatar.com/avatar/95198572b00e5fbcd97fb5315215bf7a.jpg?s=120&d=mm&r=g)
On 1/28/07, Keith Goodman <kwgoodman@gmail.com> wrote:
I was wondering if atlas-sse2 might be the problem, since my desktop is an sse2 machine, but my laptop uses only sse (old PentiumIII). Why don't you try putting in just atlas-sse and seeing what happens? Cheers, f
![](https://secure.gravatar.com/avatar/6a1dc50b8d79fe3b9a5e9f5d8a118901.jpg?s=120&d=mm&r=g)
On 1/27/07, Keith Goodman <kwgoodman@gmail.com> wrote:
Here's a unit test for the problem. I am distributing it in hopes of raising awareness of the problem. (What color should I make the Repeatability Wristbands?) I am sure others are having this problem without even knowing it.
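The posted test file itself is not reproduced in the archive. As a rough reconstruction of how it could be packaged (the run() entry point and the "repeatability #N ... ok" labels quoted elsewhere in the thread suggest a unittest-style layout; the original most likely used numpy.testing), here is a self-contained sketch. make_xy() and calc() are the same hypothetical stand-ins used in the sketch near the top of this thread, not the real computation.

    import unittest
    import numpy as np

    def make_xy():
        rng = np.random.RandomState(0)            # hypothetical deterministic inputs
        return rng.rand(50, 50), rng.rand(50, 1)

    def calc(x, y):
        return np.dot(np.dot(x.T, x), y)          # placeholder ATLAS-backed computation

    class RepeatTests(unittest.TestCase):
        def test1(self):
            """repeatability #1"""
            x, y = make_xy()                      # same x and y reused for every call
            z0 = calc(x, y)
            for _ in range(100):
                self.assertEqual(abs(calc(x, y) - z0).max(), 0.0)

        def test2(self):
            """repeatability #2"""
            z0 = calc(*make_xy())                 # x and y rebuilt before every call
            for _ in range(100):
                self.assertEqual(abs(calc(*make_xy()) - z0).max(), 0.0)

        def test3(self):
            """repeatability #3"""
            x, y = make_xy()                      # like #2, but rescaled
            z0 = calc(100 * x, y) / (100 * 100)
            for _ in range(100):
                x, y = make_xy()
                self.assertEqual(abs(calc(100 * x, y) / (100 * 100) - z0).max(), 0.0)

    def run():
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(RepeatTests)
        unittest.TextTestRunner(verbosity=2).run(suite)

    if __name__ == "__main__":
        run()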
participants (11)
-
Bruce Southey
-
Charles R Harris
-
Fernando Perez
-
Keith Goodman
-
Robert Kern
-
Russell E. Owen
-
Scott Ransom
-
Sebastian Haase
-
Sebastian Haase
-
Stefan
-
Stefan van der Walt