fft segfault, 64 Bit Opteron

Hi Travis,

some more info on the segfault for fft when running scipy.test(10,10). It looks similar to the other one, but it is in a different place:

Multi-dimensional Fast Fourier Transform
===================================================
          |    real input     |   complex input
---------------------------------------------------
 size     |  scipy  | Numeric |  scipy  | Numeric
---------------------------------------------------
 100x100
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 46912507335168 (LWP 2965)]
DOUBLE_subtract (args=0x1620400, dimensions=0x7fffffed722c, steps=0x16bd550, func=0x2aaab7bce000) at umathmodule.c:1959
1959    *((double *)op)=*((double *)i1) - *((double *)i2);
(gdb) bt
#0  DOUBLE_subtract (args=0x1620400, dimensions=0x7fffffed722c, steps=0x16bd550, func=0x2aaab7bce000) at umathmodule.c:1959
#1  0x00002aaaabd421fd in PyUFunc_GenericFunction (self=0x719520, args=0x2aaab600c830, mps=0x2) at ufuncobject.c:1569
#2  0x00002aaaabd44c69 in ufunc_generic_call (self=0x719520, args=0x2aaab600c830) at ufuncobject.c:2553
#3  0x0000000000417808 in PyObject_CallFunction (callable=0x719520, format=0x67fffffd796 <Address 0x67fffffd796 out of bounds>) at abstract.c:1756
#4  0x00002aaaabba68f7 in array_subtract (m1=0x67fffffd796, m2=0x7fffffed722c) at arrayobject.c:2261
#5  0x00000000004145a7 in binary_op1 (v=0x159a530, w=0x162f490, op_slot=8) at abstract.c:371
#6  0x000000000041500e in PyNumber_Subtract (v=0x159a530, w=0x162f490) at abstract.c:422
#7  0x0000000000474c9e in PyEval_EvalFrame (f=0xf741c0) at ceval.c:1144
#8  0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaabd233b0, globals=0x7fffffed722c, locals=0x16bd550, args=0xf741c0, argcount=2, kws=0xf736b8, kwcount=0, defs=0x2aaaae93ce28, defcount=1, closure=0x0) at ceval.c:2736
#9  0x00000000004788f7 in PyEval_EvalFrame (f=0xf734b0) at ceval.c:3650
#10 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaab944e30, globals=0x7fffffed722c, locals=0x16bd550, args=0xf734b0, argcount=2, kws=0x834010, kwcount=0, defs=0x2aaaab967a88, defcount=2, closure=0x0) at ceval.c:2736
#11 0x00000000004788f7 in PyEval_EvalFrame (f=0x833e20) at ceval.c:3650
#12 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaab68f6c00, globals=0x7fffffed722c, locals=0x16bd550, args=0x833e20, argcount=1, kws=0x6f1490, kwcount=0, defs=0x2aaab68f5b28, defcount=1, closure=0x0) at ceval.c:2736
#13 0x00000000004788f7 in PyEval_EvalFrame (f=0x6f12e0) at ceval.c:3650
#14 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaab95c810, globals=0x7fffffed722c, locals=0x16bd550, args=0x6f12e0, argcount=2, kws=0xb32110, kwcount=0, defs=0x2aaaab9643a8, defcount=1, closure=0x0) at ceval.c:2736
#15 0x00000000004c6099 in function_call (func=0x2aaaab969758, arg=0x2aaab5425128, kw=0xfab4e0) at funcobject.c:548
#16 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#17 0x00000000004772ea in PyEval_EvalFrame (f=0x894f90) at ceval.c:3835
#18 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaab95c880, globals=0x7fffffed722c, locals=0x16bd550, args=0x894f90, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at ceval.c:2736
#19 0x00000000004c6099 in function_call (func=0x2aaaab9697d0, arg=0x2aaab54215a8, kw=0x0) at funcobject.c:548
#20 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#21 0x0000000000420ee0 in instancemethod_call (func=0x1620400, arg=0x2aaab54215a8, kw=0x0) at classobject.c:2447
#22 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#23 0x00000000004777d9 in PyEval_EvalFrame (f=0x833780) at ceval.c:3766
#24 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaaab42340, globals=0x7fffffed722c, locals=0x16bd550, args=0x833780, argcount=2, kws=0x0, kwcount=0, defs=0x2aaaab964168, defcount=1, closure=0x0) at ceval.c:2736
#25 0x00000000004c6099 in function_call (func=0x2aaaab96b668, arg=0x2aaab5421488, kw=0x0) at funcobject.c:548
#26 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#27 0x0000000000420ee0 in instancemethod_call (func=0x1620400, arg=0x2aaab5421488, kw=0x0) at classobject.c:2447
#28 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#29 0x000000000044fd80 in slot_tp_call (self=0x2aaab68f5d10, args=0x2aaab693ef90, kwds=0x0) at typeobject.c:4536
#30 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#31 0x00000000004777d9 in PyEval_EvalFrame (f=0x6d3dc0) at ceval.c:3766
#32 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaab95e2d0, globals=0x7fffffed722c, locals=0x16bd550, args=0x6d3dc0, argcount=2, kws=0xf51060, kwcount=0, defs=0x0, defcount=0, closure=0x0) at ceval.c:2736
#33 0x00000000004c6099 in function_call (func=0x2aaaab96a050, arg=0x2aaab5421680, kw=0xf4d590) at funcobject.c:548
#34 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756
#35 0x00000000004772ea in PyEval_EvalFrame (f=0xc00a80) at ceval.c:3835
#36 0x000000000047ad2f in PyEval_EvalCodeEx (co=0x2aaaab95e340, globals=0x7fffffed722c, locals=0x16bd550, args=0xc00a80, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at ceval.c:2736
#37 0x00000000004c6099 in function_call (func=0x2aaaab96a0c8, arg=0x2aaab5421518, kw=0x0) at funcobject.c:548
#38 0x0000000000417700 in PyObject_Call (func=0x1620400, arg=0x7fffffed722c, kw=0x16bd550) at abstract.c:1756

HTH,
Arnd

Hi Travis,

one more addition - the build log gives:

In file included from build/src/scipy/base/src/umathmodule.c:8036:
scipy/base/src/ufuncobject.c: In function `PyUFunc_GenericFunction':
scipy/base/src/ufuncobject.c:1569: warning: passing arg 2 of pointer to function from incompatible pointer type
gcc -pthread -shared build/temp.linux-x86_64-2.4/build/src/scipy/base/src/umathm

The patch below fixes the compile warning. However, it does not fix the segfault I get ... (unless the changed file did not transfer to the destination directory after building - see my distutils questions ...;-)

Best, Arnd

abaecker@ptphp01:~/BUILDS2/Build_100/core> svn diff
Index: scipy/base/src/ufuncobject.c
===================================================================
--- scipy/base/src/ufuncobject.c        (revision 1616)
+++ scipy/base/src/ufuncobject.c        (working copy)
@@ -1442,7 +1442,7 @@
         int fastmemcpy[MAX_ARGS];
         int *needbuffer=loop->needbuffer;
         intp index=loop->index, size=loop->size;
-        int bufsize;
+        intp bufsize;
         int copysizes[MAX_ARGS];
         void **bufptr = loop->bufptr;
         void **buffer = loop->buffer;
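The patch above widens bufsize from int to intp (NumPy's pointer-sized integer). The failure mode it guards against: on LP64 platforms like this 64-bit Opteron, a C `int` is 32 bits while pointers and sizes are 64 bits, so a large size stored in an `int` silently wraps. A minimal sketch of that wrap-around, mimicking the C narrowing assignment with NumPy scalar casts (illustrative only, not the actual ufunc code path):

```python
import numpy as np

# A byte count slightly beyond the 32-bit signed range, as the 64-bit
# intp value in the ufunc loop would hold it.
size_intp = np.int64(2**31 + 16)

# What a 32-bit C `int` ends up with after the implicit narrowing
# assignment: the value wraps around and becomes negative.
size_int = size_intp.astype(np.int32)

assert size_intp > 0
assert size_int < 0   # wrapped to -2147483632
```

A negative buffer size then feeds nonsense into pointer arithmetic, which is consistent with a crash that only shows up on 64-bit builds.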

Arnd Baecker wrote:
Hi Travis,
one more addition - the build log gives:
In file included from build/src/scipy/base/src/umathmodule.c:8036:
scipy/base/src/ufuncobject.c: In function `PyUFunc_GenericFunction':
scipy/base/src/ufuncobject.c:1569: warning: passing arg 2 of pointer to function from incompatible pointer type
gcc -pthread -shared build/temp.linux-x86_64-2.4/build/src/scipy/base/src/umathm
Thanks much for this testing. Could you send the entire build log again? Perhaps there is something else. I've made a fix which should work better on 64-bit.

More 64-bit testing is needed.

One trick to test the buffered section of code using smaller arrays is to set the buffer size to something very small (but a multiple of 16 --- 16 is the smallest). For arrays smaller than the buffer size, the data is just copied when a cast is needed. But, for larger arrays, the buffered code is exercised. For example:

scipy.setbufsize(16)
scipy.test(1,1)

At some point we might play with different buffer sizes to see if some numbers are better than others.

-Travis
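Travis's trick still works in spirit with present-day NumPy, which kept a `setbufsize`/`getbufsize` pair in its main namespace for a long time (guarded with `hasattr` below, since newer releases may not expose it). A minimal sketch that forces the buffered casting path with a mixed-dtype subtract, the same kind of operation (`DOUBLE_subtract`) that crashed in the backtrace above:

```python
import numpy as np

# Shrink the ufunc buffer, if this NumPy still exposes the knob.
if hasattr(np, "setbufsize"):
    old = np.setbufsize(16)   # smallest allowed value, a multiple of 16

# A float32 - float64 subtract requires a cast, so arrays larger than
# the buffer exercise the buffered inner loop instead of a one-shot copy.
a = np.arange(1000, dtype=np.float32)
b = np.arange(1000, dtype=np.float64)
c = a - b

assert c.dtype == np.float64
assert np.all(c == 0)

if hasattr(np, "setbufsize"):
    np.setbufsize(old)        # restore the previous buffer size
```

The point of the tiny buffer is coverage: it makes even small test arrays take the chunked, buffered code path that normally only very large arrays would hit.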

FYI, on my Athlon64 (Debian sarge, gcc 3.3, stock atlas), I'm not getting segfaults using scipy.test(10,10), but only a single failure. This is a svn checkout from last night, so it may lag slightly, but it seems to indicate Arnd's segfaults may reflect a gcc vs. icc issue.

In [2]: scipy.__core_version__
Out[2]: '0.8.1.1611'

In [3]: scipy.__scipy_version__
Out[3]: '0.4.3.1479'

======================================================================
FAIL: check_odeint1 (scipy.integrate.test_integrate.test_odeint)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/astraw/py24-amd64/lib/python2.4/site-packages/scipy/integrate/tests/test_integrate.py", line 51, in check_odeint1
    assert res < 1.0e-6
AssertionError

----------------------------------------------------------------------
Ran 1238 tests in 60.816s

FAILED (failures=1)

Travis Oliphant wrote:
Arnd Baecker wrote:
Hi Travis,
one more addition - the build log gives:
In file included from build/src/scipy/base/src/umathmodule.c:8036:
scipy/base/src/ufuncobject.c: In function `PyUFunc_GenericFunction':
scipy/base/src/ufuncobject.c:1569: warning: passing arg 2 of pointer to function from incompatible pointer type
gcc -pthread -shared build/temp.linux-x86_64-2.4/build/src/scipy/base/src/umathm
Thanks much for this testing.
Could you send the entire build log again. Perhaps there is something else. I've made a fix which should work better on 64-bit.
More 64-bit testing needed.
One trick to test the buffered section of code using smaller arrays is to set the buffer size to something very small (but a multiple of 16 --- 16 is the smallest). For arrays smaller than the buffer size, the data is just copied when a cast is needed. But, for larger arrays, the buffered code is exercised.
For example:
scipy.setbufsize(16)
scipy.test(1,1)
At some point we might play with different buffer sizes to see if some numbers are better than others.
-Travis
_______________________________________________
Scipy-dev mailing list
Scipy-dev@scipy.net
http://www.scipy.net/mailman/listinfo/scipy-dev

On Fri, 9 Dec 2005, Andrew Straw wrote:
FYI, on my Athlon64 (Debian sarge, gcc 3.3, stock atlas), I'm not getting segfaults using scipy.test(10,10) but only a single failure. This is a svn checkout last night, so it may lag slightly, but it seems to indicate Arnd's segfaults may reflect a gcc vs. icc issue.
Sorry if I created confusion by trying to install scipy on too many machines. Presently we have dumped icc for almost all purposes (apart from those like ATLAS, which stubbornly finds icc even if it is removed from all paths; rumour has it that icc is even found when the corresponding hard disk is disconnected ... ;-) (sorry, could not resist; just the signs of 5 days of installation troubles ...)

The reported segfault comes from a gcc-only installation (fortunately there is no icc on that 64 Bit Opteron ...)

Best, Arnd

P.S.: check_odeint1 is a persistent failure (I don't remember in which version it was introduced; around the beginning of this week?)

On Fri, 9 Dec 2005, Travis Oliphant wrote:
Arnd Baecker wrote:
The reported segfault comes from a gcc only installation (fortunately there is no icc on that 64 Bit Opteron ...)
Did you try the most recent SVN?
By now, yes (though not at the time of writing the lines above) ... see my other message.

Arnd

Andrew Straw wrote:
FYI, on my Athlon64 (Debian sarge, gcc 3.3, stock atlas), I'm not getting segfaults using scipy.test(10,10), but only a single failure. This is a svn checkout from last night, so it may lag slightly, but it seems to indicate Arnd's segfaults may reflect a gcc vs. icc issue.
In [2]: scipy.__core_version__
Out[2]: '0.8.1.1611'

In [3]: scipy.__scipy_version__
Out[3]: '0.4.3.1479'
======================================================================
FAIL: check_odeint1 (scipy.integrate.test_integrate.test_odeint)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/astraw/py24-amd64/lib/python2.4/site-packages/scipy/integrate/tests/test_integrate.py", line 51, in check_odeint1
    assert res < 1.0e-6
AssertionError
----------------------------------------------------------------------
Finally tracked this one down. It was a problem with a Fortran array getting created inappropriately: when I changed the implementation of the backwards-compatible Numeric C-API for PyArray_FromDims, I made a mistake so that Fortran arrays always got created.

Now, what surprises me is that it didn't cause more problems. I suppose that's because f2py does not use PyArray_FromDims anymore. But lots of other people's code does, I'm sure ...

Anyway, it should be fixed, and now all scipy tests pass for me again.

-Travis
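The practical difference between the two memory orders is easy to see from NumPy's Python side. A sketch using the modern array API (`PyArray_FromDims` is the old Numeric-compatible C entry point that was accidentally producing the Fortran variant):

```python
import numpy as np

# Same shape and dtype, different memory layouts.
c = np.zeros((2, 3), order="C")   # row-major (the intended default)
f = np.zeros((2, 3), order="F")   # column-major (what the bug produced)

assert c.flags["C_CONTIGUOUS"] and not c.flags["F_CONTIGUOUS"]
assert f.flags["F_CONTIGUOUS"] and not f.flags["C_CONTIGUOUS"]

# The strides reveal the layout: for 8-byte floats, consecutive rows of
# the C array are 3*8 = 24 bytes apart, while in the F array consecutive
# columns are 2*8 = 16 bytes apart.
assert c.strides == (24, 8)
assert f.strides == (8, 16)
```

Any C code that indexes the raw data buffer assuming row-major order will read effectively transposed values from a Fortran-ordered array, which is the kind of silent data corruption that can surface as a numerical test failure like check_odeint1 rather than a crash.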

On Fri, 9 Dec 2005, Travis Oliphant wrote: [...]
I've made a fix which should work better on 64-bit.
With scipy.__core_version__ = '0.8.1.1617' we got it working - both on the Opteron and Itanium!! In both cases gcc was used.
More 64-bit testing needed.
One trick to test the buffered section of code using smaller arrays is to set the buffer size to something very small (but a multiple of 16 --- 16 is the smallest). For arrays smaller than the buffer size, the data is just copied when a cast is needed. But, for larger arrays, the buffered code is exercised.
For example:
scipy.setbufsize(16)
scipy.test(1,1)
Works fine for both machines!

Travis, there is something which has been bothering me for a while, and which clearly shows in Jan's build: the fftw3 performance is very poor/weird for one-dimensional complex arrays (this is on the Itanium2 with gcc):

Fast Fourier Transform
=================================================
         |    real input     |   complex input
-------------------------------------------------
 size    |  scipy  | Numeric |  scipy  | Numeric
-------------------------------------------------
 100     |  1.28   |  1.59   |  10.06  |  1.57   (secs for 7000 calls)
 1000    |  1.08   |  3.06   |   9.36  |  3.00   (secs for 2000 calls)
 256     |  2.39   |  3.74   |  19.29  |  3.68   (secs for 10000 calls)
 512     |  3.54   |  8.27   |  26.81  |  8.10   (secs for 10000 calls)
 1024    |  0.57   |  1.44   |   4.29  |  1.41   (secs for 1000 calls)
 2048    |  0.99   |  3.18   |   7.40  |  3.12   (secs for 1000 calls)
 4096    |  0.96   |  3.04   |   6.93  |  2.99   (secs for 500 calls)
 8192    |  2.04   |  7.91   |  14.40  |  7.85   (secs for 500 calls)

Multi-dimensional Fast Fourier Transform
===================================================
          |    real input     |   complex input
---------------------------------------------------
 size     |  scipy  | Numeric |  scipy  | Numeric
---------------------------------------------------
 100x100  |  0.83   |  2.33   |  0.82   |  2.24   (secs for 100 calls)
 1000x100 |  0.57   |  2.04   |  0.58   |  2.13   (secs for 7 calls)
 256x256  |  0.68   |  1.62   |  0.67   |  1.63   (secs for 10 calls)
 512x512  |  1.85   |  3.28   |  1.71   |  3.41   (secs for 3 calls)

Do you have any idea what could be causing this?

Best, Arnd
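For reference, the shape of this benchmark (real vs. complex input, repeated calls, wall-clock seconds) is easy to reproduce today. A hedged sketch against present-day numpy.fft; the 2005 numbers above came from scipy's fftw3 bindings, and the size and call count here are illustrative, not the original benchmark script:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n = 1024
x = rng.random(n)                    # real input
z = x + 1j * rng.random(n)           # complex input

calls = 1000
t_real = timeit.timeit(lambda: np.fft.fft(x), number=calls)
t_cplx = timeit.timeit(lambda: np.fft.fft(z), number=calls)

print(f"size {n}: real {t_real:.3f}s, complex {t_cplx:.3f}s for {calls} calls")

# Sanity checks rather than timing assertions: both transforms have the
# expected length, and the real-input spectrum is Hermitian-symmetric.
X = np.fft.fft(x)
assert X.shape == (n,)
assert np.allclose(X[1:], np.conj(X[-1:0:-1]))
```

A healthy FFT backend should show comparable times for real and complex input of the same length (real input can even be faster via rfft-style symmetry), so a complex-input column running 5-10x slower than Numeric, as in Arnd's table, points at the binding rather than the transform itself.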
participants (3)
- Andrew Straw
- Arnd Baecker
- Travis Oliphant