Strange scipy.test crashes on OS X/Intel
Hi folks-

With the problems I've been having with scipy-0.6.0 on RHEL 5, and my colleague with it on FC4, I experimented with it on my OS X machines.

MacBook (Intel), OS 10.4, Python 2.5.1, numpy 1.0.3.1:

Running scipy.test() the 1st time completed 1719 tests and reported 5 failures:

    FAIL: check_cosine_weighted_infinite (scipy.integrate.tests.test_quadpack.test_quad)
    FAIL: check_sine_weighted_finite (scipy.integrate.tests.test_quadpack.test_quad)
    FAIL: check_sine_weighted_infinite (scipy.integrate.tests.test_quadpack.test_quad)
    FAIL: check_dot (scipy.lib.tests.test_blas.test_fblas1_simple)
    FAIL: check_x_stride (scipy.lib.blas.tests.test_fblas.test_cgemv)

I ran it again (in the same Python invocation) and got 4 of the 5 failures (all but check_x_stride). I ran it a third time, and it crashed with a seg fault:

    Found 42 tests for scipy.lib.lapack
    Found 41 tests for scipy.linalg.basic
    <module 'scipy.linalg.fblas' from '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/fblas.so'>
    Segmentation fault

The seg fault was 100% repeatable: Every time I ran scipy.test() a 3rd time, I got the same seg fault.

I deleted all scipy stuff from site-packages, and tried this again with scipy 0.5.2.1. Different tests failed. I got bus errors or seg faults, but inconsistently, after 3-5 runs of scipy.test(), and at different places in the tests (but typically soon after the tests started, i.e., after just a few "..." lines).

I removed scipy stuff, and reinstalled 0.6.0. This time I could run scipy.test() 9 times before getting a bus error on the 10th run. Subsequent Python invocations led to bus errors or seg faults sooner and sooner, until the original behavior was duplicated: seg fault on the 3rd run of scipy.test(), always here:

    Found 41 tests for scipy.linalg.basic
    <module 'scipy.linalg.fblas' from '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/fblas.so'>
    Segmentation fault

Subsequently, I always get the same crash on the 3rd run of scipy.test().

I tried similar tests on my G4 (PPC) desktop, OS 10.4, same numpy, but with an older Python (2.4.4) and scipy 0.5.2.1. I ran scipy.test() a dozen times with no problems apart from the 3 failures and 9 errors I've always gotten.

I realize this is inconclusive, as the problem could be with Python 2.5.1 or gfortran or scipy, but I wanted to report it. It seems a notable coincidence that linalg seems involved in the crashes, since it is also giving me problems on RHEL 5, and my colleague on FC4.

If there is anything further I can do to help diagnose these issues, please advise.

Thanks,
Tom
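The reproduction pattern described in this message (calling the test suite several times within one interpreter until it crashes) can be scripted. What follows is a minimal sketch, not something from the thread: `run_repeated` and `describe_exit` are hypothetical helper names, and it assumes a POSIX system and a current Python 3, where a child process killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

# Driver executed in the child interpreter: it repeats the given statement
# in ONE Python invocation and writes a marker to stderr before each run.
DRIVER = """
import sys
statement, runs = sys.argv[1], int(sys.argv[2])
for i in range(runs):
    sys.stderr.write("--- run %d ---\\n" % (i + 1))
    sys.stderr.flush()
    exec(statement)
"""


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative code means "killed by signal -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_repeated(statement, runs=3, python=sys.executable):
    """Run `statement` up to `runs` times in a single child interpreter.

    Returns (returncode, started), where `started` is the number of runs
    that had begun when the child exited or crashed.
    """
    proc = subprocess.run(
        [python, "-c", DRIVER, statement, str(runs)],
        capture_output=True,
        text=True,
    )
    started = proc.stderr.count("--- run ")
    return proc.returncode, started
```

Under these assumptions, `run_repeated("import scipy; scipy.test()", runs=3)` would report during which run the child died, and `describe_exit` would name the signal (a seg fault shows up as SIGSEGV, a bus error as SIGBUS).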
Tom Loredo wrote:
If there is anything further I can do to help diagnose these issues, please advise.
Can you run it in gdb and give us the backtrace?

    $ gdb python
    GNU gdb 6.3.50-20050815 (Apple version gdb-573) (Fri Oct 20 15:50:43 GMT 2006)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries .. done

    (gdb) run -c "import scipy; scipy.test(10,10)"
    Starting program: /usr/local/bin/python -c "import scipy; scipy.test()"
    Reading symbols for shared libraries . done

    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x8fe01010 in __dyld__dyld_start ()
    (gdb) c
    Continuing.
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries . done
    ...

Then use the "bt" command after the segfault happens to get the backtrace.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
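As a complement to the gdb session: on current Python versions (3.3 and later, so not the Python 2.5.1 discussed in this thread), the standard-library `faulthandler` module can print the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV or SIGBUS. It does not replace the C-level backtrace that gdb's "bt" provides. A minimal sketch, where `run_with_faulthandler` is a hypothetical helper name:

```python
import subprocess
import sys


def run_with_faulthandler(statement, python=sys.executable):
    """Run `statement` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr). If the child dies from a fatal signal,
    stderr should contain the Python traceback dumped by faulthandler.
    """
    proc = subprocess.run(
        # "-X faulthandler" enables the handler at interpreter startup,
        # like calling faulthandler.enable() as the first statement.
        [python, "-X", "faulthandler", "-c", statement],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr
```

For example, `run_with_faulthandler("import scipy; scipy.test()")` would return the child's exit code together with whatever traceback was written to stderr, which shows the Python frame that was active when the crash happened.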
Robert Kern wrote:
Tom Loredo wrote:
If there is anything further I can do to help diagnose these issues, please advise.
Can you run it in gdb and give us the backtrace?
Then use the "bt" command after the segfault happens to get the backtrace.
I don't think gdb is needed: the error is again in check_dot. The dot function does not work when using Apple's vecLib framework. I tried to fix it in scipy.linalg, but the code is duplicated in scipy.lib, and I didn't make the modification there.

My understanding (Pearu should confirm this, since I am not involved with this code at all) is that scipy.lib is a refactoring of the core of scipy.linalg. As such, maybe the module could be disabled at the test level? scipy.lib is not used anywhere else in scipy, and people should use scipy.linalg, no? This is only a suggestion; maybe someone else has a better idea.

cheers,
David
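The `check_dot` failure mentioned in this thread concerns the BLAS dot product. The sketch below shows the general idea of such a check in isolation: a BLAS-backed `dot` is compared with a pure-Python reference. The helper names (`reference_dot`, `dot_agrees`, `check_blas_dots`) and the input vectors are illustrative assumptions, not the contents of scipy's test. `scipy.linalg.blas.sdot` and `ddot` are the wrappers exposed by current scipy releases, which are organized differently from the `fblas` modules of scipy 0.6.0.

```python
def reference_dot(x, y):
    """Pure-Python dot product used as the trusted reference."""
    return sum(a * b for a, b in zip(x, y))


def dot_agrees(dot, x, y, tol=1e-5):
    """Return True if dot(x, y) matches the reference within `tol`."""
    expected = reference_dot(x, y)
    actual = float(dot(x, y))
    return abs(actual - expected) <= tol * max(1.0, abs(expected))


def check_blas_dots(x=(3.0, -4.0, 5.0), y=(2.0, 5.0, 1.0)):
    """Check single- and double-precision BLAS dot wrappers.

    Returns {name: bool}, or an empty dict if scipy is not installed.
    """
    try:
        from scipy.linalg import blas
    except ImportError:
        return {}
    xs, ys = list(x), list(y)
    return {
        name: dot_agrees(getattr(blas, name), xs, ys)
        for name in ("sdot", "ddot")
    }
```

A `False` entry in the result of `check_blas_dots()` would indicate that the corresponding wrapper disagrees with the reference on this machine, which is the kind of symptom described for the vecLib-backed dot.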
participants (3)
- David Cournapeau
- Robert Kern
- Tom Loredo