Numpy Segmentation Fault
Hi all,

I am getting a numpy segmentation fault on a custom install of Python to a prefix. I am running this example code:

    import numpy
    import numpy.linalg
    x = numpy.eye(1000)
    for i in range(10):
        eigenvalues, eigenvectors = numpy.linalg.eig(x)
        eigenvalues, eigenvectors = numpy.linalg.eig(x)
        print str(i), '-------------------------------'

I have been trying to debug this for the last two weeks, with no success so far. The same code runs fine with Python 2.6.1 and numpy 1.2.1 but segfaults with Python 2.6.4 and numpy 1.3.0. Also, the error occurs non-deterministically in the loop, and sometimes it does not occur at all.

I have tried reinstalling and rebuilding Python, but even that does not help. Please find below the information that I think might be relevant for figuring out the root cause.

System info:

    Python 2.6.4 (r264:75706, Apr 6 2010, 04:49:11)
    [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.

strace output tail:

    ) = 34
    futex(0x1001f6e0, FUTEX_WAKE, 1) = 0
    futex(0x1001f6e0, FUTEX_WAKE, 1) = 0
    write(1, "here\n", 5here
    ) = 5
    rt_sigaction(SIGINT, {SIG_DFL}, {0x2b78c55fec60, [], SA_RESTORER, 0x34d940de70}, 8) = 0
    brk(0x117ff000) = 0x117ff000
    brk(0x108b1000) = 0x108b1000
    --- SIGSEGV (Segmentation fault) @ 0 (0) ---
    +++ killed by SIGSEGV +++

Regards,
Yogesh Tomar
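
Because the crash described above is non-deterministic, it can help to run the script many times and count how often it dies from SIGSEGV. The following is a minimal sketch, not something posted in the thread. It assumes a modern Python 3 on a POSIX system (`subprocess.run` does not exist in the Python 2.6 discussed here), and the helper name `count_segfaults` is hypothetical.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs=10):
    """Run cmd `runs` times and count how many runs were killed by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        # On POSIX, a return code of -N means "terminated by signal N".
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


if __name__ == "__main__":
    # segf.py is the script name that appears in the gdb session in this
    # thread; pass another path on the command line if yours differs.
    script = sys.argv[1] if len(sys.argv) > 1 else "segf.py"
    total = 20
    crashed = count_segfaults([sys.executable, script], runs=total)
    print("segfaults: %d / %d" % (crashed, total))
```

A crash rate that changes after swapping one component (the interpreter, numpy, or the BLAS/LAPACK library) is a hint about where the fault lies, although a rate of zero over a small number of runs proves little.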
On Tue, Apr 6, 2010 at 7:46 PM, Yogesh Tomar <yogesh.tomar@gmail.com> wrote:
> Hi all,
> I am getting numpy segmentation fault on a custom install of python to a prefix.

What happens if you build numpy 1.3.0 against python 2.6.1 (instead of your own 2.6.4)? For numpy, an strace is not really useful (a backtrace from gdb is much more so).

cheers,
David
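
As a complement to the gdb backtrace requested above, newer interpreters can print the Python-level traceback at the moment of a fatal signal. This is a sketch under stated assumptions: it uses the standard-library `faulthandler` feature through the `-X faulthandler` option, which needs Python 3.3 or later and is therefore not available on the Python 2.6 builds in this thread. The function name `run_with_faulthandler` is hypothetical. It shows Python frames only, so it does not replace the C-level frames that gdb reports.

```python
import subprocess
import sys


def run_with_faulthandler(script_args):
    """Run a Python script with faulthandler on; return (returncode, stderr)."""
    # -X faulthandler makes the interpreter dump the Python traceback to
    # stderr when it receives a fatal signal such as SIGSEGV.
    cmd = [sys.executable, "-X", "faulthandler"] + list(script_args)
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.PIPE, universal_newlines=True)
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Example: python this_file.py segf.py
    code, err = run_with_faulthandler(sys.argv[1:])
    print("exit status:", code)
    print(err)
```

If the crash happens during interpreter shutdown, there may be few or no Python frames left to show, in which case the gdb backtrace remains the more informative tool.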
numpy 1.3.0 also segfaults the same way. Is it the problem with libc library?

Gdb stacktrace:

    (gdb) run /home/eqan/64bit/current/segf.py
    Starting program: /home/eqan/tapas/64bit/Python/2.6.4_x641/bin/python /home/eqan/64bit/current/segf.py
    [Thread debugging using libthread_db enabled]
    [New Thread 47653213733440 (LWP 24970)]
    0 -------------------------------
    1 -------------------------------
    2 -------------------------------
    3 -------------------------------
    4 -------------------------------
    5 -------------------------------
    6 -------------------------------
    7 -------------------------------
    8 -------------------------------
    9 -------------------------------
    here

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 47653213733440 (LWP 24970)]
    0x00000034d8c71033 in _int_free () from /lib64/libc.so.6
    (gdb) up
    #1 0x00000034d8c74c5c in free () from /lib64/libc.so.6
    (gdb) up
    #2 0x00002b57203451db in code_dealloc (co=0xdab63f0) at Objects/codeobject.c:260
    260 Py_XDECREF(co->co_code);
    (gdb) up
    #3 0x00002b5720359a73 in func_dealloc (op=0xdab67d0) at Objects/funcobject.c:454
    454 Py_DECREF(op->func_code);
    (gdb) up
    #4 0x00002b57203691ed in insertdict (mp=0xda8ae90, key=0xda2f180, hash=1904558708720393281, value=0x2b572066c210) at Objects/dictobject.c:459
    459 Py_DECREF(old_value); /* which **CAN** re-enter */
    (gdb) up
    #5 0x00002b572036b153 in PyDict_SetItem (op=0xda8ae90, key=0xda2f180, value=0x2b572066c210) at Objects/dictobject.c:701
    701 if (insertdict(mp, key, hash, value) != 0)
    (gdb) up
    #6 0x00002b572036d5c8 in _PyModule_Clear (m=<value optimized out>) at Objects/moduleobject.c:138
    138 PyDict_SetItem(d, key, Py_None);
    (gdb) up
    #7 0x00002b57203dfa34 in PyImport_Cleanup () at Python/import.c:474
    474 _PyModule_Clear(value);
    (gdb) up
    #8 0x00002b57203ed167 in Py_Finalize () at Python/pythonrun.c:434
    434 PyImport_Cleanup();
    (gdb) up
    #9 0x00002b57203f9cdc in Py_Main (argc=<value optimized out>, argv=0x7fff8a7bf478) at Modules/main.c:625
    625 Py_Finalize();
    (gdb) up
    #10 0x00000034d8c1d8b4 in __libc_start_main () from /lib64/libc.so.6
    (gdb) up

Regards,
Yogesh Tomar

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Apr 6, 2010 at 8:28 PM, Yogesh Tomar <ytomar@gmail.com> wrote:
> numpy 1.3.0 also segfaults the same way.

I mean building numpy 1.3.0 against python 2.6.1 instead of 2.6.4. Since the crash happens on a python you built yourself, that's the first thing I would look into before looking for a numpy or python bug.

> Is it the problem with libc library?

Very unlikely, this looks like a ref count bug.

cheers,
David
I think the same: there is some problem with my python installation, because a similar installation of python-2.6.4 and numpy-1.3.0 that I did elsewhere does not segfault on the same code. But I need your help to figure it out. Can you please elaborate on the ref count bug? That might help.

Regards,
Yogesh Tomar
I have confirmed it. It seems like this is a garbage collection issue, and very likely a ref count one. How can this kind of thing be fixed?

Regards,
Yogesh Tomar
On 04/08/2010 10:05 AM, Yogesh Tomar wrote:
> I have confirmed it, It seems like this is a garbage collection issue and very likely a ref count one.
> How can these kind of things can be fixed?

Hi,

Based on the list, I always suspect lapack/atlas issues when I see problems related to eigenvalues. What is your configuration, including lapack/atlas, i.e. the output from 'numpy.show_config()'?

Bruce
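
The configuration Bruce asks for can be captured as text so that it is easy to paste into a reply or compare between two installations. This is a small sketch, assuming a modern Python 3; `numpy.show_config()` is the real call named above, while the wrapper `numpy_build_config` is a hypothetical helper that returns `None` when NumPy cannot be imported.

```python
import contextlib
import io


def numpy_build_config():
    """Return the text printed by numpy.show_config(), or None without NumPy."""
    try:
        import numpy
    except ImportError:
        return None
    buf = io.StringIO()
    # show_config() prints its report, so capture stdout while it runs.
    with contextlib.redirect_stdout(buf):
        numpy.show_config()
    return buf.getvalue()


if __name__ == "__main__":
    report = numpy_build_config()
    print(report if report is not None else "NumPy is not installed")
```

The exact layout of the report differs between NumPy releases, so it is safer to read it than to parse it.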
Here it is:

    atlas_threads_info:
        libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
        library_dirs = ['/home/eqan/tapas/install/atlas/64/lib']
        language = f77

    blas_opt_info:
        libraries = ['ptf77blas', 'ptcblas', 'atlas']
        library_dirs = ['/home/eqan/tapas/install/atlas/64/lib']
        define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')]
        language = c

    atlas_blas_threads_info:
        libraries = ['ptf77blas', 'ptcblas', 'atlas']
        library_dirs = ['/home/eqan/tapas/install/atlas/64/lib']
        language = c

    lapack_opt_info:
        libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
        library_dirs = ['/home/eqan/tapas/install/atlas/64/lib']
        define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')]
        language = f77

    lapack_mkl_info:
        NOT AVAILABLE

    blas_mkl_info:
        NOT AVAILABLE

    mkl_info:
        NOT AVAILABLE

Regards,
Yogesh Tomar
On Fri, Apr 9, 2010 at 12:05 AM, Yogesh Tomar <ytomar@gmail.com> wrote:
> I have confirmed it, It seems like this is a garbage collection issue and very likely a ref count one.
> How can these kind of things can be fixed?

No. As you changed both the numpy and python versions, we have to isolate the problem. So I would prefer to see what happens with the same numpy version built against the Red Hat python (2.6.1) before looking into numpy proper. Given how simple your example is, it is quite unlikely that there is a ref count bug that nobody has encountered before.

David
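
When isolating one variable at a time, as David suggests, it helps to record exactly which interpreter and which NumPy installation each test run used. The sketch below is illustrative only, assuming a modern Python 3; the function name `describe_environment` and the dictionary keys are hypothetical.

```python
import sys


def describe_environment():
    """Collect the facts that distinguish one Python/NumPy build from another."""
    info = {
        "python_version": sys.version.split()[0],
        "executable": sys.executable,
        "prefix": sys.prefix,
    }
    try:
        import numpy
    except ImportError:
        info["numpy_version"] = None
        info["numpy_path"] = None
    else:
        info["numpy_version"] = numpy.__version__
        # __file__ shows which installation was actually imported.
        info["numpy_path"] = numpy.__file__
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%-15s %s" % (key, value))
```

Running this under each interpreter being compared makes it clear whether two tests differ in only one component or in several at once.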
Hi David,

Many thanks for pointing this out. It turns out that it is indeed an ATLAS issue, and I am now able to run the test without any segfault after reinstalling numpy without ATLAS.

A small question: if I use numpy without ATLAS, will any functions be unavailable? Is there any major performance hit that I need to be aware of?

Regards,
Yogesh Tomar
On Fri, Apr 9, 2010 at 3:33 AM, Yogesh Tomar <ytomar@gmail.com> wrote:
> Hi David,
> Many thanks for pointing this out, it turns out that it is indeed an ATLAS issues and now I am able to run the test without any seg fault, after re-installing numpy without ATLAS
> A small question, if I use numpy without ATLAS will there be any functions that will not be available? Is there any major performance hit that I need to be aware of?

The performance hit is likely to come with large arrays. For small arrays the call overhead will dominate and most of the data will reside in cache anyway. It is with large arrays that the cache awareness of ATLAS makes a difference in performance, and the difference can be substantial. Where did you get ATLAS from? Did you build it yourself?

<snip>

Chuck
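
One way to see the size dependence Chuck describes is to time the same operation at several matrix sizes on each build (with and without ATLAS) and compare. This is a rough sketch, assuming a modern Python 3; the helper names `best_of` and `time_eig` are hypothetical and the sizes are arbitrary illustrations, not measurements from the thread.

```python
import time


def best_of(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def time_eig(n, repeats=3):
    """Time numpy.linalg.eig on an n-by-n identity matrix, as in the thread."""
    import numpy  # imported here so best_of() is usable without NumPy
    x = numpy.eye(n)
    return best_of(lambda: numpy.linalg.eig(x), repeats)


if __name__ == "__main__":
    # Arbitrary sizes chosen for illustration only.
    for n in (10, 100, 500):
        print("n=%4d  %.6f s" % (n, time_eig(n)))
```

Taking the best of several runs reduces noise from other processes; the numbers are only meaningful when both builds are timed on the same machine.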
Yes, I built it myself some time back. I am using version 3.8.3.

Regards,
Yogesh Tomar
On 4/6/2010 6:46 AM, Yogesh Tomar wrote:
> import numpy
> import numpy.linalg
> x=numpy.eye(1000)
> for i in range(10):
>     eigenvalues,eigenvectors=numpy.linalg.eig(x)
>     eigenvalues,eigenvectors=numpy.linalg.eig(x)
>     print str(i),'-------------------------------'

I'm not seeing any problem with 1.4.1rc.

Alan Isaac
(Python 2.6.5 on Vista)
participants (6)
- Alan G Isaac
- Bruce Southey
- Charles R Harris
- David Cournapeau
- Yogesh Tomar
- Yogesh Tomar