On 09/21/2012 11:41 AM, Ondřej Čertík wrote:
Hi Orion,
On Thu, Sep 20, 2012 at 2:56 PM, Orion Poplawski wrote:
This is a plea for some help. We've been having trouble getting scipy to
pass all of the tests in the Fedora 18 build with python 3.3 (although it
seems to build okay in Fedora 19). Below are the logs of the build. There
appears to be some kind of memory corruption that manifests itself a little
differently on 32-bit vs. 64-bit. I really have no idea myself how to
pursue debugging this, though I'm happy to provide any more needed
information.
Thanks for testing the latest beta2 release.
Task 4509077 on buildvm-35.phx2.fedoraproject.org
Task Type: buildArch (scipy-0.11.0-0.1.rc2.fc18.src.rpm, i686)
logs:
http://koji.fedoraproject.org/koji/getfile?taskID=4509077&name=build.log
This link has the following stacktrace:
/lib/libpython3.3m.so.1.0(PyMem_Free+0x1c)[0xbf044c]
/usr/lib/python3.3/site-packages/numpy/core/multiarray.cpython-33m.so(+0x4d52b)[0x42252b]
/usr/lib/python3.3/site-packages/numpy/core/multiarray.cpython-33m.so(+0xcb7c5)[0x4a07c5]
/usr/lib/python3.3/site-packages/numpy/core/multiarray.cpython-33m.so(+0xcbc5e)[0x4a0c5e]
Which indeed looks like it is in NumPy. Would you be able to obtain a full stack trace?
There have certainly been segfaults in Python 3.3 with NumPy, but we've
fixed all that we could reproduce. That doesn't mean there couldn't be
more. If you could nail it down a little bit more, that would be
great. I'll help once I can reproduce it somehow.
Ondrej
Trying to get back to this as we still see it with numpy 1.7.0 and scipy 0.11.
I'm seeing a segfault in malloc_consolidate(), which seems like it would only
occur if there were problems earlier, so I'm not sure a stack trace is all
that useful.
Starting program: /usr/bin/python3
/export/home/orion/redhat/BUILDROOT/scipy-0.11.0-3.fc19.x86_64/usr/lib64/python3.3/site-packages/scipy/linalg/tests/test_decomp.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
..............
Program received signal SIGSEGV, Segmentation fault.
0x0000003d8d67bdad in malloc_consolidate (av=av@entry=0x3d8d9b1740 )
at malloc.c:4151
4151 unlink(av, nextchunk, bck, fwd);
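Because the crash lands in the allocator rather than at the faulty write, one way to narrow things down is to run each test file in its own interpreter and note which ones die from a signal. A minimal sketch, assuming a POSIX system; `run_isolated` is a hypothetical helper name, not part of numpy or scipy:

```python
import signal
import subprocess
import sys


def run_isolated(cmd):
    """Run cmd in a child process.

    Returns the number of the signal that killed the child, or None if the
    child exited on its own (whatever its exit status).
    """
    returncode = subprocess.call(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # On POSIX a negative return code -N means "killed by signal N".
    if returncode < 0:
        return -returncode
    return None


if __name__ == "__main__":
    # Hypothetical usage: python3 run_isolated.py test_decomp.py test_basic.py
    for path in sys.argv[1:]:
        sig = run_isolated([sys.executable, path])
        if sig == signal.SIGSEGV:
            print(path, "segfault")
        elif sig is not None:
            print(path, "killed by signal", sig)
        else:
            print(path, "ok")
```

This only tells which file triggers the crash, not where the memory was first corrupted; that still needs a tool such as valgrind.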
Here's some of the backtrace anyway:
#0 0x0000003d8d67bdad in malloc_consolidate (av=av@entry=0x3d8d9b1740 )
at malloc.c:4151
#1 0x0000003d8d67d09e in _int_malloc (av=0x3d8d9b1740 , bytes=<optimized out>)
at malloc.c:3422
#2 0x0000003d8d67f443 in __GI___libc_malloc (bytes=2632) at malloc.c:2862
#3 0x00007ffff121816c in PyArray_IterNew (obj=)
at numpy/core/src/multiarray/iterators.c:385
#4 0x00007ffff1218201 in PyArray_IterAllButAxis (obj=obj@entry=
, inaxis=inaxis@entry=0x7fffffff873c)
at numpy/core/src/multiarray/iterators.c:488
#5 0x00007ffff1257970 in _new_argsort (which=NPY_QUICKSORT, axis=0, op=0xe02fd0)
at numpy/core/src/multiarray/item_selection.c:815
#6 PyArray_ArgSort (op=op@entry=0xe02fd0, axis=0, which=NPY_QUICKSORT)
at numpy/core/src/multiarray/item_selection.c:1104
#7 0x00007ffff125873a in array_argsort (self=0xe02fd0, args=<optimized out>,
kwds=<optimized out>) at numpy/core/src/multiarray/methods.c:1213
#8 0x0000003b74d0cc8e in call_function (oparg=<optimized out>,
pp_stack=0x7fffffff8998)
at /usr/src/debug/Python-3.3.0/Python/ceval.c:4091
#9 PyEval_EvalFrameEx (f=f@entry=
Frame 0xd3ecb0, for file
/usr/lib64/python3.3/site-packages/numpy/core/fromnumeric.py, line 681, in
argsort (a=, axis=-1, kind='quicksort',
order=None, argsort=),
throwflag=throwflag@entry=0) at
/usr/src/debug/Python-3.3.0/Python/ceval.c:2703
#10 0x0000003b74d0de63 in PyEval_EvalCodeEx (_co=_co@entry=,
globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>,
argcount=argcount@entry=1, kws=0xe23ab8, kwcount=kwcount@entry=0,
defs=0x7ffff1a965b8,
defcount=3, kwdefs=0x0, closure=0x0) at
/usr/src/debug/Python-3.3.0/Python/ceval.c:3462
#11 0x0000003b74d0c707 in fast_function (nk=0, na=1, n=<optimized out>, pp_stack=
0x7fffffff8c88, func=)
at /usr/src/debug/Python-3.3.0/Python/ceval.c:4189
#12 call_function (oparg=<optimized out>, pp_stack=0x7fffffff8c88)
at /usr/src/debug/Python-3.3.0/Python/ceval.c:4112
(gdb) up 3
#3 0x00007ffff121816c in PyArray_IterNew (obj=)
at numpy/core/src/multiarray/iterators.c:385
385 it = (PyArrayIterObject *)PyArray_malloc(sizeof(PyArrayIterObject));
(gdb) print *obj
$4 = {ob_refcnt = 5, ob_type = 0x7ffff14c6900 }
(gdb) list
380 PyErr_BadInternalCall();
381 return NULL;
382 }
383 ao = (PyArrayObject *)obj;
384
385 it = (PyArrayIterObject *)PyArray_malloc(sizeof(PyArrayIterObject));
386 PyObject_Init((PyObject *)it, &PyArrayIter_Type);
387 /* it = PyObject_New(PyArrayIterObject, &PyArrayIter_Type);*/
388 if (it == NULL) {
389 return NULL;
valgrind reports problems like:
==10886== Invalid write of size 8
==10886== at 0x3D9C5CB576: dlacpy_ (in /usr/lib64/atlas/liblapack.so.3.0)
==10886== by 0x3D9C6481F7: dsbevx_ (in /usr/lib64/atlas/liblapack.so.3.0)
==10886== by 0x115D8212: ??? (in
/export/home/orion/redhat/BUILDROOT/scipy-0.11.0-3.fc19.x86_64/usr/lib64/python3.3/site-packages/scipy/linalg/flapack.cpython-33m.so)
==10886== by 0x3B74C5EF8E: PyObject_Call (abstract.c:2082)
==10886== by 0x3B74D07DDC: PyEval_EvalFrameEx (ceval.c:4311)
==10886== by 0x3B74D0C9B4: PyEval_EvalFrameEx (ceval.c:4179)
==10886== by 0x3B74D0DE62: PyEval_EvalCodeEx (ceval.c:3462)
==10886== by 0x3B74D0C706: PyEval_EvalFrameEx (ceval.c:4189)
==10886== by 0x3B74D0DE62: PyEval_EvalCodeEx (ceval.c:3462)
==10886== by 0x3B74C8547F: function_call (funcobject.c:633)
==10886== by 0x3B74C5EF8E: PyObject_Call (abstract.c:2082)
==10886== by 0x3B74D05F7F: PyEval_EvalFrameEx (ceval.c:4406)
==10886== Address 0xbbc8cd0 is 0 bytes after a block of size 80 alloc'd
==10886== at 0x4A0883C: malloc (vg_replace_malloc.c:270)
==10886== by 0xE8A103A: PyDataMem_NEW (multiarraymodule.c:3492)
==10886== by 0xE8C3F74: PyArray_NewFromDescr (ctors.c:970)
==10886== by 0x115E032B: array_from_pyobj (in
/export/home/orion/redhat/BUILDROOT/scipy-0.11.0-3.fc19.x86_64/usr/lib64/python3.3/site-packages/scipy/linalg/flapack.cpython-33m.so)
==10886== by 0x115D7F5E: ??? (in
/export/home/orion/redhat/BUILDROOT/scipy-0.11.0-3.fc19.x86_64/usr/lib64/python3.3/site-packages/scipy/linalg/flapack.cpython-33m.so)
==10886== by 0x3B74C5EF8E: PyObject_Call (abstract.c:2082)
==10886== by 0x3B74D07DDC: PyEval_EvalFrameEx (ceval.c:4311)
==10886== by 0x3B74D0C9B4: PyEval_EvalFrameEx (ceval.c:4179)
==10886== by 0x3B74D0DE62: PyEval_EvalCodeEx (ceval.c:3462)
==10886== by 0x3B74D0C706: PyEval_EvalFrameEx (ceval.c:4189)
==10886== by 0x3B74D0DE62: PyEval_EvalCodeEx (ceval.c:3462)
==10886== by 0x3B74C8547F: function_call (funcobject.c:633)
==10886==
==10886== Invalid read of size 8
==10886== at 0x3D9C61DAD5: dlasr_ (in /usr/lib64/atlas/liblapack.so.3.0)
==10886== by 0x3D9C663092: dsteqr_ (in /usr/lib64/atlas/liblapack.so.3.0)
==10886== by 0x3D9C648290: dsbevx_ (in /usr/lib64/atlas/liblapack.so.3.0)
==10886== by 0x115D8212: ??? (in
/export/home/orion/redhat/BUILDROOT/scipy-0.11.0-3.fc19.x86_64/usr/lib64/python3.3/site-packages/scipy/linalg/flapack.cpython-33m.so)
==10886== by 0x3B74C5EF8E: PyObject_Call (abstract.c:2082)
==10886== by 0x3B74D07DDC: PyEval_EvalFrameEx (ceval.c:4311)
==10886== by 0x3B74D0C9B4: PyEval_EvalFrameEx (ceval.c:4179)
==10886== by 0x3B74D0DE62: PyEval_EvalCodeEx (ceval.c:3462)
==10886== by 0x3B74D0C706: PyEval_EvalFrameEx (ceval.c:4189)
==10886== by 0x3B74D0DE62: PyEval_EvalCodeEx (ceval.c:3462)
==10886== by 0x3B74C8547F: function_call (funcobject.c:633)
==10886== by 0x3B74C5EF8E: PyObject_Call (abstract.c:2082)
==10886== Address 0xbbc8dc0 is not stack'd, malloc'd or (recently) free'd
So perhaps this is an ATLAS issue, or a problem with the way scipy/numpy calls it.
I'll try to look into it more. Suggestions welcome.
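The two addresses in the valgrind report can be related with a little arithmetic. A rough sketch, using only the numbers quoted above and assuming the block holds 8-byte doubles and that both accesses refer to the same array (the report states neither):

```python
import re

# Lines copied from the valgrind report above.
REPORT = """\
==10886== Address 0xbbc8cd0 is 0 bytes after a block of size 80 alloc'd
==10886== Address 0xbbc8dc0 is not stack'd, malloc'd or (recently) free'd
"""

ELEM = 8  # assumption: the array holds 8-byte doubles

addresses = [int(a, 16) for a in re.findall(r"Address (0x[0-9a-f]+)", REPORT)]
after = re.search(r"is (\d+) bytes after a block of size (\d+)", REPORT)
offset, size = int(after.group(1)), int(after.group(2))

block_end = addresses[0] - offset   # first byte past the allocation
block_start = block_end - size
n_elems = size // ELEM              # elements that fit in the block

print("block: 0x%x..0x%x (%d elements)" % (block_start, block_end, n_elems))
for addr in addresses:
    # Element index each bad access would have if the routine treated the
    # block as the start of one long array of doubles.
    print("0x%x -> element index %d" % (addr, (addr - block_start) // ELEM))
```

Under those assumptions the block holds 10 doubles, the invalid write lands at index 10 (one past the end), and the invalid read is at index 40, which would suggest the LAPACK routine expects a considerably larger array than the wrapper allocated.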
--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder Office FAX: 303-415-9702
3380 Mitchell Lane orion@nwra.com
Boulder, CO 80301 http://www.nwra.com