How to debug a Python application that crashes occasionally

jacky wang bugking.wang at gmail.com
Wed Apr 21 02:55:51 EDT 2010


Hello

  Recently I ran into a problem with a Python application running on
python2.5 | debian/lenny amd64: it crashes occasionally in our
production environment. The crashes started right after we upgraded
the application from python2.4 | debian/etch amd64.

  I configured the system to produce core dumps and examined them with
gdb, following the guidelines at http://wiki.python.org/moin/DebuggingWithGdb,
but the results left me more confused.
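
  For reference, the core-file limit can also be raised from inside
the process instead of in the shell. This is only a minimal sketch,
assuming a Linux system whose hard limit is not zero; the function
name enable_core_dumps is made up for illustration and is not part of
the wiki guide:

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Assumption: the hard limit set by the administrator is non-zero,
    otherwise no core file will be written regardless of this call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may always raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


limits = enable_core_dumps()
```

Calling it once at application start-up should be enough, since child
processes inherit the limit.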

  The first crash happened in a call into the python-xml module, which
is claimed to be a pure Python module and so should not be able to
crash the interpreter. Because the application is relatively large, I
cannot show you the exact source code involved in the crash, only the
library modules on the stack. GDB shows that it crashed in a string
join operation:

#0  string_join (self=0x7f7075baf030, orig=<value optimized out>)
at ../Objects/stringobject.c:1795
1795    ../Objects/stringobject.c: No such file or directory.
        in ../Objects/stringobject.c

and the pystack macro shows the details:

(gdb) pystack
/usr/lib/python2.5/StringIO.py (271): getvalue
/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py (62): toprettyxml
/usr/lib/python2.5/site-packages/_xmlplus/dom/minidom.py (47): toxml

  At that time we also found that the python-xml module had
performance problems for our application, so we decided to replace it
with python-lxml. After that replacement, the crash was gone. That
seems a bit odd to me, but anyway, it is gone.

  Unfortunately, two other kinds of crashes appeared after that, and
the core dumps show that they are not related to the replacement.

  One ends with "Program terminated with signal 11", and the pystack
macro shows that it crashed while calling the built-in id() function:

#0  visit_decref (op=0x20200a3e22726574, data=0x0)
at ../Modules/gcmodule.c:270
270     ../Modules/gcmodule.c: No such file or directory.
        in ../Modules/gcmodule.c
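
  The op value in that frame does not look like a typical heap
address, so it may be worth checking what its bytes are. A minimal
sketch, assuming the little-endian byte order of amd64 (the helper
name pointer_as_bytes is made up for illustration):

```python
import struct


def pointer_as_bytes(value):
    """Return the 8 bytes of a 64-bit value in little-endian memory order."""
    return struct.pack('<Q', value)


raw = pointer_as_bytes(0x20200a3e22726574)
# The bytes are: 't', 'e', 'r', '"', '>', newline, space, space
```

All eight bytes are printable ASCII or whitespace. That might mean
text data was written over an object pointer, but this is only a
guess and not something the core dump proves.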


  The other ends with "Program terminated with signal 7", and the
pystack macro shows that it crashed in exactly the same operation
(string join) as the first one (python-xml), but in a different
library, python-simplejson:

#0  string_join (self=0x7f5149877030, orig=<value optimized out>)
at ../Objects/stringobject.c:1795
1795    ../Objects/stringobject.c: No such file or directory.
        in ../Objects/stringobject.c

(gdb) pystack
/var/lib/python-support/python2.5/simplejson/encoder.py (367): encode
/var/lib/python-support/python2.5/simplejson/__init__.py (243): dumps

  I'm not good with gdb or C programming, so I tried some other ways
to dig further:
   * I got the source code of python2.5, but could not figure out the
reason for the crash :(
   * Since Debian provides a python-dbg package, I tried running the
application with the python2.5-dbg interpreter instead of python2.5,
to get more debug information in the core dump file. Unfortunately,
my application uses a bunch of C modules, and not all of them provide
a -dbg package in Debian/Lenny, so this has not made any progress yet.
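
  To see which C modules are actually loaded, and therefore which
-dbg packages would be needed, something like the following could be
run inside the application after it has imported everything. It is a
rough sketch: the function name is made up for illustration, and it
assumes that a C module is any loaded module whose file ends in .so
(or .pyd), so modules compiled into the interpreter itself are missed:

```python
import sys


def loaded_extension_modules():
    """Return sorted (name, path) pairs for loaded modules backed by a shared object."""
    found = []
    for name, module in list(sys.modules.items()):
        # Built-in modules and placeholder entries have no __file__.
        path = getattr(module, '__file__', None)
        if path and path.endswith(('.so', '.pyd')):
            found.append((name, path))
    return sorted(found)


for name, path in loaded_extension_modules():
    sys.stdout.write('%s\t%s\n' % (name, path))
```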

  I would really appreciate it if somebody could help me work out how
to debug these Python crashes.

  Thanks in advance!

BR
Jacky Wang


