[lxml-dev] segfaults under python2.3 compiled with gcc 2.95.2
Hi,

-- long-post-warning --

I run into segfaults using a python2.3, gcc 2.95.2 - build (yes I know, mighty old compiler :-):

Lib versions:
lxml 1.1.2
libxml2-2.6.23
libxslt-1.1.15

Tests work fine with everything gcc3.4.4-built, using python2.4 (same lib versions), see below.

Both segfaults seem to arise when an error is raised internally (though it is unclear why the first is an error - it does not raise an exception when run under python2.4/gcc3.4.4).

1. Using unprefixed namespace:

python2.3
Python 2.3.4 (#6, Jul 20 2004, 11:09:38)
[GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
from lxml import etree
elt = etree.fromstring("""<RvXML xmlns="myURI"></RvXML>""")
Segmentation Fault (core dumped)
backtrace:
gdb /apps/prod/bin/python2.3
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.6"...
(gdb) r -i -c 'from lxml import etree; elt = etree.fromstring("""<RvXML
xmlns="myURI"></RvXML>""")'
Starting program: /apps/prod/bin/python2.3 -i -c 'from lxml import etree;
elt = etree.fromstring("""<RvXML xmlns="myURI"></RvXML>""")'
[New LWP 2 ]
[New LWP 3 ]
[New LWP 4 ]
[New LWP 5 ]
Program received signal SIGSEGV, Segmentation fault.
0xff07b760 in pthread_mutex_lock () from /usr/lib/libthread.so.1
(gdb) bt
#0 0xff07b760 in pthread_mutex_lock () from /usr/lib/libthread.so.1
#1 0xaa760 in PyThread_release_lock (lock=0x0) at
Python/thread_pthread.h:532
#2 0x840c0 in PyEval_ReleaseThread (tstate=0x11abc0) at Python/ceval.c:339
#3 0xa4338 in PyGILState_Release (oldstate=PyGILState_UNLOCKED) at
Python/pystate.c:473
#4 0xfe68ceb4 in __pyx_f_5etree__receiveError
(__pyx_v_c_log_handler=0x2146c0, __pyx_v_error=0x206700)
at src/lxml/etree.c:15789
#5 0xfe4324b4 in __xmlRaiseError (schannel=0xfe68ce68
<__pyx_f_5etree__receiveError>,
channel=0xfe432880 <xmlParserWarning>, data=0x2146c0, ctx=0x206580,
nod=0x0, domain=1, code=100,
level=XML_ERR_WARNING, file=0x0, line=1, str1=0x25f01b "myURI",
str2=0x0, str3=0x0, int1=0, col=13,
msg=0xfe518560 "xmlns: URI %s is not absolute\n") at error.c:612
#6 0xfe43668c in xmlWarningMsg (ctxt=0xfe518560,
error=XML_WAR_NS_URI_RELATIVE,
msg=0xfe518560 "xmlns: URI %s is not absolute\n", str1=0x25f01b
"myURI", str2=0x0) at parser.c:415
#7 0xfe44593c in xmlParseStartTag2 (ctxt=0x206580, pref=0xffbeee34,
URI=0xffbeee30, tlen=0xffbeee2c)
at parser.c:7854
#8 0xfe44719c in xmlParseElement (ctxt=0x206580) at parser.c:8437
#9 0xfe448b9c in xmlParseDocument (ctxt=0x206580) at parser.c:9129
#10 0xfe44e454 in xmlDoRead (ctxt=0x206580, URL=0x0, encoding=0x0,
options=16386, reuse=1)
at parser.c:13101
#11 0xfe44e80c in xmlCtxtReadMemory (ctxt=0x206580, buffer=0x24e568 "",
size=29, URL=0x0, encoding=0x0,
options=16386) at parser.c:13379
#12 0xfe649bd0 in __pyx_f_5etree_11_BaseParser__parseDoc
(__pyx_v_self=0x1c8630,
__pyx_v_c_text=0x14f8bc "

2. Using illegal tag names:

from lxml import etree
elt = etree.fromstring('<root>
</root>')
elt = etree.fromstring('<root> </root>')
Program received signal SIGSEGV, Segmentation fault.
0xff138dcc in mutex_lock_impl () from /lib/libc.so.1
(gdb) bt
#0 0xff138dcc in mutex_lock_impl () from /lib/libc.so.1
#1 0xaa760 in PyThread_release_lock (lock=0x0) at
Python/thread_pthread.h:532
#2 0x840c0 in PyEval_ReleaseThread (tstate=0x119cc8) at Python/ceval.c:339
#3 0xa4338 in PyGILState_Release (oldstate=PyGILState_UNLOCKED) at
Python/pystate.c:473
#4 0xfef0ceb4 in __pyx_f_5etree__receiveError
(__pyx_v_c_log_handler=0x2179e0, __pyx_v_error=0x209f30)
at src/lxml/etree.c:15789
#5 0xfecb24b4 in __xmlRaiseError (schannel=0xfef0ce68
<__pyx_f_5etree__receiveError>, channel=0,
data=0x2179e0, ctx=0x209db0, nod=0x0, domain=1, code=68,
level=XML_ERR_FATAL, file=0x0, line=1,
str1=0x0, str2=0x0, str3=0x0, int1=0, col=9, msg=0xfed982d8 "error
parsing attribute name\n")
at error.c:612
#6 0xfecb65c0 in xmlFatalErrMsg (ctxt=0x209db0,
error=XML_ERR_NAME_REQUIRED,
msg=0xfed982d8 "error parsing attribute name\n") at parser.c:387
#7 0xfecc5428 in xmlParseAttribute2 (ctxt=0x209db0, pref=0x0,
elem=0x261933 "k", prefix=0xffbfec80,
value=0xffbfec7c, len=0xffbfec78, alloc=0xffbfec74) at parser.c:7676
#8 0xfecc5864 in xmlParseStartTag2 (ctxt=0x209db0, pref=0xffbfed04,
URI=0xffbfed00, tlen=0xffbfecfc)
at parser.c:7838
#9 0xfecc719c in xmlParseElement (ctxt=0x209db0) at parser.c:8437
#10 0xfecc6fa4 in xmlParseContent (ctxt=0x209db0) at parser.c:8361
#11 0xfecc7464 in xmlParseElement (ctxt=0x209db0) at parser.c:8521
#12 0xfecc8b9c in xmlParseDocument (ctxt=0x209db0) at parser.c:9129
#13 0xfecce454 in xmlDoRead (ctxt=0x209db0, URL=0x0, encoding=0x0,
options=16386, reuse=1)
at parser.c:13101
#14 0xfecce80c in xmlCtxtReadMemory (ctxt=0x209db0, buffer=0x250e78 "",
size=34, URL=0x0, encoding=0x0,
options=16386) at parser.c:13379
#15 0xfeec9bd0 in __pyx_f_5etree_11_BaseParser__parseDoc
(__pyx_v_self=0x21a1f0,
__pyx_v_c_text=0x21a274 "<root>

Both work fine with python2.4, compiled with gcc 3.4.4:

from lxml import etree
elt = etree.fromstring("""<RvXML xmlns="myURI"></RvXML>""")

2.
$ /apps/pydev/gcc/3.4.4/bin/python2.4
Python 2.4.3 (#1, Apr 12 2006, 11:57:09)
[GCC 3.4.4] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
from lxml import etree
elt = etree.fromstring('<root>
</root>')
elt = etree.fromstring('<root> </root>')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "etree.pyx", line 1695, in etree.XML
  File "parser.pxi", line 920, in etree._parseMemoryDocument
  File "parser.pxi", line 816, in etree._parseDoc
  File "parser.pxi", line 502, in etree._BaseParser._parseDoc
  File "parser.pxi", line 605, in etree._handleParseResult
  File "parser.pxi", line 576, in etree._raiseParseError
etree.XMLSyntaxError: line 1: Extra content at the end of the document
Greetings,
Holger

P.S.: I haven't forgotten about the objectify benchmarks, but these days I find it hard dedicating myself to the true cause - but one fine day I hope to come up with them.

The contents of this e-mail are confidential. If you are not the named addressee or if this transmission has been addressed to you in error, please notify the sender immediately and then delete this e-mail. Any unauthorized copying and transmission is forbidden. E-Mail transmission cannot be guaranteed to be secure. If verification is required, please request a hard copy version.
Hi Holger, Holger Joukl wrote:
I run into segfaults using a python2.3, gcc 2.95.2 - build (yes I know, mighty old compiler :-):
and difficult to track down problems. Impossible to say if it's because of a specific system environment (who compiled your pthreads, for example? where did your Python binary come from?) or because of the compiler itself, specific compiler options... Have you tried it with 'safe' options like "-O -m386" and the like? (or the same for Solaris respectively).
Tests work fine with everything gcc3.4.4-built, using python2.4 (same lib versions)
Did you use the same compiler for the libs?
Both segfaults seem to arise when an error is raised internally (though it is unclear why the first is an error - it does not raise an exception when run under python2.4/gcc3.4.4).
But the stack trace tells me that you got a warning from libxml2 that went up into the Python environment:
1. Using unprefixed namespace:
python2.3
elt = etree.fromstring("""<RvXML xmlns="myURI"></RvXML>""") Segmentation Fault (core dumped)
Program received signal SIGSEGV, Segmentation fault.
0xff07b760 in pthread_mutex_lock () from /usr/lib/libthread.so.1
(gdb) bt
#0 0xff07b760 in pthread_mutex_lock () from /usr/lib/libthread.so.1
#1 0xaa760 in PyThread_release_lock (lock=0x0) at Python/thread_pthread.h:532
#2 0x840c0 in PyEval_ReleaseThread (tstate=0x11abc0) at Python/ceval.c:339
#3 0xa4338 in PyGILState_Release (oldstate=PyGILState_UNLOCKED) at Python/pystate.c:473
#4 0xfe68ceb4 in __pyx_f_5etree__receiveError (__pyx_v_c_log_handler=0x2146c0, __pyx_v_error=0x206700) at src/lxml/etree.c:15789
#5 0xfe4324b4 in __xmlRaiseError (schannel=0xfe68ce68 <__pyx_f_5etree__receiveError>, channel=0xfe432880 <xmlParserWarning>, data=0x2146c0, ctx=0x206580, nod=0x0, domain=1, code=100, level=XML_ERR_WARNING, file=0x0, line=1, str1=0x25f01b "myURI", str2=0x0, str3=0x0, int1=0, col=13, msg=0xfe518560 "xmlns: URI %s is not absolute\n") at error.c:612
#6 0xfe43668c in xmlWarningMsg (ctxt=0xfe518560, error=XML_WAR_NS_URI_RELATIVE, msg=0xfe518560 "xmlns: URI %s is not absolute\n", str1=0x25f01b "myURI", str2=0x0) at parser.c:415
[...]
Anyway, it doesn't look like the error is the cause, it's more because of a problem when acquiring the GIL.
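
To make the warning-versus-error distinction concrete: frame #5 above carries level=XML_ERR_WARNING, so on a working build the namespace case should parse successfully and only leave a log entry behind. The following is a minimal sketch of how one might look at that log, assuming a current lxml release where the parser object exposes `error_log`. The helper name `parse_and_collect_log` is made up for illustration, and whether the relative-URI warning actually shows up depends on the libxml2 version in use.

```python
try:
    from lxml import etree
except ImportError:  # lxml is not installed, so there is nothing to show
    etree = None


def parse_and_collect_log(xml_text):
    """Parse xml_text and return (root tag, list of parser log entries).

    Hypothetical helper for illustration only, not part of lxml.
    """
    parser = etree.XMLParser()
    root = etree.fromstring(xml_text, parser)
    # Each log entry records its severity level, error type and message.
    entries = [(e.level_name, e.type_name, e.message)
               for e in parser.error_log]
    return root.tag, entries


if etree is not None:
    tag, entries = parse_and_collect_log('<RvXML xmlns="myURI"></RvXML>')
    print(tag)
    for level, type_name, message in entries:
        print(level, type_name, message)
```

If the parse returns a root element and no exception is raised, the relative namespace URI was treated as a warning at most, which matches the python2.4/gcc3.4.4 behaviour reported above.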
2. Using illegal tag names:
Here is the stack trace (no clue what the "warning: Lowest section in ..." is supposed to mean, btw):
elt = etree.fromstring('<root>
</root>')
Program received signal SIGSEGV, Segmentation fault.
0xff138dcc in mutex_lock_impl () from /lib/libc.so.1
(gdb) bt
#0 0xff138dcc in mutex_lock_impl () from /lib/libc.so.1
#1 0xaa760 in PyThread_release_lock (lock=0x0) at Python/thread_pthread.h:532
#2 0x840c0 in PyEval_ReleaseThread (tstate=0x119cc8) at Python/ceval.c:339
#3 0xa4338 in PyGILState_Release (oldstate=PyGILState_UNLOCKED) at Python/pystate.c:473
#4 0xfef0ceb4 in __pyx_f_5etree__receiveError (__pyx_v_c_log_handler=0x2179e0, __pyx_v_error=0x209f30) at src/lxml/etree.c:15789
#5 0xfecb24b4 in __xmlRaiseError (schannel=0xfef0ce68 <__pyx_f_5etree__receiveError>, channel=0, data=0x2179e0, ctx=0x209db0, nod=0x0, domain=1, code=68, level=XML_ERR_FATAL, file=0x0, line=1, str1=0x0, str2=0x0, str3=0x0, int1=0, col=9, msg=0xfed982d8 "error parsing attribute name\n") at error.c:612
#6 0xfecb65c0 in xmlFatalErrMsg (ctxt=0x209db0, error=XML_ERR_NAME_REQUIRED, msg=0xfed982d8 "error parsing attribute name\n") at parser.c:387
Looks pretty similar to me, both segfault in pthreads mutex calls. Maybe it's not a compiler issue but an issue with Python 2.3 on Solaris - or a mixture of many causes...
Both work fine with python2.4, compiled with gcc 3.4.4: [...]
Uhm, on the same machine? What do you need the gcc 2.95 for, then? Personally, I feel no big incentive in supporting a compiler as old as 2.95. GCC is at version 4 by now and if we run fine with 3.4, that's all I'd ask for. Stefan
Hi Stefan,
no segfault with gcc2.95.2-built python2.4, see below...still testing.
Stefan Behnel
Hi Holger,
Holger Joukl wrote:
I run into segfaults using a python2.3, gcc 2.95.2 - build (yes I know, mighty old compiler :-):
and difficult to track down problems. Impossible to say if it's because of a specific system environment (who compiled your pthreads, for example? where did your Python binary come from?) or because of the compiler itself, specific compiler options...
Have you tried it with 'safe' options like "-O -m386" and the like? (or the same for Solaris respectively).

We compile Python from source, pthreads comes with Solaris afaik. So in theory I have control over the usage of compiler options. I tried some more combinations and do not run into the segfault with a gcc2.95.2-built python2.4.3. So my current best guess would be it is some python2.3-related problem. I _do_ run into the segfault in every combination where I used python2.3 built with gcc2.95.2. I haven't tried building python2.3 with gcc3.4.4 so maybe I will do this if I find time.

Tests work fine with everything gcc3.4.4-built, using python2.4 (same lib versions)
Did you use the same compiler for the libs?
Yes, gcc 3.4.4 for python interpreter, libxml2, libxslt, lxml.
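
For repeating this kind of build-combination check, the reproducer can be run in a child process per interpreter, so that a segfault only kills the child and gets reported as a signal. This is a rough sketch under stated assumptions: the driver script itself needs a current Python 3 on a POSIX system, `run_snippet` and `INTERPRETERS` are hypothetical names, and the two paths are simply the ones that appear in this thread.

```python
import signal
import subprocess
import sys


def run_snippet(interpreter, snippet, timeout=60):
    """Run `interpreter -c snippet` and classify how the child ended."""
    try:
        proc = subprocess.run(
            [interpreter, "-c", snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except OSError as exc:
        # Typically: the interpreter path does not exist on this machine.
        return "could not start: %s" % exc
    if proc.returncode < 0:
        # On POSIX a negative return code is the number of the fatal signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        return "exit status %d" % proc.returncode
    return "ok"


# Interpreter paths as quoted in this thread; adjust for the local setup.
INTERPRETERS = [
    "/apps/prod/bin/python2.3",
    "/apps/pydev/gcc/3.4.4/bin/python2.4",
]

# Reproducer for case 1 (unprefixed namespace), taken from the first post.
NAMESPACE_CASE = (
    'from lxml import etree; '
    'etree.fromstring("""<RvXML xmlns="myURI"></RvXML>""")'
)

if __name__ == "__main__":
    for interpreter in INTERPRETERS:
        print(interpreter, "->", run_snippet(interpreter, NAMESPACE_CASE))
```

A crashing build would show up as "killed by SIGSEGV", while a missing lxml in the child shows up as a nonzero exit status, so the two failure modes stay distinguishable.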
Both segfaults seem to arise when an error is raised internally (though it is unclear why the first is an error - it does not raise an exception when run under python2.4/gcc3.4.4).
But the stack trace tells me that you got a warning from libxml2 that went up into the Python environment: [...] Looks pretty similar to me, both segfault in pthreads mutex calls. Maybe it's not a compiler issue but an issue with Python 2.3 on Solaris - or a mixture of many causes...
Both work fine with python2.4, compiled with gcc 3.4.4: [...]
Uhm, on the same machine? What do you need the gcc 2.95 for, then?
Yes, on the same sparc solaris box. The reason for sticking to gcc 2.95.2 is an old _heavily_ modified boost.python we used to auto-wrap (with openc++) the TIB/Rv-API. Due to different partial template specialization with newer compilers this mechanism cannot be easily changed, if at all. A main reason why we want to get rid of this, migrating either to a newer boost version or another wrapping tool. But we have successfully tried running the old rv module with the gcc3.4.4-built python2.4 so we are not really stuck here.
Personally, I feel no big incentive in supporting a compiler as old as 2.95. GCC is at version 4 by now and if we run fine with 3.4, that's all I'd ask for.
Stefan
For plain C programs I think gcc 2.95.2 should still work.

Holger