possible bug in lxml ? test_thread_error_log fail on openbsd

Hi,

I am chasing a bug (python3.6 core dump, apparently a double free) in a python3 program of mine, which uses lxml through a dependency (weboob in particular).

I am still unsure whether the problem is in lxml or in one of its dependencies (libxml2 for example).

I am able to reproduce the problem with the lxml test suite in both python2 and python3 environments, under OpenBSD. The failing test is test_thread_error_log. For simplicity I will speak only about test_thread_error_log.

First, on OpenBSD, the malloc subsystem is very sensitive to invalid use of the malloc family of functions: it crashes (calls abort(3)) early in order to avoid misuse and/or possible exploitation of a bug. That could explain why the test suite doesn't pass on OpenBSD whereas it does on other systems.

For good reproducibility of the problem (near 100%), I am using the MALLOC_OPTIONS=S environment variable, which enables the malloc options for security auditing. In short, malloc(3) and free(3) will do more checks, and abort more readily on a problem. See https://man.openbsd.org/malloc.3#MALLOC_OPTIONS for details.

I am using lxml-3.7.0, as present in OpenBSD -current. It seems to be compiled with standard options.

    $ MALLOC_OPTIONS=S python3.6 test.py -vv '.' test_thread_error_log
    Comparing with ElementTree 1.3.0
    TESTED VERSION: 3.7.0
    Python: sys.version_info(major=3, minor=6, micro=8, releaselevel='final', serial=0)
    lxml.etree: (3, 7, 0, 0)
    libxml used: (2, 9, 8)
    libxml compiled: (2, 9, 8)
    libxslt used: (1, 1, 32)
    libxslt compiled: (1, 1, 32)
    test_thread_error_log (lxml.tests.test_threading.ThreadingTestCase) ... python3.6(41533) in free(): chunk canary corrupted 0xf169db4b100 0x37@0x37 (double free?)
    Abort trap (core dumped)

or

    $ MALLOC_OPTIONS=S python3.6 test.py -vv 'threading' test_thread_error_log
    test_thread_error_log (lxml.tests.test_threading.ThreadingTestCase) ... python3.6(79924) in free(): chunk is already free 0x9dccffba6c0
    Abort trap (core dumped)

or

    $ MALLOC_OPTIONS=S python3.6 test.py -vv 'threading' test_thread_error_log
    test_thread_error_log (lxml.tests.test_threading.ThreadingTestCase) ... python3.6(34868) in free(): chunk canary corrupted 0x1e598f5e9d80 0x37@0x37 (double free?)
    Abort trap (core dumped)

    $ gdb python3.6 python3.6.core
    ...
    Core was generated by `python3.6'.
    Program terminated with signal SIGABRT, Aborted.
    #0  thrkill () at -:3
    3       -: No such file or directory.
    [Current thread is 1 (process 447114)]
    (gdb) bt
    #0  thrkill () at -:3
    #1  0x00001e597f4d477e in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
    #2  0x00001e597f444539 in wrterror (d=0x1e58f73c71f0, msg=0x1e597f3fe3e9 "chunk canary corrupted %p %#tx@%#zx%s") at /usr/src/lib/libc/stdlib/malloc.c:293
    #3  0x00001e597f44748a in validate_canary (d=<optimized out>, ptr=<optimized out>, sz=1, allocated=<optimized out>) at /usr/src/lib/libc/stdlib/malloc.c:1025
    #4  find_chunknum (d=0x0, info=<optimized out>, ptr=0x0, check=<optimized out>) at /usr/src/lib/libc/stdlib/malloc.c:1050
    #5  0x00001e597f4449ec in ofree (argpool=<optimized out>, p=0x1e598f5e9d80, clear=<optimized out>, check=<optimized out>, argsz=0) at /usr/src/lib/libc/stdlib/malloc.c:1376
    #6  0x00001e597f444628 in free (ptr=0x1e598f5e9d80) at /usr/src/lib/libc/stdlib/malloc.c:1433
    #7  0x00001e593c9e7a71 in xmlCopyError () from /usr/local/lib/libxml2.so.16.1
    #8  0x00001e593c9e76e6 in __xmlRaiseError () from /usr/local/lib/libxml2.so.16.1
    #9  0x00001e593ca03123 in xmlFatalErrMsgStrIntStr () from /usr/local/lib/libxml2.so.16.1
    #10 0x00001e593ca03628 in xmlParseEndTag2 () from /usr/local/lib/libxml2.so.16.1
    #11 0x00001e593ca00f48 in xmlParseElement () from /usr/local/lib/libxml2.so.16.1
    #12 0x00001e593ca04c4c in xmlParseDocument () from /usr/local/lib/libxml2.so.16.1
    #13 0x00001e593ca0c3b7 in xmlDoRead () from /usr/local/lib/libxml2.so.16.1
    #14 0x00001e5929ff77c2 in __pyx_f_4lxml_5etree_11_BaseParser__parseUnicodeDoc (__pyx_v_self=0x1e5955932cc0, __pyx_v_utext=0x1e58d710a800, __pyx_v_c_filename=0x0) at src/lxml/lxml.etree.c:110840
    #15 0x00001e592a07b5a4 in __pyx_f_4lxml_5etree__parseDoc (__pyx_v_text=0x1e58d710a800, __pyx_v_filename=0x1e58baa4e198 <_Py_NoneStruct>, __pyx_v_parser=0x1e5955932cc0) at src/lxml/lxml.etree.c:116892
    #16 0x00001e592a07916d in __pyx_f_4lxml_5etree__parseMemoryDocument (__pyx_v_text=0x1e58d710a800, __pyx_v_url=0x1e58baa4e198 <_Py_NoneStruct>, __pyx_v_parser=0x1e5955932cc0) at src/lxml/lxml.etree.c:118334
    #17 0x00001e592a1f2f68 in __pyx_pf_4lxml_5etree_20XML (__pyx_self=0x1e58f8eff9a0, __pyx_v_text=0x1e58d710a800, __pyx_v_parser=0x1e5955932cc0, __pyx_v_base_url=0x1e58baa4e198 <_Py_NoneStruct>) at src/lxml/lxml.etree.c:78756
    #18 0x00001e592a1f2c29 in __pyx_pw_4lxml_5etree_21XML (__pyx_self=0x1e58f8eff9a0, __pyx_args=0x1e58fedead48, __pyx_kwds=0x0) at src/lxml/lxml.etree.c:78642
    #19 0x00001e592a214e96 in __Pyx_CyFunction_CallMethod (func=0x1e58f8eff9a0, self=0x1e58f8eff9a0, arg=0x1e58fedead48, kw=0x0) at src/lxml/lxml.etree.c:231798
    #20 0x00001e592a21518a in __Pyx_CyFunction_Call (func=0x1e58f8eff9a0, arg=0x1e58fedead48, kw=0x0) at src/lxml/lxml.etree.c:231837
    #21 0x00001e592a2140ee in __Pyx_CyFunction_CallAsMethod (func=0x1e58f8eff9a0, args=0x1e58fedead48, kw=0x0) at src/lxml/lxml.etree.c:231858
    #22 0x00001e58ba8a8764 in _PyObject_FastCallDict () from /usr/local/lib/libpython3.6m.so.0.0
    #23 0x00001e58ba98c799 in call_function () from /usr/local/lib/libpython3.6m.so.0.0
    #24 0x00001e58ba986e82 in _PyEval_EvalFrameDefault () from /usr/local/lib/libpython3.6m.so.0.0
    #25 0x00001e58ba98d15d in _PyEval_EvalCodeWithName () from /usr/local/lib/libpython3.6m.so.0.0
    #26 0x00001e58ba984822 in PyEval_EvalCodeEx () from /usr/local/lib/libpython3.6m.so.0.0
    #27 0x00001e58ba8d92f2 in function_call () from /usr/local/lib/libpython3.6m.so.0.0
    #28 0x00001e58ba8a84c6 in PyObject_Call () from /usr/local/lib/libpython3.6m.so.0.0
    #29 0x00001e58ba987e52 in _PyEval_EvalFrameDefault () from /usr/local/lib/libpython3.6m.so.0.0
    #30 0x00001e58ba98dd16 in fast_function () from /usr/local/lib/libpython3.6m.so.0.0
    #31 0x00001e58ba98c7a0 in call_function () from /usr/local/lib/libpython3.6m.so.0.0
    #32 0x00001e58ba986e82 in _PyEval_EvalFrameDefault () from /usr/local/lib/libpython3.6m.so.0.0
    #33 0x00001e58ba98dd16 in fast_function () from /usr/local/lib/libpython3.6m.so.0.0
    #34 0x00001e58ba98c7a0 in call_function () from /usr/local/lib/libpython3.6m.so.0.0
    #35 0x00001e58ba986e82 in _PyEval_EvalFrameDefault () from /usr/local/lib/libpython3.6m.so.0.0
    #36 0x00001e58ba98e17a in _PyFunction_FastCallDict () from /usr/local/lib/libpython3.6m.so.0.0
    #37 0x00001e58ba8a86d5 in _PyObject_FastCallDict () from /usr/local/lib/libpython3.6m.so.0.0
    #38 0x00001e58ba8a88a6 in _PyObject_Call_Prepend () from /usr/local/lib/libpython3.6m.so.0.0
    #39 0x00001e58ba8a84c6 in PyObject_Call () from /usr/local/lib/libpython3.6m.so.0.0
    #40 0x00001e58ba9de5c0 in t_bootstrap () from /usr/local/lib/libpython3.6m.so.0.0
    #41 0x00001e58ba9d668d in pythread_wrapper () from /usr/local/lib/libpython3.6m.so.0.0
    #42 0x00001e58c56bc96e in _rthread_start (v=0x0) at /usr/src/lib/librthread/rthread.c:96
    #43 0x00001e597f4d48eb in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
    #44 0x0000000000000000 in ?? ()
    (gdb)

Any advice on how to determine whether the bug is in lxml or libxml2 would be welcome...

Thanks.
--
Sebastien Marie

On Sat, Jan 12, 2019 at 03:02:46PM +0100, Sebastien Marie wrote:
I found that on OpenBSD, libxml2 is built with the --without-threads option, and that seems to be the problem (rebuilding it without the option makes test_thread_error_log pass reliably).

I dunno if it is something that lxml could detect or not, but the underlying double-free problem isn't directly related to lxml.

Sorry for the noise.
--
Sebastien Marie
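
On the question of detection: libxml2 exposes xmlHasFeature(), and XML_WITH_THREAD (value 1 in the xmlFeature enum) reports whether the library was built with thread support. The following is only a minimal sketch, assuming the libxml2 shared library found by the system loader is the same one lxml is linked against (which may not hold, for instance when lxml bundles its own copy). The helper name libxml2_has_threads is hypothetical and not part of lxml.

```python
import ctypes
import ctypes.util

# Value of XML_WITH_THREAD in libxml2's xmlFeature enum (libxml/parser.h).
XML_WITH_THREAD = 1


def libxml2_has_threads(lib_path=None):
    """Hypothetical helper: True/False for thread support in libxml2,
    or None if the shared library cannot be located or loaded."""
    # Assumption: the library found here is the one lxml actually uses.
    path = lib_path or ctypes.util.find_library("xml2")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        has_feature = lib.xmlHasFeature
    except (OSError, AttributeError):
        return None
    has_feature.argtypes = [ctypes.c_int]
    has_feature.restype = ctypes.c_int
    # xmlHasFeature() returns non-zero when the feature is compiled in.
    return has_feature(XML_WITH_THREAD) != 0


if __name__ == "__main__":
    print("libxml2 thread support:", libxml2_has_threads())
```

If this reports False, a build made with --without-threads is the likely explanation, and multi-threaded parsing would be expected to be unsafe regardless of lxml.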
