[lxml-dev] improved valgrind suppressions for Python?

Hi there,

Since upgrading to a newer version of valgrind quite a while ago, valgrind has become spammy, even with the Python suppressions in place. I suspect this is because valgrind is getting pickier. I'm getting stuff like this:

==10199== Conditional jump or move depends on uninitialised value(s)
==10199==    at 0x1B8F4FD1: (within /lib/ld-2.3.5.so)
==10199==    by 0x1B8EA4AA: (within /lib/ld-2.3.5.so)
==10199==    by 0x4111F18F: (within /lib/tls/libc-2.3.5.so)
==10199==    by 0x1B8EF026: (within /lib/ld-2.3.5.so)
==10199==    by 0x4111FB85: _dl_open (in /lib/tls/libc-2.3.5.so)
==10199==    by 0x4118FD32: (within /lib/tls/libdl-2.3.5.so)
==10199==    by 0x1B8EF026: (within /lib/ld-2.3.5.so)
==10199==    by 0x41190486: (within /lib/tls/libdl-2.3.5.so)
==10199==    by 0x4118FDB0: dlopen (in /lib/tls/libdl-2.3.5.so)
==10199==    by 0x80DCE40: _PyImport_GetDynLoadFunc (in /usr/bin/python2.4)
==10199==    by 0x80D2622: _PyImport_LoadDynamicModule (in /usr/bin/python2.4)
==10199==    by 0x80D223F: (within /usr/bin/python2.4)

and generally ld-related output, presumably something to do with dynamic linking. This makes it harder to pick out potentially real memory issues.

I'm wondering whether others get this same behavior with valgrind and have fixed it, or whether everybody is stuck with it. I looked for newer suppressions in the Python repository at some point, but couldn't find any then.

My valgrind version is 3.0.1.

Regards,
Martijn
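One way to cut down this kind of noise without touching the suppression file is to post-process the valgrind log. The following is only a minimal sketch, not something used in this thread: the helper names (`split_reports`, `is_loader_noise`, `interesting_reports`) are hypothetical, and the rule "a report is loader noise when its innermost frame lies in ld-*.so or libdl-*.so" is an assumption that may hide real problems on other systems.

```python
import re

# Valgrind prefixes each of its lines with "==PID== "; a line that is empty
# after the prefix separates one report from the next.
PREFIX = re.compile(r"^==\d+== ?")
# Frame lines look like "at 0x...: name (in /path)" or "by 0x...: (within /path)".
FRAME = re.compile(r"^\s*(?:at|by) 0x[0-9A-Fa-f]+: (.*)$")
# Assumption: frames located in these objects count as dynamic-loader noise.
NOISE = re.compile(r"/(?:ld|libdl)-[\d.]+\.so")


def split_reports(log_text):
    """Split raw valgrind output into reports (lists of prefix-stripped lines)."""
    reports, current = [], []
    for line in log_text.splitlines():
        if not PREFIX.match(line):
            continue  # output of the program itself, not of valgrind
        body = PREFIX.sub("", line, count=1)
        if body.strip():
            current.append(body)
        elif current:
            reports.append(current)
            current = []
    if current:
        reports.append(current)
    return reports


def is_loader_noise(report):
    """True if the innermost stack frame of the report lies in ld or libdl."""
    frames = [m.group(1) for m in map(FRAME.match, report) if m]
    return bool(frames) and NOISE.search(frames[0]) is not None


def interesting_reports(log_text):
    """Return all reports that are not classified as loader noise."""
    return [r for r in split_reports(log_text) if not is_loader_noise(r)]
```

Feeding the log text of a run to `interesting_reports` would drop a report like the one quoted above (its innermost frame is in /lib/ld-2.3.5.so) while keeping reports whose innermost frame is elsewhere.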

Hey, Also note that running valgrind over the lxml testsuite now reports quite a few problems -- invalid reads of content previously freed by libxml2, for instance. Regards, Martijn

Hi Martijn, Martijn Faassen wrote:
I hope you're referring to the test_attribute_xmlns_move test. That's the xml:id bug that Noah found. I'm working on that, but it's rather tricky. See here: http://bugzilla.gnome.org/show_bug.cgi?id=343302 Stefan

Stefan Behnel wrote:
I indeed get problems with test_attribute_xmlns_move, but also with test_module_HTML_unicode. I also appear to get a whole bunch of problems near the end of the test run (possibly when it's running doctests? not sure..). Note that test_attribute_xmlns_move currently also fails when I run the tests. Are you checking with valgrind, by the way? What do you do with the suppressions? Or is valgrind not an option for you and is this something I should do regularly for you? Regards, Martijn

Martijn Faassen wrote:
I see you did a fix; the last one is now gone.
I think I have been looking wrong and mistook the error summary for errors at the end. There are some reports at the end but they look to have something to do with the Python interpreter again, not lxml in particular:

==31495== Invalid read of size 4
==31495==    at 0x41129379: (within /lib/tls/libc-2.3.5.so)
==31495==    by 0x410E9121: (within /lib/tls/libc-2.3.5.so)
==31495==    by 0x410E9211: tdestroy (in /lib/tls/libc-2.3.5.so)
==31495==    by 0x4112971B: (within /lib/tls/libc-2.3.5.so)
==31495==    by 0x41129D51: __libc_freeres (in /lib/tls/libc-2.3.5.so)
==31495==    by 0x1B8FC68A: _vgw_freeres (vg_preloaded.c:62)
==31495==    by 0x41045785: exit (in /lib/tls/libc-2.3.5.so)
==31495==    by 0x80D8468: (within /usr/bin/python2.4)
==31495==    by 0x80D8645: PyErr_PrintEx (in /usr/bin/python2.4)
==31495==    by 0x80D9143: PyRun_SimpleFileExFlags (in /usr/bin/python2.4)
==31495==    by 0x8055A05: Py_Main (in /usr/bin/python2.4)
==31495==    by 0x4102DEBF: __libc_start_main (in /lib/tls/libc-2.3.5.so)
==31495==  Address 0xC is not stack'd, malloc'd or (recently) free'd
==31495==
==31495== Process terminating with default action of signal 11 (SIGSEGV)
==31495==  Access not within mapped region at address 0xC
==31495==    at 0x41129379: (within /lib/tls/libc-2.3.5.so)
==31495==    by 0x410E9121: (within /lib/tls/libc-2.3.5.so)
==31495==    by 0x410E9211: tdestroy (in /lib/tls/libc-2.3.5.so)
==31495==    by 0x4112971B: (within /lib/tls/libc-2.3.5.so)
==31495==    by 0x41129D51: __libc_freeres (in /lib/tls/libc-2.3.5.so)
==31495==    by 0x1B8FC68A: _vgw_freeres (vg_preloaded.c:62)
==31495==    by 0x41045785: exit (in /lib/tls/libc-2.3.5.so)
==31495==    by 0x80D8468: (within /usr/bin/python2.4)
==31495==    by 0x80D8645: PyErr_PrintEx (in /usr/bin/python2.4)
==31495==    by 0x80D9143: PyRun_SimpleFileExFlags (in /usr/bin/python2.4)
==31495==    by 0x8055A05: Py_Main (in /usr/bin/python2.4)
==31495==    by 0x4102DEBF: __libc_start_main (in /lib/tls/libc-2.3.5.so)

Regards,
Martijn

Hi Martijn, Martijn Faassen wrote:
I know, we had two bug reports for the problem that is now covered by that test.
but also with test_module_HTML_unicode.
Don't you ever do updates? That was fixed at least half an hour ago! :)
Don't know about those. If you want to check, feel free.
I do from time to time, not regularly. The fix above slipped through since the last valgrind run. It was actually for fixing a few memory leaks, but I added one call too many. I'm swamped with work currently, so if you want to run some tests and check them, go ahead. Stefan

Hi Martijn, Martijn Faassen wrote:
Are you checking with valgrind, by the way? How are the suppressions working for you?
There are a lot of "uninitialised values" and "conditional jumps" before we get to etree.initetree. So I happily ignore those. When the test cases run (I run test.py -vv), I get a few more, but most of them do not make me too suspicious, as they seem to be triggered by Python code (might still be GC issues, though). Note that ElementTree actually triggers most of those. There are a few things left I'll look at today. Stefan
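The habit of ignoring everything reported before the module's init function could be approximated in a script. This is a rough sketch under stated assumptions, not Stefan's actual workflow: it assumes the valgrind log has already been split into reports (each a list of text lines), that at least one report has a stack frame naming `initetree`, and the function name `reports_from_marker` is hypothetical.

```python
from itertools import dropwhile


def reports_from_marker(reports, marker="initetree"):
    """Drop leading reports until the first one that mentions `marker`.

    `reports` is a list of reports, each a list of text lines.
    If no report mentions the marker, the result is an empty list.
    """
    def before_marker(report):
        # True while the marker has not yet shown up in this report.
        return not any(marker in line for line in report)

    return list(dropwhile(before_marker, reports))
```

Since the result is empty when the marker never appears, it is worth checking for that case rather than concluding the run was clean.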
