[lxml-dev] Replace/copy related segfault in lxml

So, I've been making extensive use of lxml 1.0.3, and have come across another crash bug. This one also appears to be related to subtree replacement. This is with libxml2 2.6.26, and I haven't tested with lxml 1.1 beta to see if the bug is present there. There is a simple workaround, which appears to be to avoid using the new replace function.

This is the error the attached test program gives me:

*** glibc detected *** double free or corruption (fasttop): 0x080daec8 ***

However, minor differences in the location and amount of whitespace in the input data change the crash, to errors such as this:

*** glibc detected *** corrupted double-linked list: 0x0813b9f8 ***

--
John Krukoff <jkrukoff@ltgc.com>
Land Title Guarantee Company
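The attached test program is not reproduced in this thread, so the following is only a guess at what "avoiding replace()" could look like. It swaps a child using remove() and insert(), which exist in both lxml.etree and the standard library's xml.etree.ElementTree. The helper name replace_child is made up for illustration, and the stdlib module is used here only so that the sketch runs without lxml installed.

```python
import xml.etree.ElementTree as ET


def replace_child(parent, old, new):
    """Swap `old` for `new` under `parent` without calling replace().

    Hypothetical helper: remembers the position of `old`, removes it,
    then inserts `new` at the same position.
    """
    index = list(parent).index(old)
    parent.remove(old)
    parent.insert(index, new)


root = ET.fromstring("<root><a/>tail-a<b/>tail-b</root>")
old = root[0]

new = ET.Element("c")
new.tail = "tail-c"

replace_child(root, old, new)

# Each element carries its own tail text with it.
tags = [child.tag for child in root]
```

Note that the tail text travels with each element: after the swap, "tail-a" stays attached to the removed element and "tail-c" takes its place in the tree.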

John Krukoff wrote:
Hm, I'm on Ubuntu 6.06, Python 2.4, libxml2 2.6.24, and the lxml-1.0 branch from svn, and so far I cannot reproduce your problem by running your script. Trying the 1.0.3 release now on the same platform, I still cannot reproduce the crash. What platform are you on?

I can find a problem when I run this code under 'valgrind' to detect memory errors: I get exuberant warnings now. Looks like you're on to something. valgrind doesn't report these warnings when the workaround is enabled instead.

I'll try to look into this more deeply later.

Regards,

Martijn

Hi John, John Krukoff wrote:
Thanks for reporting this. It's a bug in the replace() method. The Python document reference (and thus the document itself) can be freed before copying the tail content from it.

Here's a fix against the trunk that should also apply to 1.0.3. Please test it.

Stefan

Index: src/lxml/etree.pyx
===================================================================
--- src/lxml/etree.pyx	(Revision 31246)
+++ src/lxml/etree.pyx	(Arbeitskopie)
@@ -797,9 +797,9 @@
         c_new_node = new_element._c_node
         c_new_next = c_new_node.next
         tree.xmlReplaceNode(c_old_node, c_new_node)
-        moveNodeToDocument(new_element, self._doc)
         _moveTail(c_new_next, c_new_node)
         _moveTail(c_old_next, c_old_node)
+        moveNodeToDocument(new_element, self._doc)
 
     # PROPERTIES
     property tag:
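The ordering problem Stefan describes can be illustrated with a small toy model in plain Python. This is not lxml's code: ToyDocument, ToyElement and every function name below are hypothetical, and the reference counting is simulated by hand. The model also raises an exception where the real C code would silently touch freed memory, which is what produces glibc errors like the ones reported.

```python
class ToyDocument:
    """Hypothetical stand-in for a document that owns its nodes' memory."""

    def __init__(self):
        self.refcount = 0
        self.freed = False

    def acquire(self):
        self.refcount += 1

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # models the document and its nodes being freed


class ToyElement:
    """Hypothetical element proxy that keeps its document alive."""

    def __init__(self, doc, tail):
        self.doc = doc
        doc.acquire()
        self.tail_in_source = tail  # tail text still stored in the source document
        self.tail = None


def move_to_document(element, target_doc):
    """Re-home the element; may drop the last reference to its old document."""
    source_doc = element.doc
    target_doc.acquire()
    element.doc = target_doc
    source_doc.release()


def move_tail(element, source_doc):
    """Copy the tail out of the source document, which must still be alive."""
    if source_doc.freed:
        raise RuntimeError("tail read from a freed document")
    element.tail = element.tail_in_source


def replace_old_order(element, target_doc):
    # Order before the patch: switch documents first, then copy the tail.
    source_doc = element.doc
    move_to_document(element, target_doc)
    move_tail(element, source_doc)


def replace_patched_order(element, target_doc):
    # Order after the patch: copy the tail while the source document is alive.
    source_doc = element.doc
    move_tail(element, source_doc)
    move_to_document(element, target_doc)
```

In this model, replace_old_order fails whenever the element held the last reference to its source document, while replace_patched_order copies the tail first and only then lets the source document go.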

participants (3): John Krukoff, Martijn Faassen, Stefan Behnel