[lxml-dev] thread-related crash when using xslt

Hi there,
Attached is a small tarball that demonstrates code that crashes when the code is run in a thread but doesn't crash when it is run stand-alone. I isolated the specific XSLT + XML combination that seems to trigger this crash. I suspect it has to do with passing an XSLT object to a thread.
I run this with lxml 2.1.5 in Python 2.4, libxml2 2.6.32 and libxslt 1.1.24.
By the way, the FAQ implies that passing an XSLT object into a thread will slow things down (probably as the XSLT would be re-interpreted). Is that still true in the current codebase? I had the impression from previous discussions that this would change.
Regards,
Martijn

Hi Martijn,
Martijn Faassen wrote:
Attached is a small tarball that demonstrates code that crashes when the code is run in a thread but doesn't crash when it is run stand-alone. I isolated the specific XSLT + XML combination that seems to trigger this crash. I suspect it has to do with passing an XSLT object to a thread.
I've seen enough of these all over the place to consider this possible. ;) I'll look into this as soon as I get to it. I was about to release another beta anyway - the latest changelog has gotten longer than I expected, and I really love being able to say that lxml is now fully Py3 compatible. So I'll see if I can get this to work before putting out a 2.2beta4. The still-future Cython 0.11 has also matured a lot by now, so it's worth another release.
I run this with lxml 2.1.5 in Python 2.4, libxml2 2.6.32 and libxslt 1.1.24
Just in case, if the crash is related to transformation errors, you might want to try with 2.2beta3, or even with the trunk, if you also install the latest trunk Cython (sorry for that).
By the way, the FAQ implies that passing an XSLT object into a thread will slow things down (probably as the XSLT would be re-interpreted). Is that still true in the current codebase? I had the impression from previous discussions that this would change.
Yes, the (ugly) code section that this statement was referring to was killed somewhere in 2.1.x. I removed the paragraph from the FAQ and also clarified a couple of other things while at it.
lxml now even has a working test case for passing trees along a thread pipeline, so the safety of threading really has improved a lot lately. It's impressively hard to get these things right.
Threads are just plain evil. Their only excuse in lxml is that XML handling is often I/O expensive and can involve major time consuming operations inside libxml2 and libxslt (XSLT is really a great candidate for that). So freeing the GIL when we know we are about to do most of our work outside of the Python interpreter gets you pretty far.
Stefan

Hi, one more note on this:
Stefan Behnel wrote:
Martijn Faassen wrote:
By the way, the FAQ implies that passing an XSLT object into a thread will slow things down (probably as the XSLT would be re-interpreted). Is that still true in the current codebase? I had the impression from previous discussions that this would change.
Yes, the (ugly) code section that this statement was referring to was killed somewhere in 2.1.x. I removed the paragraph from the FAQ and also clarified a couple of other things while at it.
I should mention that there is /still/ some overhead involved when you mix documents from different threads here (as everywhere in lxml), including the stylesheet itself. However, as this also runs with the GIL released, your gain on multi-processor machines will still be higher than the overhead. YMMV, as usual, so profiling is always a good idea. :)
Stefan

Hi,
Martijn Faassen wrote:
Attached is a small tarball that demonstrates code that crashes when the code is run in a thread but doesn't crash when it is run stand-alone. I isolated the specific XSLT + XML combination that seems to trigger this crash. I suspect it has to do with passing an XSLT object to a thread.
Ok, this is plain evil. What you do here is this:
    ...
    <tr class="odd">
      <xsl:attribute name="class">top-row</xsl:attribute>
    ...
Note how the attribute value is changed after being set. In libxslt, this leads to a result tree update that removes the old attribute and replaces it by the new one.
In your case, the stylesheet that was parsed outside the thread inherits the name dict from the main thread, while the input document inherits the one from the worker thread that executes this function:
    def render(id, xml, stylesheet):
        doc = etree.parse(StringIO(xml))
        result_tree = stylesheet(doc)
So the first "class" attribute name comes from the stylesheet dict and gets stored in the result document that inherits the thread dict of the input document. When it is overwritten and deleted, it is looked up in the thread dict, is not found there, and thus free()-ed, although it continues to 'live' in the stylesheet dict.
This must really be the only place in XSLT where the result document is not only created incrementally but where its existing content gets overwritten.
For now, I really do not know how to work around this. There can only be one dict for the result document, but the original attribute can come from the stylesheet or the input document (or even the current thread dict where the XSLT is executed), and the dict lookup happens from deep inside libxslt. I'm very open to ideas.
Stefan
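The ownership mix-up described in this message can be modelled in a few lines of plain Python. This is only a toy sketch: `Name`, `NameDict` and `release_name` are hypothetical stand-ins invented for illustration, not libxml2 or lxml APIs, and a `freed` flag takes the place of a real free() call.

```python
class Name:
    """Toy stand-in for an interned attribute or tag name string."""

    def __init__(self, text):
        self.text = text
        self.freed = False  # set to True where C code would call free()


class NameDict:
    """Toy stand-in for a per-thread name dictionary (hypothetical)."""

    def __init__(self):
        self._names = {}

    def intern(self, text):
        # Return the one shared Name object for this text, creating it once.
        return self._names.setdefault(text, Name(text))

    def owns(self, name):
        # Ownership is by identity: is this exact object stored here?
        return self._names.get(name.text) is name


def release_name(name, document_dict):
    """Free a name unless the document's own dict claims it."""
    if not document_dict.owns(name):
        name.freed = True


stylesheet_dict = NameDict()  # created in the main thread
worker_dict = NameDict()      # created in the worker thread

# <tr class="odd"> in the stylesheet interns "class" in the stylesheet dict.
class_name = stylesheet_dict.intern("class")

# The result document uses the worker dict, so when xsl:attribute replaces
# the attribute, the old name is checked against the wrong dict.
release_name(class_name, worker_dict)

# The name was freed although the stylesheet dict still refers to it.
dangling = class_name.freed and stylesheet_dict.owns(class_name)
print(dangling)  # prints True
```

In the model, the name ends up both freed and still referenced by the stylesheet dict, which corresponds to the dangling pointer behind the crash.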

Stefan Behnel wrote:
Martijn Faassen wrote:
Attached is a small tarball that demonstrates code that crashes when the code is run in a thread but doesn't crash when it is run stand-alone. I isolated the specific XSLT + XML combination that seems to trigger this crash. I suspect it has to do with passing an XSLT object to a thread.
Ok, this is plain evil. What you do here is this:
    ...
    <tr class="odd">
      <xsl:attribute name="class">top-row</xsl:attribute>
    ...
Note how the attribute value is changed after being set. In libxslt, this leads to a result tree update that removes the old attribute and replaces it by the new one.
Here is a minimal fix for the problem. There may be special cases where this might not work (my guess would be custom XSLT elements), but at least it works safely in this case.
Stefan
=== src/lxml/xslt.pxi
==================================================================
--- src/lxml/xslt.pxi (revision 5056)
+++ src/lxml/xslt.pxi (local)
@@ -486,7 +486,15 @@
             _destroyFakeDoc(input_doc._c_doc, c_doc)
             python.PyErr_NoMemory()
 
-        initTransformDict(transform_ctxt)
+        # using the stylesheet dict is safer than using a possibly
+        # unrelated dict from the current thread. Almost all
+        # non-input tag/attr names will come from the stylesheet
+        # anyway.
+        if transform_ctxt.dict is not NULL:
+            xmlparser.xmlDictFree(transform_ctxt.dict)
+        transform_ctxt.dict = self._c_style.doc.dict
+        xmlparser.xmlDictReference(transform_ctxt.dict)
+
 
         xslt.xsltSetCtxtParseOptions(
             transform_ctxt, input_doc._parser._parse_options)
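The patch replaces the context's dict with the stylesheet's and takes an extra reference on it. A rough Python model of why that reference matters follows. `RefCountedDict`, `TransformContext` and `use_stylesheet_dict` are hypothetical names, and the counting rules are a simplified assumption about how a reference-counted dict behaves, not a copy of libxml2's code.

```python
class RefCountedDict:
    """Toy reference-counted name dict (hypothetical, for illustration)."""

    def __init__(self, label):
        self.label = label
        self.refcount = 1      # the creator holds the first reference
        self.destroyed = False

    def reference(self):
        # Another holder now shares this dict.
        self.refcount += 1

    def free(self):
        # Drop one reference; destroy the dict only when none remain.
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True


class TransformContext:
    """Toy transform context that starts out with its own dict."""

    def __init__(self):
        self.dict = RefCountedDict("context")


def use_stylesheet_dict(ctxt, stylesheet_dict):
    # Mirror the three steps of the patch: release the old dict,
    # point at the stylesheet's dict, then take a reference on it.
    if ctxt.dict is not None:
        ctxt.dict.free()
    ctxt.dict = stylesheet_dict
    ctxt.dict.reference()


stylesheet_dict = RefCountedDict("stylesheet")
ctxt = TransformContext()
old_dict = ctxt.dict

use_stylesheet_dict(ctxt, stylesheet_dict)

# Tearing down the context later drops only the context's reference,
# so the stylesheet keeps a usable dict.
ctxt.dict.free()
print(old_dict.destroyed, stylesheet_dict.refcount, stylesheet_dict.destroyed)
```

Without the extra `reference()` call, the teardown in this model would drop the count to zero and destroy the dict while the stylesheet still depends on it.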

Hey Stefan,
On Fri, Feb 27, 2009 at 12:41 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Martijn Faassen wrote:
Attached is a small tarball that demonstrates code that crashes when the code is run in a thread but doesn't crash when it is run stand-alone. I isolated the specific XSLT + XML combination that seems to trigger this crash. I suspect it has to do with passing an XSLT object to a thread.
Ok, this is plain evil. What you do here is this:
    ...
    <tr class="odd">
      <xsl:attribute name="class">top-row</xsl:attribute>
    ...
I didn't do it, or if I did do it, it was years ago and I don't remember! :)
[snip]
So the first "class" attribute name comes from the stylesheet dict and gets stored in the result document that inherits the thread dict of the input document. When it is overwritten and deleted, it is looked up in the thread dict, is not found there, and thus free()-ed, although it continues to 'live' in the stylesheet dict.
Ugh!
FYI I've worked around the problem in the original application (Silva) by having a thread-local XSLT stylesheet for each thread now. This seems to resolve the actual crash in the application and has a minimal performance impact as far as I can see. Given Silva's history with thread-related issues with XSLT, such a general workaround might be the best way forward, though it does mean you'll see fewer thread-related bug reports coming from that direction. :)
I see however that you thought up a fix in the reply, which is good news for people coming after me. :)
Regards,
Martijn
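A thread-local stylesheet of the kind described in this message could look roughly like the sketch below. It is not Silva's actual code: `thread_local_stylesheet`, `worker` and the `build` factory argument are hypothetical names, and the demo uses `object` as a stand-in factory so that it runs with the standard library alone.

```python
import threading

_local = threading.local()


def thread_local_stylesheet(build):
    """Return the calling thread's stylesheet, building it on first use."""
    stylesheet = getattr(_local, "stylesheet", None)
    if stylesheet is None:
        stylesheet = _local.stylesheet = build()
    return stylesheet


def worker(build, seen, index):
    first = thread_local_stylesheet(build)
    second = thread_local_stylesheet(build)  # reused, not rebuilt
    seen[index] = (first, first is second)


# `object` stands in for a real stylesheet factory in this demo.
seen = [None] * 3
threads = [threading.Thread(target=worker, args=(object, seen, i))
           for i in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

distinct = len({id(stylesheet) for stylesheet, _ in seen})
print(distinct, all(reused for _, reused in seen))  # prints: 3 True
```

With lxml, `build` could be something like `lambda: etree.XSLT(etree.parse(path))`, where `path` is a hypothetical stylesheet location, so that each stylesheet is parsed in the same thread that later runs it.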
Participants (2): Martijn Faassen, Stefan Behnel