A possible problem with processing of XSLT
Good evening.
I am manipulating XSLT files with LXML.
Much of it works well.
Yet it appears that there is an issue with LXML, libxml2, or libxslt under rare circumstances, which I am not sure I am eloquent enough to explain.
I would appreciate help in analyzing this possible issue, if it is indeed a genuine issue.
I have detailed the issue and linked to the relevant code.
https://git.xmpp-it.net/sch/Rivista/issues/6
Please advise.
P.S. I will be available on Saturday or Sunday evening.
Kind regards,
Schimon
Good evening.
It appears that this issue is easy to reproduce and is probably easy to fix as well.
I would appreciate it if someone could assist in writing up and reporting this issue.
Kind regards,
Schimon

On Fri, 25 Jul 2025 19:13:58 +0300 Schimon Jehudah via lxml - The Python XML Toolkit <lxml@python.org> wrote:
> Good evening.
> I am manipulating XSLT files with LXML.
> Much of it works well.
> Yet it appears that there is an issue with LXML, libxml2, or libxslt under rare circumstances, which I am not sure I am eloquent enough to explain.
> I would appreciate help in analyzing this possible issue, if it is indeed a genuine issue.
> I have detailed the issue and linked to the relevant code.
> https://git.xmpp-it.net/sch/Rivista/issues/6
> Please advise.
> P.S. I will be available on Saturday or Sunday evening.
> Kind regards,
> Schimon
> _______________________________________________
> lxml - The Python XML Toolkit mailing list -- lxml@python.org
> To unsubscribe send an email to lxml-leave@python.org
> https://mail.python.org/mailman3//lists/lxml.python.org
> Member address: sch@fedora.email
Good day.
Previous subject: [lxml] Re: A possible problem with processing of XSLT
I think that I have found the cause of the issue, or at least I now know how to trigger it and how to define it.

Issue
-----
Elements without text content would overlap.

Samples to experiment with
--------------------------
<nav id="xslt-navigation-posts">
  <span id="xslt-navigation-previous"/>
  <span id="xslt-navigation-proceed"/>
</nav>

<nav id="xslt-navigation-posts">
  <span id="xslt-navigation-previous"></span>
  <span id="xslt-navigation-proceed"></span>
</nav>

Result after processing
-----------------------
<nav id="xslt-navigation-posts">
  <span id="xslt-navigation-previous">
    <span id="xslt-navigation-proceed">
</span></span></nav>

Severity
--------
The severity is high, because this would affect the embedding of CSS stylesheets, ECMAScript scripts and other elements that do not require, or should not have, textual content.
Examples:
<script src="/scripts/navigation.js" type="text/javascript"></script>
<script src="/scripts/navigation.js" type="text/javascript" />

Note
----
This issue does not occur with the client-side XSLT processors of internet browsers such as Falkon and Otter Browser.

Kind regards,
Schimon

On Thu, 21 Aug 2025 20:57:28 +0300 Schimon Jehudah via lxml - The Python XML Toolkit <lxml@python.org> wrote:
> Good evening.
> It appears that this issue is easy to reproduce and is probably easy to fix as well.
> I would appreciate it if someone could assist in writing up and reporting this issue.
> Kind regards,
> Schimon
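One way to check how an XML toolchain sees the two `nav` samples above is to parse both and serialise them again. The sketch below uses lxml's `etree` module (imported as `ET`, which is an assumption about naming) with default parser and serialiser settings; the helper `roundtrip_as_xml` is hypothetical. To an XML parser, `<span/>` and `<span></span>` are the same empty element, so both samples come back in the self-closing form.

```python
from lxml import etree as ET

# The two samples from the message, without the inter-element whitespace.
SELF_CLOSED = (
    '<nav id="xslt-navigation-posts">'
    '<span id="xslt-navigation-previous"/>'
    '<span id="xslt-navigation-proceed"/>'
    '</nav>'
)
EXPLICIT_END_TAGS = (
    '<nav id="xslt-navigation-posts">'
    '<span id="xslt-navigation-previous"></span>'
    '<span id="xslt-navigation-proceed"></span>'
    '</nav>'
)


def roundtrip_as_xml(markup):
    """Parse markup as XML and serialise it with the default XML method."""
    return ET.tostring(ET.fromstring(markup)).decode("utf-8")


first = roundtrip_as_xml(SELF_CLOSED)
second = roundtrip_as_xml(EXPLICIT_END_TAGS)

print(first == second)  # both inputs describe the same tree
print(second)
```

An HTML parser in a browser does not treat `<span/>` as a closed element, which would be consistent with the nesting shown under "Result after processing"; that reading is an interpretation, not something established at this point in the thread.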
Perhaps the correct description should be:
Elements without textual content collapse their next element* into them.
This would cause element "body" to collapse into element "link" or "script" of element "head", which means that the resulting document would not be usable.
* The word "element" is intentionally singular, as in "an element".
Schimon
Good day.
I intend to file a report at https://bugs.launchpad.net/lxml
I would appreciate any help in determining whether this issue has already been reported.
Kind regards,
Schimon

On Sun, 24 Aug 2025 13:13:55 +0300 Schimon Jehudah via lxml - The Python XML Toolkit <lxml@python.org> wrote:
> Perhaps the correct description should be:
> Elements without textual content collapse their next element* into them.
> This would cause element "body" to collapse into element "link" or "script" of element "head", which means that the resulting document would not be usable.
> * The word "element" is intentionally singular, as in "an element".
> Schimon
Hi, sorry for the late response.

Schimon Jehudah via lxml - The Python XML Toolkit wrote on 24.08.25 at 12:06:
> I think that I have found the cause of the issue, or at least I now know how to trigger it and how to define it.
>
> Issue
> -----
> Elements without text content would overlap.
>
> Samples to experiment with
> --------------------------
> <nav id="xslt-navigation-posts">
>   <span id="xslt-navigation-previous"/>
>   <span id="xslt-navigation-proceed"/>
> </nav>
>
> <nav id="xslt-navigation-posts">
>   <span id="xslt-navigation-previous"></span>
>   <span id="xslt-navigation-proceed"></span>
> </nav>
>
> Result after processing
> -----------------------
> <nav id="xslt-navigation-posts">
>   <span id="xslt-navigation-previous">
>     <span id="xslt-navigation-proceed">
> </span></span></nav>

This suggests that it might be the parser making this change. Could you show us how you parse and process the data? A short code snippet would help. Is this parsed as HTML? With which options? How do you run the XSLT? And, most importantly, which versions of lxml, libxml2 and libxslt are you using? Does this occur with a binary wheel installed from PyPI or did you build lxml locally?

> I have detailed and linked to the relevant code.

That's a problem description but doesn't show me the code that triggers the problem on your side.

Stefan
Stefan. Good day.

On Mon, 25 Aug 2025 14:09:55 +0200 Stefan Behnel via lxml - The Python XML Toolkit <lxml@python.org> wrote:
> Hi, sorry for the late response.

I appreciate your response. Please read the details below.

> Schimon Jehudah via lxml - The Python XML Toolkit wrote on 24.08.25 at 12:06:
>> I think that I have found the cause of the issue, or at least I now know how to trigger it and how to define it.
>>
>> Issue
>> -----
>> Elements without text content would overlap.
>>
>> Samples to experiment with
>> --------------------------
>> <nav id="xslt-navigation-posts">
>>   <span id="xslt-navigation-previous"/>
>>   <span id="xslt-navigation-proceed"/>
>> </nav>
>>
>> <nav id="xslt-navigation-posts">
>>   <span id="xslt-navigation-previous"></span>
>>   <span id="xslt-navigation-proceed"></span>
>> </nav>
>>
>> Result after processing
>> -----------------------
>> <nav id="xslt-navigation-posts">
>>   <span id="xslt-navigation-previous">
>>     <span id="xslt-navigation-proceed">
>> </span></span></nav>

I have attached a new visual realization for this code from an XSLT stylesheet.

<textarea id="message" maxlength="100" name="message" minlength="50"
          placeholder="Please input your message in English" required="" rows="10">
</textarea>

Due to the issue in question, I have added white space with <xsl:text> </xsl:text>.

<textarea id="message" maxlength="100" name="message" minlength="50"
          placeholder="Please input your message in English" required="" rows="10">
  <xsl:text> </xsl:text>
</textarea>

As seen at:
https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/data/themes/vivi...

It is important to mention that the results of these observations are the same with XSL code as with raw XHTML.

<xsl:element name="textarea">
  <xsl:attribute name="maxlength">
    <xsl:text>100</xsl:text>
  </xsl:attribute>
  <xsl:attribute name="name">
    <xsl:text>message</xsl:text>
  </xsl:attribute>
  <!-- et cetera -->
</xsl:element>

> This suggests that it might be the parser making this change. Could you show us how you parse and process the data? A short code snippet would help.

Please refer to the instruction "ParserXslt.transform", of which there are several instances, at:
https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/publish/xml.py
The function is at:
https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/parser/xslt.py

> Is this parsed as HTML? With which options?

Yes. I suppose so.

<xsl:output
    encoding = 'UTF-8'
    indent = 'yes'
    media-type = 'text/xml'
    method = 'html'
    omit-xml-decleration='no'
    version = '4.01' />

> How do you run the XSLT?

Server-side: With Python LXML. See the first answer.
Client-side: With the Falkon internet browser.
Please rephrase your question if my answer does not address the relevant context.

> And, most importantly, which versions of lxml, libxml2 and libxslt are you using? Does this occur with a binary wheel installed from PyPI or did you build lxml locally?

libxml2 2.14.5
libxslt 1.1.43
I suppose that I am using the recent packages from PyPI.
https://git.xmpp-it.net/sch/Rivista/src/branch/main/pyproject.toml

>> I have detailed and linked to the relevant code.
> That's a problem description but doesn't show me the code that triggers the problem on your side.

Yes, indeed.
Did I provide enough information in this email message?

> Stefan

Best,
Schimon
Stefan. Good afternoon.
Interestingly, it appears that there are exceptions: an empty element (i.e. an element without textual content) of type "input" does not disrupt rendering.
So far, I have had the issue in question with the elements "section", "textarea" and "span", and perhaps others.
Kind regards,
Schimon
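A possible explanation for the "input" exception, offered as an assumption rather than a finding from the thread: HTML defines `input` as a void element that never takes an end tag, whereas `textarea`, `span` and `section` require one. The sketch below compares lxml's XML and HTML serialisation of a hypothetical form fragment (the element names and attributes are made up for illustration).

```python
from lxml import etree as ET

# Hypothetical fragment: one void element (input) and two elements
# that need an explicit end tag in HTML (textarea, span).
form = ET.XML(
    '<form>'
    '<input name="q"/>'
    '<textarea name="message"/>'
    '<span id="s"/>'
    '</form>'
)

# XML method: every empty element is written in the self-closing form.
xml_text = ET.tostring(form, method="xml").decode("utf-8")

# HTML method: void elements get no end tag, all others get one.
html_text = ET.tostring(form, method="html").decode("utf-8")

print(xml_text)
print(html_text)
```

With the XML form, a browser reading the page as HTML can treat `<input .../>` as complete, but not `<textarea .../>` or `<span .../>`.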
Hi,

Schimon Jehudah wrote on 27.08.25 at 09:19:
> The function is at:
> https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/parser/xslt.py
>
>> Is this parsed as HTML? With which options?
>
> Yes. I suppose so.
>
> <xsl:output
>     encoding = 'UTF-8'
>     indent = 'yes'
>     media-type = 'text/xml'
>     method = 'html'
>     omit-xml-decleration='no'
>     version = '4.01' />

So, this is your Python code running the transformation:

    def transform(filepath_xml, filepath_xslt):
        tree = ET.parse(filepath_xml)
        xslt_stylesheet = ET.parse(filepath_xslt)
        xslt_transform = ET.XSLT(xslt_stylesheet)
        newdom = xslt_transform(tree)
        xml_data_bytes = ET.tostring(newdom, pretty_print=True)
        xml_data_str = xml_data_bytes.decode("utf-8")
        return xml_data_str

Since you're apparently using "<xsl:output>" to configure the output, "tostring()" is the wrong way of serialising the result, because it does not know about your XSLT output configuration. Instead, use e.g.

    xml_data_bytes = memoryview(newdom)
    xml_data_str = str(xml_data_bytes, 'UTF-8')

or, if you intend to write to a file:

    newdom.write_output("somefile.xml")

You were using XML serialisation instead of HTML serialisation. That certainly makes a difference.

If this doesn't solve your issue, I'd suggest trying to reproduce the misbehaviour with the "xsltproc" program that comes with libxslt and if you can make that show the same behaviour, report it to the libxslt project. It's probably not lxml that's responsible here.

Stefan
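The difference between the two serialisation paths can be seen in a small, self-contained run. This is a minimal sketch, assuming an inline stylesheet and a dummy `<feed/>` input document invented for illustration; it is not the Rivista code. `ET.tostring()` applies the XML method, while reading the result tree through `memoryview()` follows the stylesheet's `<xsl:output method="html">`.

```python
from lxml import etree as ET

# Hypothetical stylesheet: emits the nav sample and requests HTML output.
XSLT_SOURCE = b"""\
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>
  <xsl:template match="/">
    <nav id="xslt-navigation-posts"><span id="xslt-navigation-previous"/><span id="xslt-navigation-proceed"/></nav>
  </xsl:template>
</xsl:stylesheet>
"""

xslt_transform = ET.XSLT(ET.XML(XSLT_SOURCE))
newdom = xslt_transform(ET.XML(b"<feed/>"))

# XML serialisation: ignores <xsl:output>, empty spans become <span .../>.
as_xml = ET.tostring(newdom).decode("utf-8")

# Serialisation driven by <xsl:output>: empty spans keep their end tags.
as_html = str(memoryview(newdom), "UTF-8")

print(as_xml)
print(as_html)
```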
Stefan. Good afternoon.

On Tue, 9 Sep 2025 08:50:41 +0200 Stefan Behnel via lxml - The Python XML Toolkit <lxml@python.org> wrote:
> Since you're apparently using "<xsl:output>" to configure the output, "tostring()" is the wrong way of serialising the result, because it does not know about your XSLT output configuration. Instead, use e.g.
>
>     xml_data_bytes = memoryview(newdom)
>     xml_data_str = str(xml_data_bytes, 'UTF-8')

Did you mean to write the following?

    xml_data_bytes = memoryview(newdom).tobytes()
    xml_data_str = str(xml_data_bytes, 'UTF-8')

> or, if you intend to write to a file:
>
>     newdom.write_output("somefile.xml")
>
> You were using XML serialisation instead of HTML serialisation. That certainly makes a difference.
>
> If this doesn't solve your issue, I'd suggest trying to reproduce the misbehaviour with the "xsltproc" program that comes with libxslt and if you can make that show the same behaviour, report it to the libxslt project. It's probably not lxml that's responsible here.
>
> Stefan

I appreciate your response. I will try the string solution first.

Kind regards,
Schimon
Schimon Jehudah wrote on 09.09.25 at 13:16:
> On Tue, 9 Sep 2025 08:50:41 +0200 Stefan Behnel via lxml - The Python XML Toolkit <lxml@python.org> wrote:
>>     xml_data_bytes = memoryview(newdom)
>>     xml_data_str = str(xml_data_bytes, 'UTF-8')
>
> Did you mean to write the following?
>
>     xml_data_bytes = memoryview(newdom).tobytes()
>     xml_data_str = str(xml_data_bytes, 'UTF-8')

No, that would force Python to create two intermediate copies of the bytes data. One is enough.

Stefan
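The point about copies can be illustrated with the standard library alone. In this sketch a `bytearray` stands in for the serialised XSLT result (an assumption made so that the example runs without lxml); `str()` accepts any bytes-like object, so it can decode straight from the `memoryview`, whereas `.tobytes()` first materialises a separate `bytes` object.

```python
# Stand-in for the buffer that a serialised result tree would expose.
serialised = bytearray("<p>caf\u00e9</p>", "utf-8")
view = memoryview(serialised)

# Decode directly from the buffer: no extra bytes object is created.
direct = str(view, "UTF-8")

# Same text, but .tobytes() copies the buffer into a new bytes object first.
extra_copy = view.tobytes()
via_bytes = str(extra_copy, "UTF-8")

print(direct == via_bytes)
```

Both paths yield the same string; the second only adds an allocation.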
On Tue, 9 Sep 2025 15:27:39 +0200 Stefan Behnel via lxml - The Python XML Toolkit <lxml@python.org> wrote:
> Schimon Jehudah wrote on 09.09.25 at 13:16:
>> Did you mean to write the following?
>>
>>     xml_data_bytes = memoryview(newdom).tobytes()
>>     xml_data_str = str(xml_data_bytes, 'UTF-8')
>
> No, that would force Python to create two intermediate copies of the bytes data. One is enough.

Corrected.
https://git.xmpp-it.net/sch/Rivista/commit/b49c6ce24a6cad2d2b5a6ecb14321eb66...

Thank you very much.

Stefan, please kindly provide useful (i.e. promotional) text about the LXML project to include in the README file of the Rivista project. I consider LXML a very important project that is worth promoting.

Schimon
Stefan. Good afternoon.
I have implemented your solution at:
https://git.xmpp-it.net/sch/Rivista/commit/0e2d0cc6e0476c3909db4015e89a93f28...
I have also removed the earlier "xsl:text" solution (i.e. the so-called "workaround") from the XSLT stylesheets.
Now, the appearance of forms is better.
Thank you for your help.
P.S. This is further progress towards the Ace Specification, for distributing and publishing uniform Atom Syndication Format documents over the internet.
Kind regards,
Schimon
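For reference, a `transform()` function adjusted along the lines suggested in the thread might look as follows. This is a sketch under stated assumptions, not the content of the linked commit: the stylesheet, the temporary file names and the `<page/>` input are invented for illustration. It also shows `write_output()` for the file case, with an empty `textarea` and no `xsl:text` workaround.

```python
import os
import tempfile

from lxml import etree as ET


def transform(filepath_xml, filepath_xslt):
    """Run the stylesheet and serialise the result as <xsl:output> requests."""
    tree = ET.parse(filepath_xml)
    xslt_transform = ET.XSLT(ET.parse(filepath_xslt))
    newdom = xslt_transform(tree)
    return str(memoryview(newdom), "UTF-8")


# Hypothetical stylesheet with an empty textarea and HTML output.
STYLESHEET = """\
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>
  <xsl:template match="/">
    <form><textarea id="message" name="message" rows="10"/></form>
  </xsl:template>
</xsl:stylesheet>
"""

with tempfile.TemporaryDirectory() as workdir:
    path_xml = os.path.join(workdir, "page.xml")
    path_xslt = os.path.join(workdir, "page.xslt")
    path_out = os.path.join(workdir, "page.html")
    with open(path_xml, "w", encoding="utf-8") as handle:
        handle.write("<page/>")
    with open(path_xslt, "w", encoding="utf-8") as handle:
        handle.write(STYLESHEET)

    # String result, serialised according to the stylesheet.
    html_text = transform(path_xml, path_xslt)

    # File result: write_output() also honours <xsl:output>.
    result = ET.XSLT(ET.parse(path_xslt))(ET.parse(path_xml))
    result.write_output(path_out)
    with open(path_out, encoding="utf-8") as handle:
        written = handle.read()

print(html_text)
```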
Participants (2): Schimon Jehudah, Stefan Behnel