[lxml-dev] Url corruption during XSLT transformation
Hello again!

I discovered a bug that occurs during XSLT transformation. Consider this simple XSLT template:

>>> from lxml import etree
>>> xslt = etree.XSLT(etree.XML('''
... <xsl:stylesheet version = '1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
...   <xsl:output method="html" />
...   <xsl:template match="/">
...     <xsl:copy-of select="." />
...   </xsl:template>
... </xsl:stylesheet>
... '''))

The purpose of this template is just to copy the whole document, serialized as HTML instead of XML. But something strange happens with URLs: /test?a=1&b=2?c=3 becomes /test?a=1. Everything in the URL after the first '&' disappears:

>>> xml = etree.XML('<html><body><a href="/test?x=10&amp;y=20&amp;z=30">sample link</a></body></html>')
>>> html = str(xslt(xml))
>>> print html
<html><body><a href="/test?x=10">sample link</a></body></html>

This is probably not an lxml bug but a libxslt2 bug, but I don't know where to submit a patch.

I worked around the problem by replacing all the 'copy-of' elements with an identity transformation, though that is probably less efficient:

>>> xslt2 = etree.XSLT(etree.XML('''
... <xsl:stylesheet version = '1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
...   <xsl:output method="html" />
...   <xsl:template match="/ | @* | node()">
...     <xsl:copy>
...       <xsl:apply-templates select="@* | node()" />
...     </xsl:copy>
...   </xsl:template>
... </xsl:stylesheet>
... '''))
>>> html2 = str(xslt2(xml))
>>> print html2
<html><body><a href="/test?x=10&amp;y=20&amp;z=30">sample link</a></body></html>

If it is a libxslt2 bug, could you please submit the bug report for me? :) I don't know anything about libxslt2...

--
Best regards,
Alexander
mailto:alexander.kozlovsky@gmail.com
Hi, Alexander Kozlovsky wrote:
I discovered a bug that occurs during XSLT transformation. Consider this simple XSLT template:

>>> from lxml import etree
>>> xslt = etree.XSLT(etree.XML('''
... <xsl:stylesheet version = '1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
...   <xsl:output method="html" />
...   <xsl:template match="/">
...     <xsl:copy-of select="." />
...   </xsl:template>
... </xsl:stylesheet>
... '''))

The purpose of this template is just to copy the whole document, serialized as HTML instead of XML.

But something strange happens with URLs: /test?a=1&b=2?c=3 becomes /test?a=1. Everything in the URL after the first '&' disappears:

>>> xml = etree.XML('<html><body><a href="/test?x=10&amp;y=20&amp;z=30">sample link</a></body></html>')
>>> html = str(xslt(xml))
>>> print html
<html><body><a href="/test?x=10">sample link</a></body></html>
I can't reproduce that; I get the expected output. This is on libxml2 2.6.27 and libxslt 1.1.20. Maybe your libxslt version is older?

Stefan
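For anyone who wants to check whether their own build shows the behaviour reported above, here is a minimal sketch of a self-check. It is written in Python 3 syntax (the thread itself uses Python 2), and the function name `copy_of_keeps_query_string` is made up for illustration. It runs the same `copy-of` stylesheet, parses the serialized HTML again, and compares the `href` value, so the result does not depend on how '&' is escaped in the output. If lxml is not installed, it returns None instead of failing.

```python
try:
    from lxml import etree
except ImportError:  # lxml is not installed; the check is skipped
    etree = None

# Same stylesheet as in the report: copy the document, serialize as HTML.
STYLESHEET = b"""\
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
"""

# '&' must be written as '&amp;' for the input to be well-formed XML.
DOCUMENT = (b'<html><body><a href="/test?x=10&amp;y=20&amp;z=30">'
            b'sample link</a></body></html>')

EXPECTED_HREF = "/test?x=10&y=20&z=30"


def copy_of_keeps_query_string():
    """Return True if the href survives, False if not, None if lxml is missing."""
    if etree is None:
        return None
    transform = etree.XSLT(etree.XML(STYLESHEET))
    serialized = str(transform(etree.XML(DOCUMENT)))
    # Parse the HTML output again and look at the decoded attribute value.
    link = etree.HTML(serialized).find(".//a")
    return link is not None and link.get("href") == EXPECTED_HREF


if __name__ == "__main__":
    print("href preserved:", copy_of_keeps_query_string())
```

A result of False would suggest the installed libxslt behaves like the one in the original report; True matches what Stefan sees.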
I can't reproduce that, I get the expected output.
This is on libxml2 2.6.27 and libxslt 1.1.20. Maybe your libxslt version is older?
Hmm, I probably misunderstand something. I use lxml for Windows (lxml-1.2.1.win32-py2.4.exe). I expected that libxml2 had to be installed in order to import lxml successfully, and I had libxml2 installed on my machine (http://users.skynet.be/sbi/libxml-python/libxml2-python-2.6.27.win32-py2.4.e...). I couldn't find a separate distribution of libxslt2 for Windows and assumed it was bundled with libxml2, because after I installed it all lxml features worked just fine.

But now I have reinstalled Python and did NOT install libxml2 this time. I installed only lxml, expecting that it wouldn't work standalone. Surprisingly, lxml works the same way, and I can still reproduce the bug mentioned before.

Does the lxml distribution for Windows contain both libxml2 and libxslt2, or am I missing something here?

--
Best regards,
Alexander
mailto:alexander.kozlovsky@gmail.com
Hi, Alexander Kozlovsky wrote:
I use lxml for Windows (lxml-1.2.1.win32-py2.4.exe)
Does lxml distributive for Windows contain both libxml2 and libxslt2 or I'm missing something here?
Yes, on Windows, they are statically linked into the lxml modules.

http://codespeak.net/lxml/dev/build.html#static-linking-on-windows

You can check which libxml2 version they were built with:

from lxml import etree
print "lxml.etree: ", etree.LXML_VERSION
print "libxml used: ", etree.LIBXML_VERSION
print "libxml compiled: ", etree.LIBXML_COMPILED_VERSION
print "libxslt used: ", etree.LIBXSLT_VERSION
print "libxslt compiled: ", etree.LIBXSLT_COMPILED_VERSION

Stefan
Stefan Behnel wrote:
Yes, on Windows, they are statically linked into the lxml modules. You can check which libxml2 version they were built with:
from lxml import etree
print "lxml.etree: ", etree.LXML_VERSION
print "libxml used: ", etree.LIBXML_VERSION
print "libxml compiled: ", etree.LIBXML_COMPILED_VERSION
print "libxslt used: ", etree.LIBXSLT_VERSION
print "libxslt compiled: ", etree.LIBXSLT_COMPILED_VERSION
lxml.etree: (1, 2, 1, 0)
libxml used: (2, 6, 26)
libxml compiled: (2, 6, 26)
libxslt used: (1, 1, 17)
libxslt compiled: (1, 1, 17)

I installed the distribution with newer libxml2 and libxslt suggested by Sidnei da Silva, and it solves the problem! This is the version that works:

lxml.etree: (1, 2, 1, 0)
libxml used: (2, 6, 28)
libxml compiled: (2, 6, 28)
libxslt used: (1, 1, 19)
libxslt compiled: (1, 1, 19)

Thank you very much!

--
Best regards,
Alexander
mailto:alexander.kozlovsky@gmail.com
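The version numbers reported in this thread can be turned into a small check. The following is only a sketch in Python 3 syntax, and the names `classify_libxslt`, `KNOWN_BAD` and `KNOWN_GOOD` are invented for illustration. The thread establishes only that libxslt 1.1.17 shows the problem and that 1.1.19 and 1.1.20 do not, so any other version below 1.1.19 is reported as "unknown" rather than guessed. Treating every version from 1.1.19 on as fine is an assumption.

```python
# Versions taken from the reports in this thread.
KNOWN_BAD = (1, 1, 17)   # truncated the href after the first '&'
KNOWN_GOOD = (1, 1, 19)  # first version reported to work here


def classify_libxslt(version):
    """Classify a libxslt version tuple as 'ok', 'affected' or 'unknown'."""
    version = tuple(version)
    if version >= KNOWN_GOOD:
        # Assumption: later releases keep the corrected behaviour.
        return "ok"
    if version == KNOWN_BAD:
        return "affected"
    # The thread says nothing about other versions.
    return "unknown"


if __name__ == "__main__":
    try:
        from lxml import etree
    except ImportError:
        print("lxml is not installed")
    else:
        used = etree.LIBXSLT_VERSION
        print("libxslt used:", used, "->", classify_libxslt(used))
```

Tuples compare element by element in Python, so `(1, 1, 20) >= (1, 1, 19)` holds without any string parsing.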
Great to know! Stefan: I think I should re-upload this as 1.2.1.1 or something? To avoid other people hitting the same issue. What do you suggest? -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
Hi Sidnei, Sidnei da Silva wrote:
Stefan: I think I should re-upload this as 1.2.1.1 or something? To avoid other people hitting the same issue. What do you suggest?
This is a bit tricky in Cheeseshop. Having a new version means that others will see it and wonder why there are only Windows binaries. Then going back to the last version is not obvious in Cheeseshop.

No, I think it would add less confusion if you replaced the existing versions with the new ones and added the libxml2/libxslt versions in the file comment. That way, it's more clear that lxml itself did not change.

Stefan
Sounds good to me. I will do that ASAP. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
On 6/12/07, Stefan Behnel <stefan_ml@behnel.de> wrote:
I just noticed that there are new versions of libxml2 and libxslt out today, with loads of bug fixes. Maybe you could use those right away - in case the windows binaries become available in time.
Too late, that will have to wait a little while. I will keep an eye out for when the libxml2 binaries are updated. Thanks! -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
On 6/12/07, Alexander Kozlovsky <alexander.kozlovsky@gmail.com> wrote:
Does lxml distributive for Windows contain both libxml2 and libxslt2 or I'm missing something here?
Yes, it's compiled statically. I have a build with newer libxml2 and libxslt over here: https://houston.enfoldsystems.com/files/sidnei/lxml-1.2.1.win32-py2.4.exe Give that a try, it should fix your issue. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
Sidnei da Silva wrote:
I have a build with newer libxml2 and libxslt over here:
https://houston.enfoldsystems.com/files/sidnei/lxml-1.2.1.win32-py2.4.exe
Give that a try, it should fix your issue.
Yes, it works! Thank you very much!

--
Best regards,
Alexander
mailto:alexander.kozlovsky@gmail.com
participants (3)
- Alexander Kozlovsky
- Sidnei da Silva
- Stefan Behnel