[lxml-dev] document('') and custom resolver not working again/still

I am stumped. I'm probably doing something stupid, but I just cannot get document('') to work with lxml. It does work when the .xsl is sent to the browser (ff), and it does work with xsltproc.

I think there's been a change in libxslt at some point, but I don't know exactly what changed.

This deployment works:

    RHEL4 linux with libxml2 2.6.27 and libxslt 1.1.18, with lxml 2 alpha4

This deployment does not work:

    Centos 4 with libxml2 2.6.28 and libxslt 1.1.19, with lxml 2 alpha 3

So I upgraded to libxml2 2.6.30, libxslt 1.1.22 and lxml 2 beta 2, but it still doesn't work correctly.

I am passing a base_url when creating documents using etree.fromstring.

I've traced the execution of document('') via gdb, both in libxslt/functions.c and libxslt/documents.c.

For example, in libxslt/functions.c:

    142         idoc = xsltLoadDocument(tctxt, URI);
    (gdb) print URI
    $7 = (xmlChar *) 0x90e7c00 "/carrier_payables/view.htm"

The correct uri (what I passed for base_url) is shown.

Now in documents.c line 315::

    317         if ((ret->doc != NULL) && (ret->doc->URL != NULL) &&
    (gdb) print ret->doc
    $11 = 0x90d6ef8
    (gdb) print ret->doc->URL
    $12 = (const xmlChar *) 0x90dca08 "/carrier_payables/view.htm"
    (gdb) print URI
    $13 = (const xmlChar *) 0x90e7c00 "/carrier_payables/view.htm"

So all seems to be correct; it is returning the expected document back to the xpath evaluator.

I'm stumped that it works with older versions of libxslt, works with client-side transform, and works with xsltproc, but not through lxml with "newer libxslt".

Can anyone suggest some other steps I can use to diagnose this problem? I'm sure I've done something wrong with how I am using lxml, but I can't figure it out.

My .xsl looks (in part) like this. It's loaded using etree.fromstring with a base_url:

    <?xml version="1.0"?>
    <xsl_:stylesheet xmlns:xsl_="http://www.w3.org/1999/XSL/Transform"
        xmlns:metal="http://xml.zope.org/namespaces/metal"
        xmlns:tal="http://xml.zope.org/namespaces/tal"
        xmlns:const="const.uri" version="1.0"
        exclude-result-prefixes="tal metal const">
      <xsl_:output encoding="utf-8" method="xml" omit-xml-declaration="no"
          cdata-section-elements=""
          doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
          doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
      <const:javascript id="number_pad_javascript">
        <const:file>MochiKit/MochiKit.js</const:file>
        <const:file>jstal/jstal.js</const:file>
        <const:file>global.js</const:file>
        <const:file>view.js</const:file>
      </const:javascript>
      <xsl_:template match="/">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            <title> View Carrier Payables </title>
            <link rel="stylesheet" type="text/css" href="/css/default.css"/>
            <link rel="stylesheet" media="print" type="text/css" href="/css/print.css"/>
          </head>
          <body>
            <div id="ajax-indicator" style="display:none" class="no-print">
              <img title="some title" width="24" height="24" src="/images/ajax_indicator.gif"/>
            </div>
            <xsl_:for-each select="document('')//const:javascript/const:file">
              <xsl_:variable name="file" select="."/>
              <script type="text/javascript">
                <xsl_:attribute name="src">
                  <xsl_:value-of select="concat('/scripts/', $file)"/>
                </xsl_:attribute>
              </script>
            </xsl_:for-each>
            <div>more stuff</div>
          </body>
        </html>
      </xsl_:template>
    </xsl_:stylesheet>

Using xsltproc against a source xml file:

    <root />

produces this output:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title> View Carrier Payables </title>
    <link rel="stylesheet" type="text/css" href="/css/default.css" />
    <link rel="stylesheet" media="print" type="text/css" href="/css/print.css" />
    </head>
    <body>
    <div id="ajax-indicator" style="display:none" class="no-print">
    <img title="some title" width="24" height="24" src="/images/ajax_indicator.gif" />
    </div>
    <script type="text/javascript" src="/scripts/MochiKit/MochiKit.js"></script>
    <script type="text/javascript" src="/scripts/jstal/jstal.js"></script>
    <script type="text/javascript" src="/scripts/global.js"></script>
    <script type="text/javascript" src="/scripts/view.js"></script>
    <div>more stuff</div>

But with lxml, I get:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title> View Carrier Payables </title>
    <link rel="stylesheet" type="text/css" href="/css/default.css" />
    <link rel="stylesheet" media="print" type="text/css" href="/css/print.css" />
    </head>
    <body>
    <div id="ajax-indicator" style="display:none" class="no-print">
    <img title="some title" width="24" height="24" src="/images/ajax_indicator.gif" />
    </div>
    <div>more stuff</div>

I am using a custom resolver. The resolver is not asked to resolve the .xsl stylesheet, nor is it asked to resolve ''.

p-code is like this:

    parser = etree.XMLParser(load_dtd=True)
    parser.resolvers.add(Resolver(resolver=my_resolver))
    stylesheet_doc = etree.fromstring(xslt_src, parser, base_url=xsl_uri)
    stylesheet = etree.XSLT(stylesheet_doc)

Likewise, the source xml is loaded in the same way:

    parser = etree.XMLParser(load_dtd=True)
    parser.resolvers.add(Resolver(resolver=my_resolver))
    xml_doc = etree.fromstring(xml_src, parser, base_url=xml_uri)

Finally:

    return stylesheet(xml_doc, **params)

--
Brad Clements, bkc@murkworks.com (315)268-1000
http://www.murkworks.com
AOL-IM: BKClements
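For anyone trying to reproduce the report, the p-code above can be condensed into a small standalone script. This is only a sketch under stated assumptions: the stylesheet is a trimmed-down stand-in for the one in the message, LoggingResolver and run_transform are hypothetical names (the original Resolver wrapper class was not shown), and the resolver declines every request so that lxml falls back to its default handling. Whether the script elements appear depends on the lxml/libxslt combination in use, which is the question the thread is about.

```python
from lxml import etree

# Trimmed-down stand-in for the stylesheet in the report (assumption).
XSLT_SRC = b"""<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:const="const.uri" version="1.0"
                exclude-result-prefixes="const">
  <const:javascript>
    <const:file>global.js</const:file>
    <const:file>view.js</const:file>
  </const:javascript>
  <xsl:template match="/">
    <scripts>
      <xsl:for-each select="document('')//const:javascript/const:file">
        <script src="{concat('/scripts/', .)}"/>
      </xsl:for-each>
    </scripts>
  </xsl:template>
</xsl:stylesheet>
"""


class LoggingResolver(etree.Resolver):
    """Hypothetical resolver: records each URL it is asked for, resolves nothing."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def resolve(self, system_url, public_id, context):
        self.seen.append(system_url)
        return None  # decline, so lxml falls back to its default resolution


def run_transform(xsl_uri, xml_uri="/carrier_payables/view.xml"):
    """Load stylesheet and source the way the p-code does, then transform."""
    resolver = LoggingResolver()
    parser = etree.XMLParser(load_dtd=True)
    parser.resolvers.add(resolver)

    stylesheet_doc = etree.fromstring(XSLT_SRC, parser, base_url=xsl_uri)
    stylesheet = etree.XSLT(stylesheet_doc)
    xml_doc = etree.fromstring(b"<root/>", parser, base_url=xml_uri)

    result = stylesheet(xml_doc)
    # Collect the src attributes that document('') was supposed to produce.
    srcs = [el.get("src") for el in result.getroot().iter("script")]
    return srcs, resolver.seen


if __name__ == "__main__":
    srcs, seen = run_transform("/carrier_payables/view.htm")
    print("script srcs:", srcs)        # empty list would match the failure described
    print("resolver was asked for:", seen)
```

An empty list of srcs would correspond to the lxml output shown in the message, while two entries would correspond to the xsltproc output.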

Hi,

Brad Clements wrote:
> I am stumped. Probably doing something stupid but I just cannot get document('') to work with lxml, but it does work when the .xsl is sent to the browser (ff) and it does work with xsltproc
>
> I think there's been a change in libxslt at some point, but I don't know exactly what changed.
>
> This deployment works:
>
> RHEL4 linux with libxml2 2.6.27 and libxslt 1.1.18, with lxml 2 alpha4
>
> this deployment does not work:
>
> Centos 4 with libxml2 2.6.28 and libxslt 1.1.19 with lxml 2 alpha 3
>
> So I upgraded to libxml2 2.6.30, libxslt 1.1.22 and lxml 2 beta 2 But it still doesn't work correctly.

Hmm, I can't see why it should be a problem with the libxslt version. The test cases we have for document('') work for all versions from 1.1.15-22.

> I am passing a base_url when creating documents using etree.fromstring

That should not make a difference. The stylesheet is given its own name and resolved internally and thus outside the custom resolver scope.

> I've traced the execution of document('') via gdb, both in libxslt/functions.c and libxslt/documents.c [...] I'm stumped that it works with older versions of libxslt, works with client-side transform, and works with xsltproc, but not through lxml with "newer libxslt"

This means you traced it in xsltproc, not in lxml, right?

Could you run the debugger on your script in lxml and look at lines 78-82 in src/lxml/xslt.pxi (or somewhere around line 81023 in lxml.etree.c), where it says "# shortcut if we resolve the stylesheet itself"? I would like to know what "__pyx_v_c_uri" and "__pyx_v_c_doc.URL" (or c_uri and c_doc.URL respectively) are in your case.

It would be great if you could figure that out, as I won't have much time to look into this next week.

Stefan
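
As a lighter-weight first check before reaching for gdb, the URL that lxml recorded for the parsed stylesheet can be read from Python through docinfo.URL. This is only a rough stand-in for the c_doc.URL value asked about above, on the assumption that the document handed to etree.XSLT keeps the URL set at parse time; it says nothing about c_uri, which only exists during the transform. The helper name stylesheet_url and the tiny stylesheet are made up for illustration.

```python
from lxml import etree

# Minimal placeholder stylesheet (assumption; any well-formed XSLT would do).
XSLT_SRC = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>"""


def stylesheet_url(xslt_src, base_url=None):
    """Return the URL lxml stored on the document parsed from a string."""
    root = etree.fromstring(xslt_src, base_url=base_url)
    return root.getroottree().docinfo.URL


if __name__ == "__main__":
    # With a base_url, the document URL should be the string that was passed in.
    print(stylesheet_url(XSLT_SRC, base_url="/carrier_payables/view.htm"))
    # Without one, compare what is reported here against the value seen in gdb.
    print(stylesheet_url(XSLT_SRC))
```

If the first value printed differs from the URI that document('') resolves to in the gdb trace, the string comparison in the "shortcut" branch would not match, which may be worth ruling out first.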