Accessing variables from within an XSL file

Hi,

I'm trying to extract data from XSL files on an IBM DataPower system. I can't change the XSLT files, so I need to work around their current structure to get the info I need. I'm doing this from Python 3.7 using lxml (etree.__version__ 4.3.3).

Here are my imports, a dummy XML object and a cut-down version of the XSL file from the IBM system:

    from lxml import etree
    from textwrap import dedent

    xml = etree.XML("""<root></root>""")
    xsl = etree.XML(dedent(
    """<?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:dp="http://www.datapower.com/extensions"
        extension-element-prefixes="dp">
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
        <xsl:variable name="destination_url">
          <xsl:value-of select="'https://my.domain.example.com/'" />
        </xsl:variable>
        <!--Set the destination URL in the context variable-->
        <dp:set-variable name="'var://context/mpgw/Destination'" value="$destination_url" />
      </xsl:template>
    </xsl:stylesheet>
    """))

My goal is to use lxml to process this file and extract the contents of the DataPower variable 'var://context/mpgw/Destination', i.e. the contents of the xsl:variable $destination_url, which in the above case is 'https://my.domain.example.com/'.

I've created an XSLT extension for the "dp:set-variable" element to try to capture this information:

    class MyExtElement(etree.XSLTExtension):
        def execute(self, context, self_node, input_node, output_parent):
            if self_node.tag == '{http://www.datapower.com/extensions}set-variable':
                print(f"Setting variable of name {self_node.get('name')} to value {self_node.get('value')}")

    extensions = {('http://www.datapower.com/extensions', 'set-variable'): MyExtElement()}
    result = etree.XSLT(xsl, extensions=extensions)(xml)
    print(f"Result: {result}")

When this is called, I get the static string "$destination_url" instead of the value of destination_url. The output of the above script is:

    Setting variable of name 'var://context/mpgw/Destination' to value $destination_url
    Result: <?xml version="1.0"?>
    <root/>

I am rather surprised that the $destination_url xsl:variable is not being evaluated. Does anyone know why this is the case?

Given the above, I've been trying to find a way to access the $destination_url xsl:variable from within my XSLTExtension, but unfortunately with no luck. Some things I've tried:

* Calling self.process_children(context) within the extension. This just results in an empty list.
* Using dir() to look through the context, self_node and input_node args to find somewhere the variables may be accessible, with no luck.
* Using deepcopy to clone self_node to get an xpath function on this argument, then trying various XPath expressions to access the xsl variable, all unsuccessful:

    node = deepcopy(self_node)

    node.xpath("""<xsl:value-of select="$destination_url">""")
    *** lxml.etree.XPathEvalError: Invalid expression

    node.xpath("""$destination_url""")
    *** lxml.etree.XPathEvalError: Undefined variable

    node.xpath("""xsl:$destination_url""")
    *** lxml.etree.XPathEvalError: Invalid expression

    node.xpath("""xsl:$destination_url""", namespaces={'xsl':'http://www.w3.org/1999/XSL/Transform'})
    *** lxml.etree.XPathEvalError: Invalid expression

    node.xpath("""$destination_url""", namespaces={'xsl':'http://www.w3.org/1999/XSL/Transform'})
    *** lxml.etree.XPathEvalError: Undefined variable

    node.xpath("""/xsl:$destination_url""", namespaces={'xsl':'http://www.w3.org/1999/XSL/Transform'})
    *** lxml.etree.XPathEvalError: Invalid expression

    node.xpath("""<xsl:value-of select="$destination_url">""", namespaces={'xsl':'http://www.w3.org/1999/XSL/Transform'})
    *** lxml.etree.XPathEvalError: Invalid expression

I'm lost now! Any suggestions greatly appreciated!

Cheers,

aid

Adrian Bool wrote on 10.04.19 at 10:49:

Well, XSL is XML, so you can always post-process the XSL document after parsing it as XML, before you pass it into the XSLT() constructor.

I don't think I've ever tried this myself in lxml. I guess I would have expected value="{$destination_url}" rather than the plain variable reference here. Can't say how XSLT is meant to deal with this.

Did you try looking it up in the XSL document via XPath? Something like

    .xpath("//xsl:variable[@name=$name]", name="destination_url")

(plus namespaces)

Stefan

Hi Stefan,

Many thanks for your thoughts. I hadn't considered just executing an XPath query directly against the XSL as simple XML, but that works!

I combined this idea with your XPath expression and now have a function that returns the correct value:

    def get_variable_from_xsl(variable_name, xsl):
        xpath_results = xsl.xpath(
            "//xsl:variable[@name=$name]",
            name=variable_name,
            namespaces={'xsl': 'http://www.w3.org/1999/XSL/Transform'})
        for xpath_result in xpath_results:
            # The XPath predicate has already matched on @name, so only
            # the children (e.g. xsl:value-of) need to be inspected.
            for xpath_child in xpath_result:
                if 'select' in xpath_child.keys():
                    return xpath_child.get('select')
        return None

    destination_url = get_variable_from_xsl('destination_url', xsl)
    print(f"destination_url = {destination_url}")

Which returns:

    destination_url = 'https://my.domain.example.com/'

Thanks again!

aid
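The function above only looks at children of xsl:variable that carry a select attribute. A minimal sketch of a slightly broader static lookup follows, assuming the variable may also be declared with its own select attribute or with plain text content. The helper name static_variable_expr and the sample stylesheet are made up for illustration; the result is always the unevaluated source text, never a runtime value.

```python
from lxml import etree

XSL_NS = {"xsl": "http://www.w3.org/1999/XSL/Transform"}

# Hypothetical sample stylesheet covering three ways to declare a variable.
SHEET = etree.XML("""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:variable name="a" select="'from select'"/>
  <xsl:variable name="b"><xsl:value-of select="'from value-of'"/></xsl:variable>
  <xsl:variable name="c">plain text</xsl:variable>
</xsl:stylesheet>""")


def static_variable_expr(stylesheet, name):
    """Return the unevaluated definition of xsl:variable `name`, or None."""
    matches = stylesheet.xpath(
        "//xsl:variable[@name=$name]", name=name, namespaces=XSL_NS)
    for var in matches:
        # Case 1: <xsl:variable name="..." select="..."/>
        if var.get("select") is not None:
            return var.get("select")
        # Case 2: <xsl:variable><xsl:value-of select="..."/></xsl:variable>
        value_of = var.find("xsl:value-of", namespaces=XSL_NS)
        if value_of is not None:
            return value_of.get("select")
        # Case 3: literal text content
        return var.text
    return None


for variable in ("a", "b", "c", "missing"):
    print(variable, "->", static_variable_expr(SHEET, variable))
```

Note that for the select-based cases the returned string is an XPath expression, so a string literal comes back with its quotes still attached.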

Just for the record: as Stefan suspected, the way to dereference a variable in XSLT here is value="{$destination_url}".

I.e. this stylesheet

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:variable name="my_var" select="'my value'"/>
      <xsl:template match="@*|node()">
        <foo attr="$my_var">bar</foo>
      </xsl:template>
    </xsl:stylesheet>

yields

    <?xml version="1.0" encoding="UTF-8"?><foo attr="$my_var">bar</foo>

whereas

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:variable name="my_var" select="'my value'"/>
      <xsl:template match="@*|node()">
        <foo attr="{$my_var}">bar</foo>
      </xsl:template>
    </xsl:stylesheet>

yields

    <?xml version="1.0" encoding="UTF-8"?><foo attr="my value">bar</foo>

Boy, am I getting rusty :-)

Holger
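The comparison above can be reproduced from Python with lxml. This is a small sketch rather than anything from the original posts: the helper name render_attr and the ATTR_VALUE placeholder are invented for the example, and the template matches "/" instead of "@*|node()" to keep the output to a single element.

```python
from lxml import etree

# ATTR_VALUE is a placeholder replaced before parsing (illustrative only).
TEMPLATE = """\
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:variable name="my_var" select="'my value'"/>
  <xsl:template match="/">
    <foo attr="ATTR_VALUE">bar</foo>
  </xsl:template>
</xsl:stylesheet>
"""


def render_attr(attr_source):
    """Run the stylesheet with the given attribute text and return foo/@attr."""
    sheet = etree.XML(TEMPLATE.replace("ATTR_VALUE", attr_source))
    result = etree.XSLT(sheet)(etree.XML("<root/>"))
    return result.getroot().get("attr")


# Plain text in a literal result element's attribute is copied verbatim.
literal = render_attr("$my_var")
# Curly braces turn the attribute into an attribute value template.
evaluated = render_attr("{$my_var}")

print("without braces:", literal)
print("with braces:   ", evaluated)
```

This only applies to literal result elements such as foo. As the rest of the thread shows, attributes of extension elements are not expanded this way by lxml.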

Hi Adrian/Stefan,

Apologies for dredging up a 2-year-old thread, but I've hit the same issue and found that although the posted solution works for variables declared with simple string values, it fails for xsl:param values or "calculated" values.

Consider:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:abc="bar"
        extension-element-prefixes="abc">
      <xsl:output method="xml"/>
      <xsl:param name="param1">default param value</xsl:param>
      <xsl:template match="/">
        <xsl:variable name="var1" select="concat( 'http://', $param1 )"/>
        <abc:foo attr1="{$var1}"/>
        <baz bif="{$var1}"><xsl:value-of select="concat( 'XSLT output: ', $var1 )"/></baz>
      </xsl:template>
    </xsl:stylesheet>

Now let's say I want to receive the value of "$var1" in my extension element foo():

    #!/usr/local/bin/python3
    # -*- coding: utf-8 -*-
    import sys
    from lxml import etree

    class FooExtensionElement( etree.XSLTExtension ):
        def execute( self, context, self_node, input_node, output_parent ):
            print( "attr1: {}".format( self_node.get( "attr1" )))

    parser = etree.XMLParser( load_dtd=True )
    extensions = { ( 'bar', 'foo' ) : FooExtensionElement() }
    transformer = etree.XSLT( etree.parse( sys.stdin, parser ), extensions = extensions )
    xml_doc = etree.XML( '<dummy/>' )
    result = transformer( xml_doc, param1=etree.XSLT.strparam( "passed param value" ))
    print( result )

When run, the output is:

    attr1: {$var1}
    <?xml version="1.0"?>
    <baz bif="http://passed param value">XSLT output: http://passed param value</baz>

This shows that:

* self_node.get( "attr1" ) returns the literal string "{$var1}" in execute(), i.e. the transformer has not evaluated {$var1} when passing it to abc:foo/@attr1
* baz/@bif gets the evaluated value of {$var1} as expected

If I were to apply the solution suggested earlier in this thread and look up //xsl:variable[ @name = 'var1']/@select from within execute(), it would return the string "concat( 'http://', $param1 )".

I could attempt to evaluate this in self_node.xpath(), but I don't have access to the value of $param1, right? So this would result in:

    lxml.etree.XPathEvalError: Undefined variable

What if I first looked up the value of $param1 in the same way as suggested for xsl:variable? //xsl:param[ @name = 'param1']/text() would return "default param value", NOT the actual value ("passed param value") which was passed in. AFAIK, I have *no* way to access that value from execute().

This problem goes away if evaluation of {$xyz} syntax is honored for extension elements. As demonstrated above, this is already working perfectly for elements in the XSL which have no namespace (the "baz" element in my example). Is there a fundamental reason why extension elements are excluded from these evaluations?

If this contravenes XSLT convention, is it possible to place a map of variable/param names and their values within the 'context' parameter passed to the execute() function so that extension elements can perform their own lookup?

Many thanks!

Mike
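The gap between the static lookup and the runtime value can be seen without any extension element. The sketch below uses a reduced version of the stylesheet above (only the param, the variable and a baz element); the variable names static_default and runtime_value are just for illustration.

```python
from lxml import etree

NS = {"xsl": "http://www.w3.org/1999/XSL/Transform"}

# Reduced stylesheet: no extension element, only the param and the variable.
sheet = etree.XML("""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="param1">default param value</xsl:param>
  <xsl:template match="/">
    <xsl:variable name="var1" select="concat('http://', $param1)"/>
    <baz><xsl:value-of select="$var1"/></baz>
  </xsl:template>
</xsl:stylesheet>""")

# Static lookup: reads the stylesheet as plain XML, so it can only ever
# see the default written in the document.
static_default = sheet.xpath(
    "string(//xsl:param[@name='param1'])", namespaces=NS)

# Runtime: the value passed by the caller overrides the default.
transform = etree.XSLT(sheet)
result = transform(
    etree.XML("<dummy/>"),
    param1=etree.XSLT.strparam("passed param value"))
runtime_value = result.getroot().text

print("static lookup:", static_default)
print("runtime value:", runtime_value)
```

The two printed values differ whenever the caller passes a parameter, which is why inspecting the stylesheet source cannot stand in for evaluation inside the running transform.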

mike.shaw.nz+python.org@gmail.com wrote on 01.04.21 at 13:11:

Hmm, interesting problem. Thanks for bringing it up.

> This problem goes away if evaluation of {$xyz} syntax is honored for extension elements. As demonstrated above, this is already working perfectly for elements in the XSL which have no namespace (the "baz" element in my example). Is there a fundamental reason why extension elements are excluded from these evaluations?

The evaluation of the content and attributes of an extension element (including their order and semantics) is specific to the element, and thus needs to be done explicitly.

> If this contravenes XSLT convention, is it possible to place a map of variable/param names and their values within the 'context' parameter passed to the execute() function so that extension elements can perform their own lookup?

I think it could be solved by providing an XPath entry point that knows about the current XSLT context, i.e. variables and parameters. Could be a method on the extension class. However, I also see an advantage in providing access to the current name mappings. That would generally be nice to have in an extension element, and thus potentially cover a wider range of use cases.

I'd be happy to receive a PR that implements this.

Stefan
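Until such an entry point exists, one possible workaround combines two things mentioned in this thread: post-processing the stylesheet as plain XML before compiling it, and calling process_children() in the extension. The sketch below is an assumption-laden illustration, not an lxml feature: inject_value_of and CapturingExtension are hypothetical names, the helper only understands a bare expression or a single whole-attribute {expr}, and it relies on libxslt evaluating the injected xsl:value-of with the template's variables in scope.

```python
from copy import deepcopy
from lxml import etree

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

SHEET = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:abc="bar"
    extension-element-prefixes="abc">
  <xsl:param name="param1">default param value</xsl:param>
  <xsl:template match="/">
    <xsl:variable name="var1" select="concat('http://', $param1)"/>
    <out><abc:foo attr1="{$var1}"/></out>
  </xsl:template>
</xsl:stylesheet>""")


def inject_value_of(stylesheet, ext_ns, ext_tag, attr):
    """Hypothetical helper: return a copy of the stylesheet where every
    {ext_ns}ext_tag element gains an xsl:value-of child for @attr."""
    patched = deepcopy(stylesheet)
    for el in list(patched.iter("{%s}%s" % (ext_ns, ext_tag))):
        expr = el.get(attr)
        if not expr:
            continue
        # Simplification: strip one pair of braces around the whole value.
        if expr.startswith("{") and expr.endswith("}"):
            expr = expr[1:-1]
        etree.SubElement(el, "{%s}value-of" % XSL_NS, select=expr)
    return patched


class CapturingExtension(etree.XSLTExtension):
    """Evaluates its (injected) children and records the resulting text."""

    def __init__(self):
        super().__init__()
        self.captured = []

    def execute(self, context, self_node, input_node, output_parent):
        # Render the children into a throwaway element, not the real output.
        scratch = etree.Element("scratch")
        self.process_children(context, scratch)
        self.captured.append(scratch.text)


capture = CapturingExtension()
transform = etree.XSLT(
    inject_value_of(SHEET, "bar", "foo", "attr1"),
    extensions={("bar", "foo"): capture})
transform(etree.XML("<dummy/>"),
          param1=etree.XSLT.strparam("passed param value"))
print(capture.captured)
```

The original stylesheet file is left untouched; only the in-memory copy handed to XSLT() is modified, which fits the constraint that the files on the DataPower system cannot be changed.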

Participants (4): Adrian Bool, Holger Joukl, mike.shaw.nz+python.org@gmail.com, Stefan Behnel