[lxml-dev] XSLT extensions?

Hi!

I looked through the archives a bit and found that there was some work going on at the beginning of this year to implement XPath/XSLT extensions. I know the XPath extensions are available in etree, but what about XSLT?

I'm asking because I'd like to give MathDOM an extension function that allows for tree-to-string serialization during XSLT processing, i.e. to include on-the-fly generated term literals in XSLT output.

Something like

    <xsl:value-of select="string(mathdom:serialize(./math:math, 'sql'))"/>

would be just perfect. I know, that's related to XPath, but the XPath and XSLT classes in etree don't share their support for extensions (and I think I saw a different API for this in libxslt).

Has anyone looked into this since?

Stefan

Hi Stefan,

Stefan Behnel wrote:
> I looked through the archives a bit and found that there was some work going on at the beginning of this year to implement XPath/XSLT extensions. I know the XPath extensions are available in etree, but what about XSLT?
> I'm asking because I'd like to give MathDOM an extension function that allows for tree-to-string serialization during XSLT processing, i.e. to include on-the-fly generated term literals in XSLT output.
> Something like <xsl:value-of select="string(mathdom:serialize(./math:math, 'sql'))"/> would be just perfect. I know, that's related to XPath, but the XPath and XSLT classes in etree don't share their support for extensions (and I think I saw a different API for this in libxslt).
> Has anyone looked into this since?

No, I think this is an open area. Feel free to dive in. It's most important to me that we get a simple, Pythonic API right. I rather like the approach of using simple modules and/or dictionaries myself.

Perhaps Marc-Antoine Parent would be so kind as to contribute to this thread; he was the main developer of the XPath extension functions back then. Perhaps I should drop him a mail. :)

Regards,
Martijn

Martijn Faassen wrote:
> Stefan Behnel wrote:
>> I looked through the archives a bit and found that there was some work going on at the beginning of this year to implement XPath/XSLT extensions. I know the XPath extensions are available in etree, but what about XSLT?
>> I'm asking because I'd like to give MathDOM an extension function that allows for tree-to-string serialization during XSLT processing, i.e. to include on-the-fly generated term literals in XSLT output.
>> Something like <xsl:value-of select="string(mathdom:serialize(./math:math, 'sql'))"/> would be just perfect. I know, that's related to XPath, but the XPath and XSLT classes in etree don't share their support for extensions (and I think I saw a different API for this in libxslt).
>> Has anyone looked into this since?
> No, I think this is an open area. Feel free to dive in. It's most important to me that we get a simple, Pythonic API right. I rather like the approach of using simple modules and/or dictionaries myself.
Hmm, now that you mention it, that might work pretty well with the custom namespace classes patch that I still have pending. That could allow you to register not only subclasses of _Element but also extension functions with a specific namespace, which would then automatically be available to the XSLT/XPath evaluation process. I guess it's worth making this a general feature.

What about introducing a Namespace class at the module level that you can instantiate with a namespace URI as argument and then register different types of 'things' with it?

-----------
ns1 = etree.Namespace('http://mynamespace/somewhere')
ns1.register_classes(my_dict_with_subclasses_of_ElementBase)

ns2 = etree.Namespace('my:extensions:to:xslt')
ns2.register_extensions({'func1' : some_function})
-----------

This class would obviously have singleton characteristics for each namespace, so the API may actually provide it as a function and keep a dictionary of dictionaries at the back. Then you could use the registered 'things' in XSLT:

-----------
<xsl:stylesheet xmlns:ext="my:extensions:to:xslt">
...
<xsl:value-of select="string(ext:func1(*))"/>
-----------

without having to reregister them at each call. As I said, I imagine this at the module level. XML namespaces were designed to separate different semantic areas, so it should be enough to register these things once and for all. The current support in the XPath class would then become an additional override rather than the preferred method of registering extensions.

If you want it to look nicer at the cost of losing some semantics, you can also imagine the following usage:

-----------
ns1 = etree.Namespace('http://mynamespace/somewhere')
ns1.update(my_dict_with_subclasses_of_ElementBase)
ns1['some_element'] = MyAdditionalElementImpl
...
ns2 = etree.Namespace('my:extensions:to:xslt')
ns2['func1'] = some_function
...
-----------

This is less bad than you might think, as the namespaces are there to separate the semantics. You will most likely not register extension functions with a namespace that is used by an XML schema (and if you do: your fault), so this usage scheme should still be sufficient to avoid errors like trying to call element implementations instead of extension functions, simply by relying on the XML namespace feature.

I think that would be a rather Pythonic interface...

BTW, citing from XIST: http://www.livinglogic.de/Python/xist/Howto.html

-----------
class python(xsc.Element):
    ...

class cool(xsc.Element):
    ...

class __ns__(xsc.Namespace):
    xmlname = "foo"
    xmlurl = "http://www.example.com/foo"
__ns__.update(vars())
-----------

Comes pretty close...

Stefan
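The "singleton per namespace URI, backed by a dictionary of dictionaries" idea from the message above can be sketched in plain Python. This is only an illustration under stated assumptions, not lxml's code: the `_registries` table, the `_NamespaceRegistry` helper as written here, and the `(context, nodes)` signature of `some_function` are all made up for the example.

```python
# Hypothetical sketch of a module-level Namespace() factory; not lxml's code.
_registries = {}  # namespace URI -> registry object for that URI


class _NamespaceRegistry(dict):
    """Dictionary-like holder for everything registered under one URI."""

    def __init__(self, uri):
        super().__init__()
        self.uri = uri


def Namespace(uri):
    """Return the single registry for `uri`, creating it on first use."""
    try:
        return _registries[uri]
    except KeyError:
        registry = _registries[uri] = _NamespaceRegistry(uri)
        return registry


def some_function(context, nodes):
    # Placeholder extension function; the signature is an assumption.
    return len(nodes)


ns2 = Namespace('my:extensions:to:xslt')
ns2['func1'] = some_function

# A later lookup by the same URI sees the earlier registration,
# so nothing has to be re-registered per call.
assert Namespace('my:extensions:to:xslt') is ns2
assert Namespace('my:extensions:to:xslt')['func1'] is some_function
```

Because the factory always hands back the same object for a given URI, registration can happen once at import time and any later evaluator can look the functions up by namespace.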

First of all: I'm starting to hate libxml2/libxslt. It's incredibly badly documented, and the only way to implement something with it is to read the documentation, try hard to figure out what it might mean (and which functions might get you where you want when applied in which order), and then still resort to trial and error. Great.

That said...

Stefan Behnel wrote:
> What about introducing a Namespace class at the module level that you can instantiate with a namespace URI as argument and then register different types of 'things' with it? [...]
> -----------
> ns1 = etree.Namespace('http://mynamespace/somewhere')
> ns1.update(my_dict_with_subclasses_of_ElementBase)
> ns1['some_element'] = MyAdditionalElementImpl
> ...
> ns2 = etree.Namespace('my:extensions:to:xslt')
> ns2['func1'] = some_function
> ...
> -----------
I've created a new branch scoder2 (from scoder1) and taken this approach. "Namespace" became a module-level function that returns a dictionary-like object, a _NamespaceRegistry. That object internally splits its values up by inheritance/callable checking, so there is currently one internal dictionary for subclasses of ElementBase (i.e. XML elements), one for callables (i.e. extension functions) and one for subclasses of XSLTElement (i.e. XSLT element extensions). The latter are not implemented or used yet, but they fit perfectly into the interface.

The update() method is forgiving, i.e. if things don't match the requirements, they are thrown away. This allows you to write implementation classes/modules and then run Namespace('http://something/else').update(vars(myclass())) to register the methods of that object or module (names starting with '_' are ignored entirely). I haven't tested that yet, but it's intended. Kasimier (and possibly others) may hate me for this, but I think it's a nice feature.

For compatibility, the XSLT and XPath classes still have the "extensions" keyword argument, but these are mixed with the globally registered ones (they take precedence, though). I'd personally argue for removing them since I largely prefer the new interface, but that would break backward compatibility (in case someone actually uses the old interface). Note that this is the only reason why I left them in. They are not needed IMHO and make things a bit uglier. (I mean, really, why would you want to register extension functions for each call? What are namespaces for, huh?)

Another problem is that XSLT doesn't currently check which namespaces are actually used in the stylesheet; it just registers all extension functions it finds. It *could* try to deduce the right namespace URIs from the ElementTree/Document it works on, but that's not currently implemented.

The current implementation is already ugly enough (although not toooooo bad), and the accompanying refactoring of the existing code was again rather extensive (the patch has some 900 lines against scoder1; I don't want to count the trunk). I may have to rework some things, but I'll still check it in for now so that others can try it and comment on it. I will also start to use it in MathDOM, so we will see how it works out.

Stefan

On Sat, Nov 12, 2005 at 03:02:54PM +0100, Stefan Behnel wrote:
> First of all: I'm starting to hate libxml2/libxslt. It's incredibly badly documented, and the only way to implement something with it is to read the documentation, try hard to figure out what it might mean (and which functions might get you where you want when applied in which order), and then still resort to trial and error. Great.

Maybe so. But this makes lxml all the more needed and valuable.

Dave

> That said...

[snip]

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
participants (3): Dave Kuhlman, Martijn Faassen, Stefan Behnel