Re: [lxml-dev] using modules for XPath/XSLT extensions

Even though I said I wouldn't consider XSLT extensions as I'm overwhelmed, Marc-Antoine brought up the module concept.
Thank you for looking at it ;-)
Sorry for the confusion, "module" is the libxslt terminology, which introduces confusion with the python modules. I was thinking of an instance object, with a class defining the xslt module. I am using an instance so as to allow users to add data to the instance. This is in line with libxslt-python usage, as I think it would be nice to allow people to reuse them with minimal changes. I am still making changes, however, like making the namespace URI accessible as a class-level method (it could be a read-only class attribute); this is because defining the URI exists at the same level as defining the extension functions that make up the module.
If you refer to the decorators, the point was mostly to allow them to identify which of the module functions are extension functions, and which are extension elements. Incidentally, I made the decorators take care of registration, but that is very much an implementation detail. In theory, registration should be handled by the styleInit method.
* a list of function names we want to register with XPath
Conceptually, yes. Each extension function and element has a name visible to XSLT which may have nothing to do with the name of the function (or method) as Python sees it. Again, I put that functionality in optional decorator arguments, with a default behaviour that would simply extract the method name and do some name conversion (camel caps to hyphenated lowercase) that happens to be sensible most of the time. Again, I wanted the default case to be simple. It is entirely possible to have (class or even static) methods in the extensionModule base class instead, such as
    registerExtensionFunction(bound_method, name, uri)
    registerExtensionElement(bound_method, name, uri)
that would be called in styleInit. But I liked the idea of adding the URI as a mandatory class attribute; that and default name conversion would allow a (imho, simpler) decorator syntax:
    @xmlExtensionFunction
    def myFunctionName(...):
would then be defined as my-function-name on the module's URI.
I agree that it would be even better if we could make the syntactic sugar easy to access, but not automatic. So let us assume that
    @xmlExtensionFunction
    def myFunctionName(...):
defines myFunctionName, but we allow
    @xmlExtensionFunction(translateName=True)
    def myFunctionName(...):
to define my-function-name and of course
    @xmlExtensionFunction(name="whatever-name")
    def myFunctionName(...):
(In all cases I assume the URI is taken from the module; but I guess the registration function that gets called by the decorator would also have a namespaceURI argument.)
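A minimal sketch of the three decorator forms discussed above, in plain Python and without touching libxslt. Only the names xmlExtensionFunction, translateName, name and namespaceURI come from the thread; the base class ExtensionModuleBase, the helper camel_to_hyphen, the extensionFunctions() collector and the example URI are assumptions made up for illustration. Here the decorator only marks the function, and the class collects the marked functions afterwards.

```python
import re


def camel_to_hyphen(name):
    """Convert camel caps to hyphenated lowercase (hypothetical helper)."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"-\1", name).lower()


def xmlExtensionFunction(func=None, *, name=None, translateName=False,
                         namespaceURI=None):
    """Mark a function as an extension function.

    Works bare (@xmlExtensionFunction) or with keyword arguments.
    """
    def decorate(f):
        if name is not None:
            visible = name                      # explicit XSLT-visible name
        elif translateName:
            visible = camel_to_hyphen(f.__name__)
        else:
            visible = f.__name__                # no automatic translation
        f._xslt_name = visible
        f._xslt_uri = namespaceURI              # None means "use the module's URI"
        return f

    if func is not None:                        # bare @xmlExtensionFunction
        return decorate(func)
    return decorate


class ExtensionModuleBase:
    """Hypothetical base class; the URI is a class attribute."""
    namespaceURI = None

    @classmethod
    def extensionFunctions(cls):
        """Collect marked functions as {(uri, visible name): function}."""
        found = {}
        for attr in vars(cls).values():
            visible = getattr(attr, "_xslt_name", None)
            if visible is not None:
                uri = attr._xslt_uri or cls.namespaceURI
                found[(uri, visible)] = attr
        return found


class Demo(ExtensionModuleBase):
    namespaceURI = "http://example.org/demo"    # placeholder URI

    @xmlExtensionFunction
    def myFunctionName(self):
        pass

    @xmlExtensionFunction(translateName=True)
    def otherFunctionName(self):
        pass

    @xmlExtensionFunction(name="whatever-name")
    def thirdFunction(self):
        pass
```

Calling Demo.extensionFunctions() yields one entry per decorated method, keyed by the module URI and the XSLT-visible name, which is the information a registration step would need.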
I haven't figured out yet where styleInit(self, stylesheet, URI) and friends fit in... Am not too clear on the use cases yet.
As I said, this is _normally_ where you register your module ext. ctxtInit, on the other hand, allows you to set up user data (like Kapil's) on your (xslt) module instance for a given application of the transformation.
Hm... Need to think more.
Almost sorry I brought this up before I was ready with a fuller implementation... Marc-Antoine

Marc-Antoine Parent wrote:
Lots more studying to do...
It might or might not make sense to make them Python modules too, though. Though the concept of a module interface is less usual in Python than an instance interface, it's certainly possible and it would become very clear how to bundle these. Possibly we could design it like this:
* a module which contains a bunch of functions to be used as extension functions. Along with this, possibly other functions and classes used to implement them.
* a way to turn these modules into XPath/XSLT extension modules.
Imagine a function call along these lines:
    from foo import module
    extension_module = makeExtensionModule(module, 'http://some/uri',
        {'function_name_in_module': 'function-name-in-xpath'})
    e = XPathEvaluator(doc, namespaces, extension_module)
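A rough sketch of what such a makeExtensionModule() could look like, assuming the result is simply a dictionary keyed by (URI, XPath name). That return shape is an assumption, not something settled at this point in the thread. The standard library's string module stands in for the hypothetical "foo" module so the example runs on its own, and the XPathEvaluator call is left out because its final signature is not established here.

```python
import string  # stand-in for the hypothetical "from foo import module"


def makeExtensionModule(module, uri, names):
    """Look up each listed function in `module` by its Python name and
    file it under (uri, XPath-visible name).

    Anything in the module that is not listed in `names` is left out,
    so helper functions and classes stay invisible to XPath.
    """
    extension = {}
    for python_name, xpath_name in names.items():
        extension[(uri, xpath_name)] = getattr(module, python_name)
    return extension


extension_module = makeExtensionModule(
    string, 'http://some/uri', {'capwords': 'cap-words'})
```

Only string.capwords ends up in the result; everything else in the module is ignored because it is not named in the mapping.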
Allowing the reuse of python-libxslt code isn't a very big thing on my list of concerns. It helps that I don't have any codebase written against this yet. :) Anyway, I imagine the cost of adapting any existing code should be relatively low.
My apologies, I typed 'import' where I meant 'implement'. I mean, the functions used to implement these functions; helper functions that are not exposed to XPath itself.
I'd like to consider a design without decorators for the time being, as I think we can make it still fairly nice and easy without them. It'll always be possible to build a decorator-based approach on top of this anyway, I think. makeExtensionModule() could be informed by information extracted from the module by inspecting decorators. [snip]
Need to study this. :)
Nah, no worries. I'll try to whip something up in the context of XPath, and you can see about how this fits with the XSLT APIs. Then we'll combine ideas and flesh out how to make it all look pretty. Regards, Martijn

Ah... Not a bad idea in itself. A few points:
a) It enforces one extension module per file, which might annoy some people. But that I see as very, very minor.
b) You forget extension elements: We need two dictionaries.
c) Also, I would spontaneously have applied the dictionary in reverse, and my tendency would be to use the function instances instead of names. Thus:
    extension_module = makeExtensionModule(module, 'http://some/uri',
        {'function-name-in-xpath': module.function_name_in_module},
        {'extension-name-in-xpath': module.function_name_in_module})
But let me think aloud: Why did you want the dictionary in that order? In other words: What happens to a module function that is not in the dictionary? Does it get registered with the function name? Or simply omitted? In the former case... Well, that does not work, because we cannot know if we should register it as function or element. In the latter case, I feel the python module buys us nothing: we could just use any function (or bound method.) So question: What does the module gain us over a class instance?
A last point: Extension modules are supposed to be tied to their URI. In other words, there is no natural use case for
    extension_module1 = makeExtensionModule(module1, 'http://some/first/uri', ...)
    extension_module2 = makeExtensionModule(module1, 'http://some/other/uri', ...)
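To make the reversed, two-dictionary variant concrete, here is a small sketch under stated assumptions: the ExtensionModule record, its field names and the placeholder wrap_element callable are invented for illustration, and the real calling convention for extension elements is not specified anywhere in this thread. The string module again stands in for a user module.

```python
import string
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExtensionModule:
    """Hypothetical result type: one URI, two separate name tables."""
    module: object
    uri: str
    functions: dict = field(default_factory=dict)
    elements: dict = field(default_factory=dict)


def makeExtensionModule(module, uri, functions, elements=None):
    """Both tables map XPath/XSLT-visible name -> callable object."""
    elements = elements or {}
    for table in (functions, elements):
        for visible_name, func in table.items():
            if not callable(func):
                raise TypeError("%r is not callable" % (visible_name,))
    return ExtensionModule(module, uri, dict(functions), dict(elements))


def wrap_element(*args):
    """Placeholder for an extension element implementation."""
    return None


extension_module = makeExtensionModule(
    string, 'http://some/uri',
    {'cap-words': string.capwords},   # extension functions
    {'wrap': wrap_element},           # extension elements
)
```

Note that once function objects are passed directly, the module argument is never consulted to resolve anything, which is the point behind the question of what the module buys over a class instance or a plain set of callables.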
Fair enough.
I hope you realize that decorator _syntax_ was meant to be optional; what I was aiming for was a function that receives a function. So the following would still make sense:
    class moduleInstance(extensionModule):
        def myFunction(...):
            pass
        def styleInit(self, stylesheet, URI):
            self.URI = URI
            self.registerModuleExtensionFunction(self.myFunction, name="my-function-name")
            ....
which is closer to traditional libxslt usage. The fact that xmlExtensionFunction would allow pie-syntax was pure sugar.
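The explicit, decorator-free registration can be sketched in plain Python as follows. The base class here is a stand-in written for this example and is not the libxslt-python class; the functions table, the body of myFunction and the direct call to styleInit() are assumptions that only simulate what the XSLT engine would do when it initialises the module for a stylesheet.

```python
class extensionModule:
    """Hypothetical stand-in for the base class named in the thread."""

    def __init__(self):
        self.URI = None
        self.functions = {}   # (URI, XSLT-visible name) -> bound method

    def registerModuleExtensionFunction(self, function, name):
        # Plain function call taking a function; no decorator syntax needed.
        self.functions[(self.URI, name)] = function

    def styleInit(self, stylesheet, URI):
        raise NotImplementedError


class moduleInstance(extensionModule):
    def myFunction(self, text):
        # Illustrative body only; the thread elides the real arguments.
        return text.upper()

    def styleInit(self, stylesheet, URI):
        self.URI = URI
        self.registerModuleExtensionFunction(
            self.myFunction, name="my-function-name")


# Simulate the engine initialising the module for one stylesheet.
instance = moduleInstance()
instance.styleInit(stylesheet=None, URI="http://some/uri")
```

Because a bound method is registered, the function keeps access to the instance, which is what would let per-transformation user data live on the module object.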
;-) Marc-Antoine

Hey, I just checked in a first stab at XPath extension functions based on your code. It is a first stab to figure out your code, and to try to simplify. Repeat: a first stab -- we likely can refactor this more and I haven't taken the XSLT extension API into account yet -- this only works for the XPathEvaluator now. It seems to work, but I haven't really done extensive testing yet. I think I did something reasonably smart with Python exception handling. :)
Marc-Antoine Parent wrote:
I've done a very very simple implementation of this, completely untested. There's an Extension factory defined now, and this turns a Python module into an Extension object (which happens to be a dictionary; I started building an Extension class but couldn't see the point yet, so I went back to dictionaries for the time being).
a) It enforces one extension module per file, which might annoy some people. But that I see as very, very minor.
Agreed. In the current codebase you can come up with other ways to construct Extensions which do something else.
b) You forget extension elements: We need two dictionaries.
I forget these as I don't understand them yet. I'm sure you'll help me understand them eventually. :)
Having them be names allows me not to have to repeat 'module.' all the time. :)
It's not registered at all. Explicit is better than implicit, and we only register those functions explicitly mentioned in the dictionary.
True, you can easily go around this API, which I've done in the few tests I created today.
So question: What does the module gain us over a class instance?
I don't even know whether it buys us much above a dictionary right now, so I don't feel qualified in answering that question. :)
True. Um...my best answer right now would be: "Don't do that then"? :) [snip]
I hope you realize that decorator _syntax_ was meant to be optional;
[snip]
The fact that xmlExtensionFunction would allow pie-syntax was pure sugar.
I understand, I'm just not ready yet to use decorators, and it's easy enough to build a decorator based approach on top of what I've built so far after all. I just want to think about the problem without being distracted by sugar first. :)
The code's been checked in and is ready for your feedback. I've strived for simplicity to start out with. We can always add complexity later as needed. :) Regards, Martijn

participants (2): Marc-Antoine Parent, Martijn Faassen