Re: [lxml-dev] Re: SimpleXMLWriter vs. lxml performance
Paul Everitt wrote:
On Dec 21, 2005, at 1:10 PM, Stefan Behnel wrote:
If you call the xslt processor from lxml (at least in the current scoder2 branch), it will use extension functions just as in XPath itself. It uses the same infrastructure behind the scenes.
Hmmm, wait, this might be all that I need.
In XSLT, I don't need the dynamically-pulled-in-nodes in the original tree. I just need to grab them and apply rules, generating new stuff for the output tree.
The only mild issue, under-the-hood: will the XSLT processor care if it suddenly starts dealing with new nodes, presumably in their own ElementTree/document?
It shouldn't, as XSLT has the document() function which can also do this.
Would it also be possible, in XSLT, to have arguments? For example, at the top of an XSLT:
<xsl:variable name="model" select="f:query('some xquery statement here')"/>
And finally...do you know if EXSLT extensions are available in lxml?
Only if they are available in libxslt by default, but I'm not sure.

Anyway, I don't know if it's a good idea to use XSLT here, since there may be too many 'ifs' and 'maybes'. But using element classes, you could do something like this (untested):

---------------------------
class MyDataFiller(ElementBase):
    def _init(self):
        child = self[0]
        if child.tag == 'sqlquery':
            query = child.text
            del self[0]  # remove sqlquery element to prevent running this twice
            # run SQL query, generate result child nodes for it

Namespace('ns')['mydata'] = MyDataFiller

xml = XML("""
<myroot xmlns='ns'>
  <mydata>
    <sqlquery>SELECT from ...</sqlquery>
  </mydata>
</myroot>
""")

data = xml[0]  # will call _init()
for data_child in data:
    # do something with data
---------------------------

I said there was no way to have a constructor, but there actually is an _init() method that can be overridden and that will be called after instance initialization (which is a bit after instance creation, so this is different from __init__). But you have to ensure that it is either harmless to call it multiple times or that it does something that is reflected in the underlying XML, so that it is only executed once (like I do above).

Stefan
Stefan Behnel wrote:
Paul Everitt wrote:
On Dec 21, 2005, at 1:10 PM, Stefan Behnel wrote:
If you call the xslt processor from lxml (at least in the current scoder2 branch), it will use extension functions just as in XPath itself. It uses the same infrastructure behind the scenes.
Hmmm, wait, this might be all that I need.
In XSLT, I don't need the dynamically-pulled-in-nodes in the original tree. I just need to grab them and apply rules, generating new stuff for the output tree.
The only mild issue, under-the-hood: will the XSLT processor care if it suddenly starts dealing with new nodes, presumably in their own ElementTree/document?
It shouldn't, as XSLT has the document() function which can also do this.
Would it also be possible, in XSLT, to have arguments? For example, at the top of an XSLT:
<xsl:variable name="model" select="f:query('some xquery statement here')"/>
And finally...do you know if EXSLT extensions are available in lxml?
Only if they are available in libxslt by default, but I'm not sure.
Anyway, I don't know if it's a good idea to use XSLT here, since there may be too many 'ifs' and 'maybes'.
Ahh, too bad. I was hoping to avoid an approach that only worked with lxml. But, if that's the way it is, that's the way it is. Out of curiosity, is it not a good idea based on the current state, or the long-term plans as well? Essentially, I'm looking for a way to bring new nodes into a document. Similar to how XInclude does, but under programmatic control, and with the ability to do arguments. I'd prefer to keep the integration point in a declarative document-oriented style, instead of a script-oriented style. But it sounds like I might not be able to get there from here and I have to take what I can get. :^)
But using element classes, you could do something like this (untested):
---------------------------
class MyDataFiller(ElementBase):
    def _init(self):
        child = self[0]
        if child.tag == 'sqlquery':
            query = child.text
            del self[0]  # remove sqlquery element to prevent running this twice
            # run SQL query, generate result child nodes for it

Namespace('ns')['mydata'] = MyDataFiller

xml = XML("""
<myroot xmlns='ns'>
  <mydata>
    <sqlquery>SELECT from ...</sqlquery>
  </mydata>
</myroot>
""")

data = xml[0]  # will call _init()
for data_child in data:
    # do something with data
---------------------------
I said there was no way to have a constructor, but there actually is an _init() method that can be overridden and that will be called after instance initialization (which is a bit after instance creation, so this is different from __init__). But you have to ensure that it is either harmless to call it multiple times or that it does something that is reflected in the underlying XML, so that it is only executed once (like I do above).
In the example above, is the _init called when the document is parsed, or when the element is traversed? Stated differently, if I want to evaluate the query and get the new nodes, do I have to write some script to grab each node and "evaluate" it? --Paul
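For reference, the kind of script Paul describes (grab each node and "evaluate" it) could look roughly like the sketch below. It uses only the plain ElementTree API from the standard library, so it is not tied to lxml. The names run_query and expand_queries, and the <row> result element, are hypothetical and not part of the thread or of any library; run_query merely stands in for a real query backend.

```python
import xml.etree.ElementTree as ET

NS = "ns"  # namespace used in the example document from the thread


def run_query(statement):
    """Hypothetical stand-in for a real query backend."""
    return ["row for: " + statement]


def expand_queries(root):
    """Replace each <sqlquery> child of a <mydata> element with result rows."""
    # Take a snapshot of the matches so the tree can be modified safely.
    for holder in list(root.iter("{%s}mydata" % NS)):
        query_el = holder.find("{%s}sqlquery" % NS)
        if query_el is None:
            continue  # already expanded, nothing to do
        holder.remove(query_el)
        for value in run_query(query_el.text):
            row = ET.SubElement(holder, "{%s}row" % NS)  # hypothetical element name
            row.text = value


root = ET.fromstring(
    "<myroot xmlns='ns'><mydata><sqlquery>SELECT from ...</sqlquery></mydata></myroot>"
)
expand_queries(root)
expand_queries(root)  # a second pass changes nothing
```

Removing the <sqlquery> element is what makes the pass safe to repeat, which is the same idea as in the _init() example quoted above.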
Paul Everitt wrote:
Stefan Behnel wrote:
Anyway, I don't know if it's a good idea to use XSLT here, since there may be too many 'ifs' and 'maybes'.
Ahh, too bad. I was hoping to avoid an approach that only worked with lxml. But, if that's the way it is, that's the way it is. Out of curiosity, is it not a good idea based on the current state, or the long-term plans as well?
As I said, if you want to test it, feel free. That way, we know what works and what may have to be fixed. It may enable us to decide if it's worth implementing.
Essentially, I'm looking for a way to bring new nodes into a document. Similar to how XInclude does, but under programmatic control, and with the ability to do arguments.
XSLT supports arguments in functions, just like XPath does.
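As a rough illustration of passing an argument to an extension function from a stylesheet, here is a minimal sketch written against the current lxml API (etree.FunctionNamespace and etree.XSLT), which may differ from the scoder2 branch discussed in this thread. The namespace URI and the query function are assumptions made for the example; the function returns a plain string rather than nodes to keep it simple, and the variable is declared inside the template rather than at the top level.

```python
from lxml import etree

FUNC_NS = "http://example.org/functions"  # hypothetical namespace URI


def query(context, statement):
    """Hypothetical extension function; a real one would run the query."""
    return "result of: " + statement


# Register the function so XPath and XSLT can call it as f:query(...).
functions = etree.FunctionNamespace(FUNC_NS)
functions["query"] = query

stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="http://example.org/functions"
    exclude-result-prefixes="f">
  <xsl:template match="/">
    <xsl:variable name="model" select="f:query('some query statement')"/>
    <out><xsl:value-of select="$model"/></out>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(stylesheet)
result = transform(etree.XML("<doc/>"))
```

The first parameter of the Python function receives an evaluation context object; the remaining parameters are the arguments given in the XPath call.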
I'd prefer to keep the integration point in a declarative document-oriented style, instead of a script-oriented style. But it sounds like I might not be able to get there from here and I have to take what I can get. :^)
But using element classes, you could do something like this (untested):
---------------------------
class MyDataFiller(ElementBase):
    def _init(self):
        child = self[0]
        if child.tag == 'sqlquery':
            query = child.text
            del self[0]  # remove sqlquery element to prevent running this twice
            # run SQL query, generate result child nodes for it

Namespace('ns')['mydata'] = MyDataFiller

xml = XML("""
<myroot xmlns='ns'>
  <mydata>
    <sqlquery>SELECT from ...</sqlquery>
  </mydata>
</myroot>
""")

data = xml[0]  # will call _init()
for data_child in data:
    # do something with data
---------------------------
In the example above, is the _init called when the document is parsed, or when the element is traversed?
I updated the documentation regarding this point, but for a short answer:
Stated differently, if I want to evaluate the query and get the new nodes, do I have to write some script to grab each node and "evaluate" it?
Yes. This is only done when a Python object is instantiated, which means that you may have to do something like this:

    tree.xpath('//XPath expression to find all complex elements')

to instantiate them first.

Stefan
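Put together, the approach could be sketched as follows with the current lxml API, where element classes are registered through etree.ElementNamespaceClassLookup rather than the Namespace() call used in the untested example above. The namespace URI, the run_query helper and the <row> result element are hypothetical. Note that tags in a namespaced document compare as '{uri}name', so the check here uses the qualified name.

```python
from lxml import etree

NS = "http://example.org/ns"  # hypothetical namespace URI


def run_query(statement):
    """Hypothetical stand-in for a real query backend."""
    return ["row for: " + statement]


class MyDataFiller(etree.ElementBase):
    def _init(self):
        # _init() may run more than once for the same XML node, so only
        # act while the <sqlquery> child is still present in the tree.
        query_el = self.find("{%s}sqlquery" % NS)
        if query_el is None:
            return
        self.remove(query_el)
        for value in run_query(query_el.text):
            etree.SubElement(self, "{%s}row" % NS).text = value


# Map <mydata> elements in NS to the custom class for this parser.
lookup = etree.ElementNamespaceClassLookup()
lookup.get_namespace(NS)["mydata"] = MyDataFiller
parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

root = etree.XML(
    "<myroot xmlns='%s'>"
    "<mydata><sqlquery>SELECT from ...</sqlquery></mydata>"
    "<mydata><sqlquery>SELECT other ...</sqlquery></mydata>"
    "</myroot>" % NS,
    parser,
)

# Nothing has been expanded yet; the XPath call instantiates every
# matching element, which triggers _init() on each of them.
filled = root.xpath("//n:mydata", namespaces={"n": NS})
rows = [row.text for element in filled for row in element]
```

The XPath query plays the role of the "evaluate everything" pass: parsing alone does not create the Python objects, so no query runs until the elements are actually accessed.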
participants (2): Paul Everitt, Stefan Behnel