[lxml-dev] An intriguing behaviour of xpath in lxml
Hi!

Just a question. Assume the following code:

    from lxml import etree
    from StringIO import StringIO

    xmlText = "<test>This is a test</test>"
    doc = etree.parse(StringIO(xmlText))
    root = doc.xpath("/")

The last line throws the following exception:

    Not yet implemented result node type: 9
    Traceback (most recent call last):
      File "C:\Archivos de programa\ActiveState Komodo 3.5\lib\support\dbgp\pythonlib\dbgp\client.py", line 1843, in runMain
        self.dbg.runfile(debug_args[0], debug_args)
      File "C:\Archivos de programa\ActiveState Komodo 3.5\lib\support\dbgp\pythonlib\dbgp\client.py", line 1538, in runfile
        h_execfile(file, args, module=main, tracer=self)
      File "C:\Archivos de programa\ActiveState Komodo 3.5\lib\support\dbgp\pythonlib\dbgp\client.py", line 596, in __init__
        execfile(file, globals, locals)
      File "C:\dev\projects\python\python-xpath-example.py", line 7, in __main__
        root = doc.xpath("/")
      File "c:\python24\lib\site-packages\lxml\etree.pyx", line 485, in etree._ElementTree.xpath
      File "c:\python24\lib\site-packages\lxml\xpath.pxi", line 75, in etree._XPathEvaluatorBase.evaluate
      File "c:\python24\lib\site-packages\lxml\xpath.pxi", line 212, in etree.XPathDocumentEvaluator.__call__
      File "c:\python24\lib\site-packages\lxml\xpath.pxi", line 108, in etree._XPathEvaluatorBase._handle_result
      File "c:\python24\lib\site-packages\lxml\extensions.pxi", line 269, in etree._unwrapXPathObject
      File "c:\python24\lib\site-packages\lxml\extensions.pxi", line 317, in etree._createNodeSetResult
    NotImplementedError

My question is: what is the reason behind this behaviour (if there is one)? (I already know that xpath(".") in the document node works, but is beyond my understanding why xpath("/") is not implemented.)

Cheers
Agustin
Hi Agustin,
Agustín Villena wrote:
I already know that xpath(".") in the document node works, but is beyond my understanding why xpath("/") is not implemented.
Well, what would you expect it to return? The XPath spec says:

""" / selects the document root (which is always the parent of the document element) """

The document element is returned by "/*", so it's the root element of the document in ElementTree. The "document root" itself is not available in the tree model provided by lxml.

It /could/ be a possibility to deliberately diverge from the spec here and return the root element instead.

So, maybe you can enlighten us with your use case, so that we can decide what implementation would fit here.

Stefan
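To make the "/*" point concrete, here is a minimal sketch. It assumes Python 3 and a current lxml release (the thread itself used Python 2.4 and `StringIO`), so `io.BytesIO` stands in for the original input wrapper. It only shows that "/*" yields the document element, the same object `getroot()` returns; it does not exercise "/" itself, whose result depends on the lxml version.

```python
from io import BytesIO

from lxml import etree

xml_text = b"<test>This is a test</test>"
doc = etree.parse(BytesIO(xml_text))

# "/*" selects the document element, i.e. the single child of the document root.
elements = doc.xpath("/*")

# In the ElementTree model, the same element is reachable via getroot().
root = doc.getroot()

print(len(elements))        # 1
print(elements[0].tag)      # test
print(elements[0] is root)  # True
```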
Stefan Behnel wrote:
Hi Agustin,
Agustín Villena wrote:
I already know that xpath(".") in the document node works, but is beyond my understanding why xpath("/") is not implemented.
Well, what would you expect it to return? The XPath spec says:
""" / selects the document root (which is always the parent of the document element) """
The document element is returned by "/*", so it's the root element of the document in ElementTree. The "document root" itself is not available in the tree model provided by lxml.
It /could/ be a possibility to deliberately diverge from the spec here and return the root element instead.
What about returning a root ElementTree? Then again, that is not the parent of the document element at present in our tree model, right? Or is it? Changing the getparent() behavior will have consequences we need to consider carefully.
So, maybe you can enlighten us with your use case, so that we can decide what implementation would fit here.
Yes, that would indeed be helpful. Regards, Martijn
Hi Martijn,
Martijn Faassen wrote:
Stefan Behnel wrote:
Hi Agustin,
Agustín Villena wrote:
I already know that xpath(".") in the document node works, but is beyond my understanding why xpath("/") is not implemented.
Well, what would you expect it to return? The XPath spec says:
""" / selects the document root (which is always the parent of the document element) """
The document element is returned by "/*", so it's the root element of the document in ElementTree. The "document root" itself is not available in the tree model provided by lxml.
It /could/ be a possibility to deliberately diverge from the spec here and return the root element instead.
What about returning a root ElementTree?
Then that would be the only special case that returns an ElementTree from an XPath expression, although there is currently no way to get an ElementTree passed /into/ an XPath expression. And XPath extension functions would have to start caring about this, too.
Then again, that is not the parent of the document element at present in our tree model, right? Or is it?
No. ElementTrees and Elements are different things that serve different purposes.
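The distinction can be checked directly. The following is a small sketch, assuming Python 3 and a current lxml release: the parsed document is an `_ElementTree` wrapper, its root is an `_Element`, and the root element has no parent in lxml's tree model.

```python
from io import BytesIO

from lxml import etree

doc = etree.parse(BytesIO(b"<test>This is a test</test>"))
root = doc.getroot()

# The tree wrapper and the element are instances of different classes.
print(type(doc).__name__)
print(type(root).__name__)

# The root element has no parent: the ElementTree is not part of the
# parent/child chain that getparent() walks.
print(root.getparent())  # None

# Going from an element back to its tree is a separate call.
tree_again = root.getroottree()
print(tree_again.getroot() is root)
```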
Changing the getparent() behavior will have consequences we need to consider carefully.
I dislike the idea of having different (incompatible) return values only to match a single special case. If we say we return an Element from a function, having a special case that can return an ElementTree is far from intuitive and pretty error prone.

So, depending on the use case, we may consider

a) leaving it as is
b) raising a different exception to make the problem more understandable
c) returning None to avoid the exception (not really a good idea, but would match the behaviour of the getparent() function)
d) returning a node set with the root element (thus diverging from the spec)

Stefan
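For readers who want the behaviour of option (d) in their own code without any change to lxml, a user-side wrapper is one possibility. The sketch below assumes Python 3 and a current lxml release; the helper name `xpath_with_root_fallback` is hypothetical and not part of lxml. It deliberately diverges from the XPath spec in the same way option (d) would.

```python
from io import BytesIO

from lxml import etree


def xpath_with_root_fallback(tree, expression):
    """Hypothetical helper, not an lxml API.

    Treats the bare "/" expression as a request for the root element
    (option d in the list above) and passes everything else through
    to the regular xpath() method unchanged.
    """
    if expression.strip() == "/":
        return [tree.getroot()]
    return tree.xpath(expression)


doc = etree.parse(BytesIO(b"<test>This is a test</test>"))

result = xpath_with_root_fallback(doc, "/")
print(result[0].tag)  # test

# Any other expression behaves exactly like doc.xpath().
texts = xpath_with_root_fallback(doc, "/test/text()")
print(texts[0])       # This is a test
```

The special case lives in the caller's code, so the return type of `xpath()` itself stays as it is.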
participants (3)
- Agustín Villena
- Martijn Faassen
- Stefan Behnel