[lxml-dev] xpath going crazy
Hi, Consider the following XML document: http://pastebin.ca/1520331 This is an ODF presentation produced by OpenOffice.org, assumed to be a valid XML document. Now I type this:
from lxml import etree
t = etree.parse('content.xml')
ns = {'draw': "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"}
t.xpath('//draw:frame', namespaces=ns)
[<Element {urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}frame at 7f604469ee68>, <Element {urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}frame at 7f604469eec0>]
There are indeed two frames in the document.
t.xpath('//draw:frame[0]', namespaces=ns)
[]
The position counting starts at 1 in XPath so this is expected.
t.xpath('//draw:frame[1]', namespaces=ns)
[<Element {urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}frame at 7f604469eec0>, <Element {urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}frame at 7f604469ee68>]
I get the two elements at once.
t.xpath('//draw:frame[2]', namespaces=ns)
[]
I can't get the second element. The same thing happens when asking the root instead of the tree. I know my XPath knowledge is limited, but I don't think I'm making any wrong assumption.
print "lxml.etree: ", etree.LXML_VERSION
lxml.etree: (2, 2, 2, 0)
print "libxml used: ", etree.LIBXML_VERSION
libxml used: (2, 7, 3)
print "libxml compiled: ", etree.LIBXML_COMPILED_VERSION
libxml compiled: (2, 7, 3)
print "libxslt used: ", etree.LIBXSLT_VERSION
libxslt used: (1, 1, 24)
print "libxslt compiled: ", etree.LIBXSLT_COMPILED_VERSION
libxslt compiled: (1, 1, 24)
Thanks for any insight,
Hervé
On 06.08.2009, at 12:32, Hervé Cauwelier wrote:
Hi,
Consider the following XML document: http://pastebin.ca/1520331
This is an ODF presentation produced by OpenOffice.org, assumed to be a valid XML document.
[...]
The position counting starts at 1 in XPath so this is expected.
t.xpath('//draw:frame[1]', namespaces=ns)
[<Element {urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}frame at 7f604469eec0>, <Element {urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}frame at 7f604469ee68>]
I get the two elements at once.
t.xpath('//draw:frame[2]', namespaces=ns)
[]
I can't get the second element.
You are asking for all draw:frame elements that are the first draw:frame child of their parent: //draw:frame[1] only omits those draw:frame elements that have another draw:frame among their preceding siblings.
If you look at the XPath results in a tool such as Oxygen, this is easy to see.
(//draw:frame)[1]
should do what you want (only the first of all //draw:frame in the document).
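To make the difference concrete, here is a minimal sketch using lxml. The tiny document below is a hypothetical stand-in, not the pastebin file: it assumes each draw:frame sits alone under its own draw:page, which would be consistent with the results reported above ([1] returning both frames, [2] returning none). The helper `names` and the draw:name values "a" and "b" are illustrative only.

```python
from lxml import etree

DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"

# Hypothetical miniature document (NOT the original content.xml):
# two pages, each holding exactly one draw:frame.
xml = (
    '<doc xmlns:draw="%s">'
    '<draw:page><draw:frame draw:name="a"/></draw:page>'
    '<draw:page><draw:frame draw:name="b"/></draw:page>'
    '</doc>' % DRAW
)
root = etree.fromstring(xml)
ns = {"draw": DRAW}


def names(expr):
    """Return the draw:name of every element matched by expr."""
    return [e.get("{%s}name" % DRAW) for e in root.xpath(expr, namespaces=ns)]


# The predicate binds to the step: position is counted per parent.
per_parent_first = names("//draw:frame[1]")    # ['a', 'b']
per_parent_second = names("//draw:frame[2]")   # []

# Parentheses build the whole node-set first, then index into it.
doc_first = names("(//draw:frame)[1]")         # ['a']
doc_second = names("(//draw:frame)[2]")        # ['b']

print(per_parent_first, per_parent_second, doc_first, doc_second)
```

With the parenthesised form, (//draw:frame)[2] is the way to reach the second frame in document order.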
Jens Quade wrote:
You are asking for all draw:frame elements that are the first draw:frame child of their parent: //draw:frame[1] only omits those draw:frame elements that have another draw:frame among their preceding siblings.
If you look at the XPath results in a tool such as Oxygen, this is easy to see.
(//draw:frame)[1]
should do what you want. (only the first of all //draw:frame in the document)
Thanks for the quick reply. I fixed my expressions. Hervé
participants (2)
- Hervé Cauwelier
- Jens Quade