Re: [lxml-dev] Some XPath questions...
Mike Meyer wrote:
Without CC'ing, people won't know you've already answered my questions.
Here's a small example:
So when I use // it works. Huh. I prefer descendant-or-self, because I find it peculiar to do a search from the root when you've called the method on some particular element (that may not be at the root).
Cool, that seems to work right. One query I'm realizing might be really hard (maybe too hard in XPath) is *:first-of-type, *:last-of-type, and *:only-of-type, since they match in a funny sort of way. You can't really do:
    *[count(../*[name() = name()]) = 1]
But it's kind of what *:only-of-type means. Or:
    *[count(following-sibling::name()) = 0 and count(preceding-sibling::name()) = 0]
You just can't use name() that way. Hmm... well, it's not that important of a query to me, I guess, so maybe I'll just catch it and give an error.
--
Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers
Hi Ian,
if this is supposed to go into lxml.html (or maybe something like lxml.css), please don't call your function "xpath()". That's the XPath evaluation method in etree. Consider calling it "build_xpath()", "css_to_xpath()" or something, depending on the context you provide it in.
Ian Bicking wrote:
And without CC'ing the list, the mail won't get archived, people won't be able to find the discussion later and will keep asking the same questions over and over again. :) Oh, and: people won't even be able to comment on what you (Mike) propose as a solution and you won't be able to learn anything either, in case there's a better solution.
There's also ".//*".
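For anyone following along, here is a minimal sketch of how these spellings behave when xpath() is called on an element that is not the root. The sample document and the variable names are made up for illustration.

```python
from lxml import etree

# Hypothetical sample document, for illustration only.
root = etree.XML("<root><section><p/><p/></section><aside><p/></aside></root>")
section = root[0]

# "//*" is an absolute path: it starts at the document root even though
# xpath() was called on <section>.
absolute = [el.tag for el in section.xpath("//*")]

# ".//*" is relative: only the descendants of <section>.
relative = [el.tag for el in section.xpath(".//*")]

# "descendant-or-self::*" is also relative, and includes <section> itself.
with_self = [el.tag for el in section.xpath("descendant-or-self::*")]

print(absolute)
print(relative)
print(with_self)
```

The difference between the last two is only whether the element the method was called on is part of the result.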
What about "e[not(*) and not(normalize-space())]" ?
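A quick way to see what this expression selects is to run it against a few hand-made cases. This is only a sketch; the sample elements and their id values are assumptions chosen for the demonstration.

```python
from lxml import etree

# Hypothetical test document: four <e> elements with different content.
root = etree.XML(
    "<r>"
    '<e id="empty"/>'
    '<e id="blank">   </e>'
    '<e id="text">hello</e>'
    '<e id="child"><x/></e>'
    "</r>"
)

# not(*) rejects elements with element children; not(normalize-space())
# rejects elements whose text is anything other than whitespace.
matched = [
    el.get("id")
    for el in root.xpath("e[not(*) and not(normalize-space())]")
]
print(matched)
```

Note that an element containing only whitespace still matches, because normalize-space() reduces its text to the empty string.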
You need two expressions here, one to find the node and one to compare it to others (note that name() can also take an argument) - but those are really tricky, you're right. They may already touch the borders of what XPath can express.
But you can call "name()" with an argument - although not with a node-set (it will just work on the first entry and ignore the rest in that case).
Stefan
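To make the name() behaviour concrete, here is a small sketch, assuming a made-up three-element document. It shows name() with an argument, name() applied to a node-set, and why a single predicate along the lines of the *:only-of-type attempt selects nothing.

```python
from lxml import etree

# Hypothetical sample document: <b> is the only element of its type.
root = etree.XML("<r><a/><b/><a/></r>")

# name() with an argument: the name of the parent of the first <a>.
name_of_parent = root[0].xpath("name(..)")

# name() with a node-set: only the first node in document order is used.
name_of_set = root.xpath("name(*)")

# Inside the inner predicate both name() calls refer to the same (inner)
# context node, so the comparison is always true and count() is simply the
# number of sibling elements (3 here), never 1.
naive = root.xpath("*[count(../*[name() = name()]) = 1]")

print(name_of_parent, name_of_set, len(naive))
```

Even <b>, which really is the only element of its type here, is not selected, because XPath 1.0 offers no way to refer to the outer context node from inside the inner predicate.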
Stefan Behnel wrote:
That seems to be equivalent to //*, i.e., // goes directly to the root regardless of context.
Yes, that works too.
I could probably do it by adding a new function, I suppose; css:last-of-type() for instance. It's not that hard to do in Python, after all.
--
Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org | Write code, do good | http://topp.openplans.org/careers
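A rough sketch of what such an extension function could look like with lxml's FunctionNamespace follows. It is not the implementation that was eventually written: the namespace URI, the Python function name, and the sample document are all hypothetical, and the sibling walk is just one possible way to decide the match.

```python
from lxml import etree

# Hypothetical namespace URI for the custom CSS helper functions.
CSS_NS = "http://example.com/css-functions"

def last_of_type(context):
    """Return True if no later sibling has the same tag as the context node."""
    node = context.context_node
    sibling = node.getnext()
    while sibling is not None:
        if sibling.tag == node.tag:
            return False
        sibling = sibling.getnext()
    return True

# Register the Python function as css:last-of-type() for XPath evaluation.
functions = etree.FunctionNamespace(CSS_NS)
functions["last-of-type"] = last_of_type

# Hypothetical sample document, for illustration only.
root = etree.XML(
    '<r><a id="1"/><b id="2"/><a id="3"/><b id="4"/><c id="5"/></r>'
)
result = root.xpath("*[css:last-of-type()]", namespaces={"css": CSS_NS})
ids = [el.get("id") for el in result]
print(ids)
```

Doing the comparison in Python sidesteps the problem that the predicate cannot see the outer context node, at the cost of a Python call per candidate element.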
Participants (2): Ian Bicking, Stefan Behnel