
On Fri, Nov 24, 2017 at 12:42 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
> Chris Jerdonek wrote on 23.11.2017 at 14:55:
>> I have a seemingly simple lxml.etree use case, but the API doesn't seem to support it.
>> Say I have an Element "root" at the root of a tree, and say I have an element "element" inside the tree. Is there an efficient way to get the element **after** "element" (in document order), and matching given tags?
> Your use case isn't clear to me. How do you get at that element? Could you provide some more details?
Thanks. I'll provide more details.

Given a tree, I want to perform some processing on every subtree in the tree having certain properties. The properties are (1) that the root element of the subtree has one of a number of tags, say "a" and "b", and (2) that the root doesn't have any ancestors with those tags. So these can be thought of as the "maximal" subtrees with a root having tag "a" or "b".

It's straightforward to get the first subtree using iter():

    element = next(root.iter('a', 'b'))

However, getting the next subtree to process is not as straightforward. The approach I'm using is first to use lxml's API to get the element in the tree that follows the subtree in document order (if it exists). Call that element "next_element". This next element doesn't necessarily satisfy the property I'm looking for, so here is where my question comes in.

What I'm looking for is the first element in root.iter('a', 'b') that is equal to or after next_element. This is why it would be useful to be able to "start" iterating over root.iter('a', 'b') from an arbitrary starting element.

What I'm suggesting / asking for is a bit like adding a "start" argument to str.find() if it initially only supported searching from start index 0:

https://docs.python.org/3/library/stdtypes.html#str.find

The alternative solution I've come up with doesn't seem as efficient or elegant.

Thank you,
--Chris
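
For concreteness, here is a minimal sketch of the approach described above, written against the existing lxml API (iter(), getnext(), getparent()). The helper names following_element() and iter_maximal_subtrees() are hypothetical and are not part of lxml; the sketch only emulates a "start" position by re-running iter() on each element that follows a skipped subtree.

```python
from lxml import etree


def following_element(element, root):
    """Return the first node after element's whole subtree, or None.

    Hypothetical helper: climbs towards root until a next sibling exists.
    """
    node = element
    while node is not None and node is not root:
        sibling = node.getnext()
        if sibling is not None:
            return sibling
        node = node.getparent()
    return None


def iter_maximal_subtrees(root, *tags):
    """Yield elements with one of the given tags that have no such ancestor."""
    candidate = root
    while candidate is not None:
        # iter() includes the candidate itself and walks in document order.
        match = next(candidate.iter(*tags), None)
        if match is not None:
            yield match
            # Skip the matched subtree entirely.
            candidate = following_element(match, root)
        else:
            # Nothing in this subtree; continue with whatever follows it.
            candidate = following_element(candidate, root)


xml = (
    "<root><x><a id='1'><b id='2'/></a></x>"
    "<b id='3'/><y><z><a id='4'/></z></y></root>"
)
root = etree.fromstring(xml)
print([el.get("id") for el in iter_maximal_subtrees(root, "a", "b")])
# ['1', '3', '4']
```

The nested element with id '2' is skipped because it lies inside the subtree rooted at id '1'. A "start" argument on iter() would replace the repeated calls to candidate.iter(*tags) with a single pass over the tree.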