
Hi,
I'd like to extract the lowest level of the match. Is this possible with xpath()? Thanks.
Assuming you mean only selecting the inner <div>...
from io import StringIO
from lxml import etree
tree = etree.parse(StringIO('<div><div>abc</div></div>'))
r = tree.xpath('//div[starts-with(., "a")]')
... use text() instead of . to select on the element's own text content only. The reason you also get the outer <div> when using . lies in how an element's string-value is determined (https://www.w3.org/TR/xpath/#dt-string-value, emphasis mine): "The string-value of an element node is the concatenation of the string-values of all text node *descendants* of the element node in document order." text(), by contrast, selects only the text node *children*.

Holger
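
For illustration, here is a minimal sketch comparing the two predicates on the document from the question. The variable names by_string_value and by_text_child are made up for this example, and the counts in the comments assume the input contains no whitespace between the tags.

```python
from io import StringIO
from lxml import etree

tree = etree.parse(StringIO('<div><div>abc</div></div>'))

# "." is tested against each element's string-value, i.e. the text of
# all descendants, so the outer and the inner <div> both match.
by_string_value = tree.xpath('//div[starts-with(., "a")]')

# text() only sees text node children; the outer <div> has none here,
# so only the inner <div> matches.
by_text_child = tree.xpath('//div[starts-with(text(), "a")]')

print(len(by_string_value))   # 2
print(len(by_text_child))     # 1
print(by_text_child[0].text)  # abc
```

Note that in XPath 1.0, starts-with(text(), "a") only looks at the first text node child. If an element may have several text nodes (mixed content), //div[text()[starts-with(., "a")]] tests each of them instead.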