Hi,

I'd like to extract the lowest level of the match. Is this possible with xpath()? Thanks.

$ cat main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

from lxml import etree
import sys
from StringIO import StringIO

tree = etree.parse(StringIO('<div><div>abc</div></div>'))
r = tree.xpath('//div[starts-with(., "a")]')
for x in r:
  print etree.tostring(x)

$ ./main.py
<div><div>abc</div></div>
<div>abc</div>

--
Regards,
Peng
Hi,

> I'd like to extract the lowest level of the match. Is this possible with xpath()? Thanks.

Assuming you mean only selecting the inner <div>...

> tree = etree.parse(StringIO('<div><div>abc</div></div>'))
> r = tree.xpath('//div[starts-with(., "a")]')
... use text() instead of . so that the test applies only to each element's own text content, i.e. //div[starts-with(text(), "a")].

The reason you also get the outer <div> when using . lies in how an element's string-value is determined (https://www.w3.org/TR/xpath/#dt-string-value, emphasis mine):

"The string-value of an element node is the concatenation of the string-values of all text node *descendants* of the element node in document order."

text(), by contrast, selects only the text node *children*.

Holger
participants (2): Holger Joukl, Peng Yu