
Note that I have not tested any of these, but they should at least be close.
Let's say I have this XML file:
<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <genre><s>Computer</s></genre>
    <price><f>44.95</f></price>
    <publish_date><d>2000-10-01</d></publish_date>
    <description><s>An in-depth look at creating applications with XML.</s></description>
  </book>
</catalog>
How can I extract only the price of a book which has a genre of "computers"?
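    # //book                   find all book tags...
    # [genre/s = "Computer"]   ...that have a genre/s subtag whose text is "Computer"
    # /price/f                 ...and select the f subtag of their price
    # The comparison goes inside [...] so the query returns nodes rather than True/False.
    # It is case-sensitive, so it has to match the file's "Computer" exactly.
    prices = rootElem.xpath('//book[genre/s = "Computer"]/price/f')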
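To make the difference between a bare comparison and a predicate concrete, here is a minimal self-contained sketch. It assumes lxml is installed and parses the sample file from an inline bytes literal; the variable names (XML, root, price_nodes and so on) are only illustrative.

```python
from lxml import etree  # third-party package (assumed installed: pip install lxml)

# The sample document from the question, given as bytes so that the
# XML declaration is accepted by etree.fromstring().
XML = b"""<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <genre><s>Computer</s></genre>
    <price><f>44.95</f></price>
    <publish_date><d>2000-10-01</d></publish_date>
    <description><s>An in-depth look at creating applications with XML.</s></description>
  </book>
</catalog>"""

root = etree.fromstring(XML)

# A comparison written as the whole expression evaluates to a boolean,
# not to a list of nodes.
has_computer_book = root.xpath('//book/genre/s = "Computer"')

# Moving the comparison into a predicate selects the matching books;
# the trailing /price/f then steps down to the price element.
price_nodes = root.xpath('//book[genre/s = "Computer"]/price/f')
prices = [node.text for node in price_nodes]

# XPath string comparison is case-sensitive, so "computers" finds nothing
# in this file.
no_match = root.xpath('//book[genre/s = "computers"]/price/f')

print(has_computer_book, prices, no_match)
```

If the genre text in real data varies in case or plurality, the predicate would need to be adjusted to whatever the file actually contains.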
How can I extract the price and description of the book with id "bk101"?
For this, I would use separate queries to get the book nodes and the subnodes, though there may be a better way using XPath's "|" (union) operator. It might also be faster to use lxml's own element methods (find, for example) to get the subnodes you want than to make the second and third xpath calls:

    # Similar to the previous query with one addition
    # [@id=...]   where the id attribute is...
    books = rootElem.xpath('//book[@id="bk101"]')
    for book in books:
        print(book.xpath('./price/f')[0].text)
        print(book.xpath('./description/s')[0].text)

There is a great XPath tutorial and reference here. It should give you all the information you need:
http://www.w3schools.com/xpath/default.asp

Good luck,
--Brad