[Tutor] Python XML for newbie

Mon Jul 2 09:57:27 CEST 2012

Sean Carolan wrote:

>> Thank you, this is helpful.  Minidom is confusing, even the
>> documentation confirms this:
>> "The name of the functions are perhaps misleading...."
>>
>>> But I'd start with the etree tutorial (of which
>>> there are many variations on the web):
> 
> Ok, so I read through these tutorials and am at least able to print
> the XML output now.  I did this:
> 
> doc = etree.parse('computer_books.xml')
> 
> and then this:
> 
> for elem in doc.iter():
>     print elem.tag, elem.text
> 
> Here's the data I'm interested in:
> 
> index 1
> field 11
> value 9780596526740
> datum
> 
> How do you say, "If the field is 11, then print the next value"?  The
> raw XML looks like this:
> 
> <datum>
> <index>1</index>
> <field>11</field>
> <value>9780470286975</value>
> </datum>
> 
> Basically I just want to pull all these ISBN numbers from the file.

With lxml (http://lxml.de/) you can use XPath:

$ cat computer_books.xml 
<foo>
    <bar>
        <datum>
            <index>1</index>
            <field>11</field>
            <value>9780470286975</value>
        </datum>
    </bar>
</foo>
$ cat read_isbn.py
from lxml import etree

tree = etree.parse("computer_books.xml")
# Select the text of <value> in every <datum> whose <field> child equals 11.
print(tree.xpath("//datum[field=11]/value/text()"))
$ python read_isbn.py 
['9780470286975']
$
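
If installing lxml is not an option, roughly the same thing can be done
with the standard library's xml.etree.ElementTree. The following is only a
sketch: the helper name isbns() and the second <datum> record are made up
for illustration, and the XML is embedded as a string so the example runs
without computer_books.xml. It spells out "if the field is 11, take the
value" as a loop over the <datum> elements:

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the XML above; the second <datum> is a
# hypothetical record added to show that non-matching fields are skipped.
SAMPLE = """<foo>
    <bar>
        <datum>
            <index>1</index>
            <field>11</field>
            <value>9780470286975</value>
        </datum>
        <datum>
            <index>2</index>
            <field>12</field>
            <value>not an ISBN</value>
        </datum>
    </bar>
</foo>"""


def isbns(root, field="11"):
    """Return the <value> text of each <datum> whose <field> text equals field."""
    return [
        datum.findtext("value")
        for datum in root.iter("datum")        # every <datum>, at any depth
        if datum.findtext("field") == field    # plain string comparison
    ]


root = ET.fromstring(SAMPLE)
print(isbns(root))
```

For the real file you would replace ET.fromstring(SAMPLE) with
ET.parse("computer_books.xml").getroot(). Note that the comparison here is
on the exact text of <field>, so stray whitespace inside that element would
have to be stripped first.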