XML parsing with python

Stefan Behnel stefan_ml at behnel.de
Mon Aug 17 12:09:14 CEST 2009


inder wrote:
> I am new to xml . I need to parse the xml file . After reading and
> browsing on the web , I could get much help .
> 
> I guess SAX would be better suited for my requirement .

That's a common misconception.


> Could some juct provide me a sample python code so that I can execute
> it and see how the parsing actually happens .
> 
> Lets say my xml file -
> <?xml version="1.0"?>
> <library>
> <category code="SciFi" room="1"> <!--if you want to test invalid
> document against schema you can just cut the mandatory id attribute --
> <book id="ISBN001">
> <title>I,Robot</title>
> <pages>100</pages>
> <author>Isaac Asimov</author>
> </book>
> <book id="ISBN001" damaged="true">
> <title>Blade Runner</title>
> <pages>400</pages>
> <author>Philip K. Dick</author>
> </book>
> </category>
> <category code="Boring" room="2">
> <book id="ISBN003">
> <title>Lord Of The Rings</title>
> <pages>20000</pages>
> <author>Tolkien</author>
> </book>
> <book id="ISBN004" damaged="true">
> <title>XML-Schema Specification</title>
> <pages>5000</pages>
> <author>W3C</author>
> </book>
> </category>
> <category code="Fantasy">
> <book id="ISBN005" damaged="true">
> <title>Aladin</title>
> <pages>150</pages>
> <author>Don't know</author>
> </book>
> </category>
> </library>
> 
> 
> --------------------------
> 
> I need the output to be - (elements containing 'title' )
> 
> I,Robot
> Blade Runner
> Lord Of The Rings
> XML-Schema Specification
> Aladin

Use the iterparse() function of the xml.etree.ElementTree package.

http://effbot.org/zone/element-iterparse.htm
http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk

Stefan



More information about the Python-list mailing list