[Tutor] xml parsing without a root element

Peter Otten __peter__ at web.de
Tue Aug 30 20:20:46 CEST 2011


rail shafigulin wrote:

> hello everyone.
> 
> i need to parse an xml-like file. the problem that i'm facing is that
> this file doesn't have the root element but in all other terms it is the
> same as xml, i.e
> 
> <tag1>
> </tag1>
> 
> <tag2>
> </tag2>
> 
> <tag3/>
> 
> does anybody know if there is a module in python that allows to process an
> xml file without a root element? i tried ElementTree but it didn't work.

There may be more sophisticated ways, but I'd start with a simple idea: add 
a root element to your data and have ElementTree parse the result.

$ cat almost.xml
<a>foo</a>
<a>bar</a>
<a>baz</a>
$ cat xml_no_root.py
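# Python 2 script
from StringIO import StringIO
from xml.etree.ElementTree import ElementTree

filename = "almost.xml"
tree = ElementTree()
with open(filename, "rb") as f:
    data = f.read()
# Wrap the rootless data in an artificial <root> element so it is well-formed.
pseudo_file = StringIO("<root>%s</root>" % data)
tree.parse(pseudo_file)

# iter() is the preferred name for getiterator() from Python 2.7 on.
for link in tree.getiterator("a"):
    print link.text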
$ python xml_no_root.py
foo
bar
baz
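
For Python 3 the same idea might look roughly like the sketch below. The
helper name parse_without_root and the "root" tag are only illustrative
choices, and the sketch assumes the data does not start with an
<?xml ...?> declaration (that would end up inside the wrapper and make
the parser complain).

```python
import xml.etree.ElementTree as ET


def parse_without_root(text, root_tag="root"):
    """Wrap rootless XML-like text in an artificial root and parse it.

    root_tag is a made-up name; pick one that does not clash with
    the tags in your data.
    """
    wrapped = "<{0}>{1}</{0}>".format(root_tag, text)
    return ET.fromstring(wrapped)


if __name__ == "__main__":
    # Same content as almost.xml above, inlined to keep the sketch self-contained.
    sample = "<a>foo</a>\n<a>bar</a>\n<a>baz</a>\n"
    root = parse_without_root(sample)
    for element in root.iter("a"):
        print(element.text)
```

The returned element is the artificial root, so the original top-level
elements are its direct children.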

If the file is large you can read it in smaller chunks and feed them to the
parser one at a time. Have a look at the ElementTree.parse() source code to
see how to do that.
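
A minimal Python 3 sketch of the chunked approach, assuming a file object
opened in text mode: feed the opening root tag, then the file piece by
piece, then the closing tag. The function name parse_in_chunks, the "root"
tag and the chunk size are my own assumptions, not anything the library
prescribes. Note that this only avoids holding the raw text in memory; the
resulting tree is still built in full.

```python
import io
import xml.etree.ElementTree as ET


def parse_in_chunks(fileobj, root_tag="root", chunk_size=64 * 1024):
    """Parse a rootless XML-like text stream without reading it all at once.

    fileobj must yield str (text mode); root_tag and chunk_size are
    illustrative defaults.
    """
    parser = ET.XMLParser()
    parser.feed("<%s>" % root_tag)       # artificial opening tag
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:                    # empty string means end of file
            break
        parser.feed(chunk)
    parser.feed("</%s>" % root_tag)      # artificial closing tag
    return parser.close()                # returns the root element


if __name__ == "__main__":
    # StringIO stands in for open("almost.xml", encoding="utf-8").
    stream = io.StringIO("<a>foo</a>\n<a>bar</a>\n<a>baz</a>\n")
    root = parse_in_chunks(stream, chunk_size=4)
    for element in root.iter("a"):
        print(element.text)
```

The tiny chunk_size in the usage part is only there to show that tags
split across chunk boundaries are handled by the parser.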


