[Tutor] xml parsing without a root element
Peter Otten
__peter__ at web.de
Tue Aug 30 20:20:46 CEST 2011
rail shafigulin wrote:
> hello everyone.
>
> i need to parse an xml-like file. the problem that i'm facing is that
> this file doesn't have the root element but in all other terms it is the
> same as xml, i.e
>
> <tag1>
> </tag1>
>
> <tag2>
> </tag2>
>
> <tag3/>
>
> does anybody know if there is a module in python that allows to process an
> xml file without a root element? i tried ElementTree but it didn't work.
There may be more sophisticated ways, but I'd start with a simple idea: add
a root element to your data and have ElementTree parse the result.
$ cat almost.xml
<a>foo</a>
<a>bar</a>
<a>baz</a>
$ cat xml_no_root.py
from StringIO import StringIO
from xml.etree.ElementTree import ElementTree
filename = "almost.xml"
tree = ElementTree()
with open(filename, "rb") as f:
    data = f.read()
pseudo_file = StringIO("<root>%s</root>" % data)
tree.parse(pseudo_file)
for link in tree.getiterator("a"):
    print link.text
$ python xml_no_root.py
foo
bar
baz
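The script above is written for Python 2 (the StringIO module, the print statement, getiterator()). On Python 3 the same wrap-in-a-root idea might look roughly like the sketch below. parse_fragments is a made-up helper name, and the sample data is inlined here instead of being read from almost.xml. It assumes the data does not begin with an <?xml ...?> declaration, which would no longer be at the start of the document once wrapped.

```python
import xml.etree.ElementTree as ET


def parse_fragments(text, root_tag="root"):
    """Wrap rootless XML text in a synthetic root element and parse it.

    parse_fragments and root_tag are illustrative names, not part of
    ElementTree itself.
    """
    wrapped = "<%s>%s</%s>" % (root_tag, text, root_tag)
    return ET.fromstring(wrapped)


# Same content as almost.xml, inlined so the sketch is self-contained.
data = "<a>foo</a>\n<a>bar</a>\n<a>baz</a>\n"

root = parse_fragments(data)
# Element.iter() is the Python 3 replacement for getiterator().
texts = [link.text for link in root.iter("a")]
for text in texts:
    print(text)
```

With a real file, data would come from open(filename, encoding="utf-8").read(); the encoding is an assumption about your file.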
If the file is large you can read the file in smaller chunks. Have a look at
the ElementTree.parse() source code to see how to do that.
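A rough Python 3 sketch of the chunked approach: feed the opening tag of a synthetic root to an XMLParser, then the file contents piece by piece, then the closing tag. parse_without_root and chunk_size are names I made up for illustration, and an in-memory BytesIO stands in for the real file. Note that this only avoids holding a second copy of the raw text; the resulting tree is still built in memory.

```python
import io
import xml.etree.ElementTree as ET


def parse_without_root(fileobj, chunk_size=65536):
    """Parse rootless XML from a binary file object, feeding it in chunks.

    Hypothetical helper: the synthetic <root> element is supplied by
    separate feed() calls, so the input is never copied into one string.
    """
    parser = ET.XMLParser()
    parser.feed(b"<root>")
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
    parser.feed(b"</root>")
    # close() finishes parsing and returns the top-level element.
    return parser.close()


# Stand-in for open("almost.xml", "rb"); a tiny chunk size forces
# several feed() calls even on this small sample.
sample = io.BytesIO(b"<a>foo</a>\n<a>bar</a>\n<a>baz</a>\n")
chunked_root = parse_without_root(sample, chunk_size=4)
chunked_texts = [link.text for link in chunked_root.iter("a")]
print(chunked_texts)
```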