[Tutor] parsing xml as lines

richard kappler richkappler at gmail.com
Wed Nov 4 13:36:07 EST 2015


I have an xml file that gets written to as events occur. Each event writes
a new 'line' of xml to the file, in a specific format, e.g. something like
this:

<heresmydataline xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="Logging.xsd" version="1.0"><childofheresmydata/>
<anotherchildofheresmydata/><grandchild>somestuff</grandchild></heresmydataline>

and each 'line' has that same structure or format.

I've written a script that parses out the needed data and forwards it on
using regexes, but I think it might be better to use an xml parser. I can
parse out what I need if I have just one line in the file, but when
there are a number of lines, as there actually are, I can't figure out how to
get it to work.

In other words, with a one line file, this works fine and I understand it:

import xml.etree.cElementTree as ET
tree = ET.ElementTree(file='1lineTest.log')
grandchild = tree.find('grandchild')
print grandchild.tag, grandchild.text

and I get the output I desire:

grandchild Sally

But if I have several lines in the file and try to run a loop:

import xml.etree.cElementTree as ET
f1 = open('5lineTest.log', 'r')
lineList = f1.readlines()
Imax = len(lineList)

i = 0
while i <= Imax:
    tree = ET.ElementTree(lineList[i])
    grandchild = tree.find('grandchild')
    print grandchild.tag, grandchild.text
    i += 1

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
AttributeError: 'int' object has no attribute 'tag'

and yet I can do:

print lineList[0] and it will print out the first line.

I get why (I think), I just can't figure out a way around it.

Guidance please?
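
One possible way around this, as a minimal sketch rather than a definitive
answer: parse the text of each line with ET.fromstring() instead of passing
the string to ET.ElementTree(), whose first argument is expected to be an
element rather than XML text. The sketch assumes every line is a complete,
well-formed XML document with 'grandchild' as a direct child of the root.
The helper name grandchildren() and the sample lines are hypothetical, made
up for illustration only.

```python
import xml.etree.ElementTree as ET


def grandchildren(lines):
    """Yield (tag, text) for the grandchild element found on each line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline
        root = ET.fromstring(line)  # parse the XML text of this one line
        grandchild = root.find('grandchild')
        if grandchild is not None:  # find() returns None when nothing matches
            yield grandchild.tag, grandchild.text


# Hypothetical sample data standing in for the contents of the log file.
sample = [
    '<heresmydataline version="1.0"><grandchild>Sally</grandchild></heresmydataline>\n',
    '<heresmydataline version="1.0"><grandchild>Bob</grandchild></heresmydataline>\n',
]

results = list(grandchildren(sample))
for tag, text in results:
    print('%s %s' % (tag, text))
```

Since an open file can be iterated line by line, the same helper should work
on the real log by passing it the file object from open('5lineTest.log')
in place of the sample list, with no need for readlines() or a manual index.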

