how to use structured markup tools

Fredrik Lundh fredrik at
Sat Mar 19 11:47:05 CET 2005

Sean McIlroy wrote:

> I'm dealing with XML files in which there are lots of tags of the
> following form: <a><b>x</b><c>y</c></a> (all of these letters are being
> used as 'metalinguistic variables') Not all of the tags in the file are
> of that form, but that's the only type of tag I'm interested in. (For
> the insatiably curious, I'm talking about a conversation log from MSN
> Messenger.) What I need to do is to pull out all the x's and y's in a
> form I can use. In other words, from...
> .
> <a><b>x1</b><c>y1</c></a>
> .
> <a><b>x2</b><c>y2</c></a>
> .
> <a><b>x3</b><c>y3</c></a>
> .
> ...I would like to produce, for example,...
> [ (x1,y1), (x2,y2), (x3,y3) ]

how about:

from elementtree import ElementTree

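# sample input taken from the question; the root element name is arbitrary
TEXT = """\
<doc>
<a><b>x1</b><c>y1</c></a>
<a><b>x2</b><c>y2</c></a>
<a><b>x3</b><c>y3</c></a>
</doc>
"""
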
tree = ElementTree.XML(TEXT)

data = []

for elem in tree.findall(".//a"):
    data.append((elem.findtext("b"), elem.findtext("c")))

print data

=> [('x1', 'y1'), ('x2', 'y2'), ('x3', 'y3')]
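
For readers on Python 3, here is a rough sketch of the same approach using the standard library's xml.etree.ElementTree, which provides the same findall/findtext calls used above. The sample input, the <log> root element, the <other> element, and the extract_pairs helper name are assumptions made for illustration; they are not part of the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input modeled on the question. The <log> root and
# the <other> element are made up to show that unrelated tags are skipped.
TEXT = """\
<log>
<a><b>x1</b><c>y1</c></a>
<other>ignored</other>
<a><b>x2</b><c>y2</c></a>
<a><b>x3</b><c>y3</c></a>
</log>
"""


def extract_pairs(xml_text):
    """Return a list of (b_text, c_text) tuples, one per <a> element."""
    root = ET.fromstring(xml_text)
    # ".//a" matches every <a> element below the root, at any depth.
    return [(elem.findtext("b"), elem.findtext("c"))
            for elem in root.findall(".//a")]


pairs = extract_pairs(TEXT)
print(pairs)
```

To read from a file on disk instead of a string, ET.parse(filename).getroot() can stand in for ET.fromstring(xml_text). Note that findtext returns None when an <a> element has no matching child, so real logs may need a filter for incomplete entries.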

more here:

