how to use structured markup tools
Fredrik Lundh
fredrik at pythonware.com
Sat Mar 19 05:47:05 EST 2005
Sean McIlroy wrote:
> I'm dealing with XML files in which there are lots of tags of the
> following form: <a><b>x</b><c>y</c></a> (all of these letters are being
> used as 'metalinguistic variables') Not all of the tags in the file are
> of that form, but that's the only type of tag I'm interested in. (For
> the insatiably curious, I'm talking about a conversation log from MSN
> Messenger.) What I need to do is to pull out all the x's and y's in a
> form I can use. In other words, from...
> .
> <a><b>x1</b><c>y1</c></a>
> .
> <a><b>x2</b><c>y2</c></a>
> .
> <a><b>x3</b><c>y3</c></a>
> .
> ...I would like to produce, for example,...
>
> [ (x1,y1), (x2,y2), (x3,y3) ]
how about:
from elementtree import ElementTree
TEXT = """\
<doc>
<a><b>x1</b><c>y1</c></a>
<a><b>x2</b><c>y2</c></a>
<a><b>x3</b><c>y3</c></a>
</doc>
"""
tree = ElementTree.XML(TEXT)
data = []
for elem in tree.findall(".//a"):
    data.append((elem.findtext("b"), elem.findtext("c")))
print data
=> [('x1', 'y1'), ('x2', 'y2'), ('x3', 'y3')]
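on a current Python 3 interpreter, the same idea can be sketched with the standard library's xml.etree.ElementTree module instead of the separate elementtree package. this is a minimal sketch under that assumption, not part of the original reply; the helper name extract_pairs is made up for illustration:

```python
import xml.etree.ElementTree as ET

TEXT = """\
<doc>
<a><b>x1</b><c>y1</c></a>
<a><b>x2</b><c>y2</c></a>
<a><b>x3</b><c>y3</c></a>
</doc>
"""

def extract_pairs(xml_text):
    # hypothetical helper: parse the string and collect (b, c) text pairs
    # from every <a> element below the root
    root = ET.fromstring(xml_text)
    return [(a.findtext("b"), a.findtext("c")) for a in root.findall(".//a")]

pairs = extract_pairs(TEXT)
print(pairs)
```

findtext returns None when the child element is missing, so an <a> without a <c> yields a pair such as ('x1', None) rather than raising an error.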
more here:
http://effbot.org/zone/element-index.htm
</F>