[Tutor] xml
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Wed May 25 01:36:27 CEST 2005
On Tue, 24 May 2005, D. Hartley wrote:
> I looked at the page for ElementTree that Max sent out, but I can't
> understand what it's even talking about.
Hello Denise,
ElementTree is a third-party module by the Effbot for handling some of the
drudgery that is XML parsing:
http://effbot.org/zone/element.htm
It makes XML documents look like a bunch of nested lists. Let's work
through a small example with it; that may help clear up some confusion.
If we have something like a small HTML document:
######
>>> testtext = """
... <html><body>hello world. <i>foo!</i>
... </body></html>"""
######
then we can use ElementTree to get a data structure out of this string:
######
>>> from elementtree import ElementTree
>>> tree = ElementTree.fromstring(testtext)
######
'tree' here is our root node (the 'html' element), and it has a single
child, the 'body' element, which we can get at by just indexing it:
######
>>> len(tree)
1
>>> tree[0]
<Element body at 403c7a6c>
>>> tree[0].text
'hello world. '
######
The body has some text, as well as a child (that italicized node):
######
>>> tree[0][0]
<Element i at 403c79ec>
>>> tree[0][0].text
'foo!'
######
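
If you'd rather see the whole structure at once, here is a rough sketch
that walks the tree recursively. It assumes a Python where ElementTree ships
in the standard library as xml.etree.ElementTree (2.5 and later), which I'm
treating as interchangeable with the third-party package for this purpose.
The describe() helper is just a made-up name for illustration.

```python
# Assumption: ElementTree is available from the standard library.
import xml.etree.ElementTree as ET

testtext = """
<html><body>hello world. <i>foo!</i>
</body></html>"""

root = ET.fromstring(testtext)


def describe(node, depth=0):
    """Return one line per element: its tag, its text, and its child count."""
    lines = ["%s%s text=%r children=%d"
             % ("  " * depth, node.tag, node.text, len(node))]
    # Iterating over an element yields its direct children, like a list.
    for child in node:
        lines.extend(describe(child, depth + 1))
    return lines


outline = describe(root)
print("\n".join(outline))
```

The 'html' node has no text of its own (its .text is None), because the
'body' tag starts right after it.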
One reason this whole parsing thing is nice is that we can ask the
tree things like: "Give me all the italicized nodes, anywhere in the
document."
######
>>> for italicNode in tree.findall('.//i'):
...     print italicNode.text
...
foo!
######
No need to worry about regular expressions at all. *grin*
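
To make the "anywhere in the document" part a bit more concrete, here's a
small sketch with a made-up document that has italic nodes at two different
depths. It again assumes the standard-library xml.etree.ElementTree; the
sample markup is hypothetical and not from your data.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with <i> elements at two different depths.
doc = ET.fromstring(
    "<html><body>one <i>foo!</i> two <p>three <i>bar!</i></p></body></html>"
)

# './/i' searches the whole subtree below the starting node.
everywhere = [node.text for node in doc.findall(".//i")]

# A bare 'i' only looks at direct children of the starting node.
direct_children_of_body = [node.text for node in doc[0].findall("i")]

print(everywhere)
print(direct_children_of_body)
```

The first search finds both italic nodes; the second finds only the one
sitting directly under 'body'.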
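We can also start mutating the tree and add more things. For example,
let's add a "goodbye!" to the tail of the body. An element's 'tail' is the
text that comes right after its closing tag, so the new text lands just
after </body>, not inside it.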
######
>>> tree[0].tail
>>> tree[0].tail = "goodbye!"
>>>
>>> ElementTree.tostring(tree)
'<html><body>hello world. <i>foo!</i>\n</body>goodbye!</html>'
######
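
If the goal were to keep the new text inside the body instead, one way
(a sketch, assuming the standard-library xml.etree.ElementTree on Python 3)
is to set the tail of the last child rather than the tail of the body
itself:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<html><body>hello world. <i>foo!</i></body></html>")
body = root[0]
italic = body[0]

# .tail is the text that follows an element's closing tag, so setting
# italic.tail puts the text after </i> but still inside <body>.
italic.tail = " goodbye!"

# encoding="unicode" asks Python 3 for a str instead of bytes.
result = ET.tostring(root, encoding="unicode")
print(result)
```

So .text is what comes before an element's first child, and .tail is what
comes after the element itself.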
Does this make sense?
> Looking through the python modules it seems like I need xmlrpclib - I
> created a serverproxy instance, which I want to use to talk to a server
Out of curiosity, which server?
xmlrpclib is customized to talk to servers that speak the 'xmlrpc'
protocol:
http://www.xmlrpc.com/
so it might or might not be appropriate to use it, depending on what
you're trying to connect to.
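
To get a feel for what 'xmlrpc' actually sends over the wire, here is a
small sketch that never touches a network. It assumes Python 3, where
xmlrpclib was renamed to xmlrpc.client; the method name "add" and its
arguments are invented for illustration and say nothing about the server
you're talking to.

```python
import xmlrpc.client

# Hypothetical method name and arguments, for illustration only.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")
print(request_xml)

# loads() reverses the encoding and returns (params, methodname).
params, method = xmlrpc.client.loads(request_xml)
print(params, method)
```

A ServerProxy builds this kind of XML payload for you on every method call,
which is why it only makes sense against a server that expects it.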