[Tutor] A somewhat easier way to parse XML

Wed Jan 19 18:17:48 CET 2005

David Rock wrote:
> * Max Noel <maxnoel_fr at yahoo.fr> [2005-01-19 11:48]:
> 
>>On Jan 19, 2005, at 03:58, David Rock wrote:
>>
>>
>>>For me, it seems that the way you are supposed to interact with an XML
>>>DOM is to already know what you are looking for, and in theory, you
>>>_should_ know ;-)
>>
>>	Indeed. But even if I know what I'm looking for, the problem  
>>remains: given the following document,
>>
>><foo>
>>	<bar>baz</bar>
>></foo>
>>
>>	If I want to get "baz", the command is (assuming a DOM object has 
>>	been  created):
>>
>>doc.documentElement.getElementsByTagName("bar")[0].childNodes[0].nodeValue
>>
>>	Quoting from memory there, it may not be entirely correct. However,  
>>the command has more characters than the document itself. Somehow I  
>>feel it'd be a bit more elegant to use:
>>
>>doc["bar"]
>>
>>(or depending on the implementation, doc["foo"]["bar"])
>>
>>	Don't you think?
> 
> 
> Absolutely. That is exactly what I was hoping for, too. ElementTree
> comes close, but even that can be a bit unwieldy because of the
> multi-dimensional array you end up with. Still, if you know the data,
> 
> doc[0][0] is a lot easier than doc.documentElement...nodeValue

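Use the XPath support in ElementTree (it implements a subset of XPath). Paths are relative to the 
element you search from, so starting at the root <foo> it is something like
doc.findtext('bar')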
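
As a rough sketch of the difference, here is the DOM call from above next to the ElementTree 
lookup. I'm assuming the xml.etree.ElementTree and xml.dom.minidom modules from the Python 
standard library; the XML string and the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

XML = "<foo><bar>baz</bar></foo>"

# DOM route: walk from the document element down to the text node.
dom = minidom.parseString(XML)
dom_value = (
    dom.documentElement.getElementsByTagName("bar")[0].childNodes[0].nodeValue
)

# ElementTree route: fromstring() returns the root element (<foo>),
# and the path is relative to that element.
root = ET.fromstring(XML)
et_value = root.findtext("bar")

# Searching for 'foo/bar' from the root looks for a <foo> child of <foo>,
# which this document does not have, so find() returns None.
missing = root.find("foo/bar")

print(dom_value, et_value, missing)
```

Both routes return the same text; the ElementTree path just has to start below the element 
you call it on.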

If I understand correctly Amara allows something like
doc.foo.bar
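
To show what that style of access amounts to, here is a small sketch built on top of 
ElementTree. This is NOT Amara's API; Document and Node are hypothetical names I made up for 
the illustration, and the wrapper only handles the first matching child element.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper: child elements are reachable as attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails.
        # Refuse private names so a missing _element cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        return Node(child)

    def __str__(self):
        return self._element.text or ""


class Document:
    """Hypothetical wrapper exposing the root element by its tag name."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)

    def __getattr__(self, name):
        if name.startswith("_") or name != self._root.tag:
            raise AttributeError(name)
        return Node(self._root)


doc = Document("<foo><bar>baz</bar></foo>")
value = str(doc.foo.bar)
print(value)
```

A real data-binding library has to deal with repeated elements, attributes, namespaces and 
tag names that are not valid Python identifiers, none of which this sketch attempts.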

I'll try to find the time to write up a full example using ElementTree, Amara and dom4j. Meanwhile 
see http://www.oreillynet.com/pub/wlg/6225 and http://www.oreillynet.com/pub/wlg/6239

Kent
