[Tutor] A somewhat easier way to parse XML

Kent Johnson kent37 at tds.net
Wed Jan 19 06:11:50 CET 2005


Max Noel wrote:
> Hi everyone,
> 
>     I've just spent the last few hours learning how to use the DOM XML 
> API (to be more precise, the one that's in PyXML), instead of revising 
> for my exams :p. My conclusion so far: it sucks (and so does SAX because 
> I can't see a way to use it for OOP or "recursive" XML trees).
>     I'm certain it can be used to do extremely powerful stuff, but as 
> far as usability is concerned, it's ridiculously verbose and naming is 
> inconsistent. I've had a look at Java DOM as well, and it's apparently 
> the same.

I share your opinion that DOM is a pita. It's the same in Java because it is a 'language-neutral' 
spec - i.e. it sucks equally in every language :-)

For Python, take a look at ElementTree; it is way easier to use. Amara looks interesting too.
http://effbot.org/zone/element-index.htm
http://uche.ogbuji.net/uche.ogbuji.net/tech/4Suite/amara/
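
To give a flavor of it, here is a minimal sketch that reads the character file quoted further down with ElementTree. It assumes the module is importable as xml.etree.ElementTree (its home in the standard library since Python 2.5; the standalone package uses a different import path). The names XML and ratings are only there for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document from the quoted message, as bytes so the
# encoding declaration is honoured by the parser.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<character>
    <attribute key="BOD">
        <name>Body</name>
        <rating>6</rating>
    </attribute>
    <attribute key="QCK">
        <name>Quickness</name>
        <rating>9</rating>
    </attribute>
</character>
"""

root = ET.fromstring(XML)

# Map each attribute's key to its numeric rating.
ratings = {}
for attr in root.findall("attribute"):
    ratings[attr.get("key")] = int(attr.findtext("rating"))

print(ratings)  # {'BOD': 6, 'QCK': 9}
```

Elements behave roughly like lists of their children, with get() for XML attributes and find()/findall()/findtext() for lookups by tag, so there is very little of the DOM ceremony.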

For Java, try dom4j. http://www.dom4j.org

Many people have tried to make more Pythonic XML libraries, so you might want to look around before you
write your own.

Kent

> 
>     This afternoon, I read a bit about YAML and its basic philosophy 
> that everything can be represented as a mix of lists, dictionaries and 
> scalars. Now, setting aside the fact that one look at YAML made me want 
> to ditch XML for data storage purposes completely (which I can't do 
> since there's no Java YAML parser that I know of so far), it came to my 
> mind once again that this is the one thing I want to be able to do in 
> XML. Chances are that's all that 9 out of 10 programmers want to do with 
> XML.
>     In fact, I find it appalling that none of the "standard" XML parsers 
> (DOM, SAX) provides an easy way to do that (yeah, I know that's 
> more or less what the shelve module does, but I want a 
> language-independent way).
> 
>     So, to wrap my head around DOM, I set out to write a little script 
> that does just that. Introducing xmldict.py and the DataNode class.
>     For example, given the following XML file:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> 
> <character>
>     <attribute key="BOD">
>         <name>Body</name>
>         <rating>6</rating>
>     </attribute>
>     <attribute key="QCK">
>         <name>Quickness</name>
>         <rating>9</rating>
>     </attribute>
> </character>
> 
> 
>     ...the DataNode class (yeah, I think I may have implemented that in 
> a slightly bizarre fashion) will produce the following dictionary:
> 
> {u'attribute': [{u'@key': u'BOD', u'name': u'Body', u'rating': u'6'}, 
> {u'@key': u'QCK', u'name': u'Quickness', u'rating': u'9'}]}
> 
>     As you can see, everything is represented in a mix of dictionaries, 
> lists and unicode strings, and can now be used by a normal human being 
> to write a program that uses this data.
>     Comments, criticism, improvements, suggestions, [whatever]... would 
> be appreciated. Feel free to use it if you wish.
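
For what it's worth, the same kind of mapping can be sketched on top of ElementTree in a few lines. This is not xmldict.py or the DataNode class, just a rough illustration under some assumptions: the function name to_data is made up, xml.etree.ElementTree is assumed to be available, mixed content is ignored, and the text of an element that also carries attributes is dropped.

```python
import xml.etree.ElementTree as ET

def to_data(elem):
    """Turn an element into a str (plain leaf) or a dict (anything else).

    XML attributes become '@name' keys; a tag that repeats under the
    same parent becomes a list. Hypothetical helper, not a library API.
    """
    children = list(elem)
    if not children and not elem.attrib:
        return (elem.text or "").strip()
    data = {"@" + name: value for name, value in elem.attrib.items()}
    for child in children:
        value = to_data(child)
        if child.tag in data:
            existing = data[child.tag]
            if not isinstance(existing, list):
                # Second occurrence of this tag: promote to a list.
                data[child.tag] = existing = [existing]
            existing.append(value)
        else:
            data[child.tag] = value
    return data

XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<character>
    <attribute key="BOD">
        <name>Body</name>
        <rating>6</rating>
    </attribute>
    <attribute key="QCK">
        <name>Quickness</name>
        <rating>9</rating>
    </attribute>
</character>
"""

character = to_data(ET.fromstring(XML))
print(character["attribute"][1]["name"])  # Quickness
```

One wrinkle with this approach: a tag that happens to occur only once comes back as a single value rather than a one-element list, so code that consumes the result has to cope with both shapes.
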
> 
>     Thanks for your attention.
> 
> -- Max
> maxnoel_fr at yahoo dot fr -- ICQ #85274019
> "Look at you hacker... A pathetic creature of meat and bone, panting and 
> sweating as you run through my corridors... How can you challenge a 
> perfect, immortal machine?"
