[Tutor] A somewhat easier way to parse XML
Kent Johnson
kent37 at tds.net
Wed Jan 19 06:11:50 CET 2005
Max Noel wrote:
> Hi everyone,
>
> I've just spent the last few hours learning how to use the DOM XML
> API (to be more precise, the one that's in PyXML), instead of revising
> for my exams :p. My conclusion so far: it sucks (and so does SAX because
> I can't see a way to use it for OOP or "recursive" XML trees).
> I'm certain it can be used to do extremely powerful stuff, but as
> far as usability is concerned, it's ridiculously verbose and naming is
> inconsistent. I've had a look at Java DOM as well, and it's apparently
> the same.
I share your opinion that DOM is a PITA. It's the same in Java because DOM is a 'language-neutral'
spec - i.e. it sucks equally in every language :-)
For Python, take a look at ElementTree; it is much easier to use. Amara looks interesting too.
http://effbot.org/zone/element-index.htm
http://uche.ogbuji.net/uche.ogbuji.net/tech/4Suite/amara/
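
To give a feel for ElementTree, here is a rough sketch that reads the character file from your message. It assumes a Python where ElementTree is available as xml.etree.ElementTree (with the standalone download the import would be from the elementtree package instead). The XML is pasted in as a string, and the name "ratings" is just something I made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the original message (XML declaration omitted).
SAMPLE = """<character>
  <attribute key="BOD">
    <name>Body</name>
    <rating>6</rating>
  </attribute>
  <attribute key="QCK">
    <name>Quickness</name>
    <rating>9</rating>
  </attribute>
</character>"""

root = ET.fromstring(SAMPLE)

# Map each attribute's key to a (name, rating) pair.
ratings = {}
for attr in root.findall("attribute"):
    key = attr.get("key")                    # XML attribute lookup
    name = attr.findtext("name")             # text of the <name> child
    rating = int(attr.findtext("rating"))    # text of <rating>, as an int
    ratings[key] = (name, rating)

print(ratings)
```

Elements act like lists of their children, attributes are fetched with get(), and findtext() returns the text of a named child, so there is very little ceremony compared to DOM.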
For Java, try dom4j. http://www.dom4j.org
Many people have tried to make more Pythonic XML libraries, so you might want to look around before
you write your own.
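
For what it's worth, the lists/dictionaries/scalars mapping you describe can be sketched in a few lines on top of ElementTree. This is not your DataNode class, just a minimal illustration under some assumptions of mine: the function name to_data is hypothetical, attributes get an "@" prefix as in your output, a tag that repeats becomes a list, and text inside elements that also have attributes or children is ignored.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<character>
  <attribute key="BOD">
    <name>Body</name>
    <rating>6</rating>
  </attribute>
  <attribute key="QCK">
    <name>Quickness</name>
    <rating>9</rating>
  </attribute>
</character>"""


def to_data(elem):
    """Convert an element to a string (leaf) or a dict (anything else).

    Hypothetical helper, not part of ElementTree.
    """
    children = list(elem)
    if not children and not elem.attrib:
        # Leaf element: just its text.
        return (elem.text or "").strip()

    # XML attributes are stored under '@name' keys.
    data = {"@" + name: value for name, value in elem.attrib.items()}
    for child in children:
        value = to_data(child)
        if child.tag in data:
            # Second occurrence of a tag: promote the entry to a list.
            if not isinstance(data[child.tag], list):
                data[child.tag] = [data[child.tag]]
            data[child.tag].append(value)
        else:
            data[child.tag] = value
    return data


result = to_data(ET.fromstring(SAMPLE))
print(result)
```

One wart of this approach: a tag that happens to occur only once comes back as a plain value rather than a one-item list, so code that consumes the result has to handle both shapes.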
Kent
>
> This afternoon, I read a bit about YAML and its basic philosophy
> that everything can be represented as a mix of lists, dictionaries and
> scalars. Now, setting aside the fact that one look at YAML made me want
> to ditch XML for data storage purposes completely (which I can't do
> since there's no Java YAML parser that I know of so far), it came to my
> mind once again that this is the one thing I want to be able to do in
> XML. Chances are that's all that 9 out of 10 programmers want to do with
> XML.
> In fact, I find it appalling that none of the "standard" XML parsers
> (DOM, SAX) provides an easy way to do that (yeah, I know that's
> more or less what the shelve module does, but I want a
> language-independent way).
>
> So, to wrap my head around DOM, I set out to write a little script
> that does just that. Introducing xmldict.py and the DataNode class.
> For example, given the following XML file:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <character>
>     <attribute key="BOD">
>         <name>Body</name>
>         <rating>6</rating>
>     </attribute>
>     <attribute key="QCK">
>         <name>Quickness</name>
>         <rating>9</rating>
>     </attribute>
> </character>
>
>
> ...the DataNode class (yeah, I think I may have implemented that in
> a slightly bizarre fashion) will produce the following dictionary:
>
> {u'attribute': [{u'@key': u'BOD', u'name': u'Body', u'rating': u'6'},
>                 {u'@key': u'QCK', u'name': u'Quickness', u'rating': u'9'}]}
>
> As you can see, everything is represented in a mix of dictionaries,
> lists and unicode strings, and can now be used by a normal human being
> to write a program that uses this data.
> Comments, criticism, improvements, suggestions, [whatever]... Would
> be appreciated. Feel free to use it if you wish.
>
> Thanks for your attention.
>
> -- Max
> maxnoel_fr at yahoo dot fr -- ICQ #85274019
> "Look at you hacker... A pathetic creature of meat and bone, panting and
> sweating as you run through my corridors... How can you challenge a
> perfect, immortal machine?"