XML

Stuart Bishop zen at shangri-la.dropbear.id.au
Sun Jun 29 06:23:11 CEST 2003


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Tuesday, June 24, 2003, at 02:49  AM, Roman Suzi wrote:

> ------------------
> foo = "123"
> bar = "456"
> zoo = "la\"lala"
> ------------------
>
> And it's not very hard to parse that.
> In case of XML I will need something like
>
> <?xml version="1.0"?>
> <foo>123</foo><bar>456</bar><zoo>la"lala</zoo>

> - not a big deal, but it's harder to parse. And also XML software keeps
> changing (or so it seems), and this gives a sense of instability.

Are you just assuming this? The following works happily under
both Python 2.1.3 and Python 2.2.3, and probably every version
since minidom appeared.

import xml.dom.minidom
import pprint

my_xml = '''
<settings>
     <set name="foo">123</set>
     <set name="bar">456</set>
     <set name="zoo">la"lala</set>
     <set name="baz">Rene&#xe9; or Rene\xc3\xa9</set>
     <set name="quux">"""oops="moo"</set>
</settings>
''' # UTF-8 by default; for other encodings, start the document with
# an <?xml version="1.0" encoding="foo"?> declaration

settings = {}
d = xml.dom.minidom.parseString(my_xml)
for set_node in d.getElementsByTagName('set'):
     name = set_node.getAttribute('name')
     value = [t.wholeText for t in set_node.childNodes]
     settings[name] = value
pprint.pprint(settings)

> XML always gives me a feeling that I do not fully master it
> (especially it's DTD part)! And this is after two years of trying
> to understand it. (Cf: with Python felt at home after a week or two!)

XML gets pretty hairy, generally when you are trying to *do* something
hairy. If you can program Python, you probably have no need for XSLT.
DTDs and schemas are only needed if you need to validate your data.
The advantage is that these tools and many more are available if you
*do* need them.

XML can also be dead easy if you don't get carried away.
Using a better interface than the DOM (ElementTree or pyRXP)
makes it even easier, since you end up with nice pythonic
lists'n'stuff to deal with instead of having to look up method
names like getElementsByTagName or wholeText.
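
As a rough illustration of that point, here is a minimal sketch of the same settings lookup using ElementTree. It assumes a Python where ElementTree ships in the standard library as xml.etree.ElementTree (2.5 and later, so newer than the versions mentioned above); the function name load_settings and the trimmed sample document are made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the minidom example.
MY_XML = '''
<settings>
    <set name="foo">123</set>
    <set name="bar">456</set>
    <set name="zoo">la"lala</set>
    <set name="baz">Rene&#xe9;</set>
</settings>
'''

def load_settings(xml_text):
    """Return a dict mapping each <set> element's name attribute to its text."""
    root = ET.fromstring(xml_text)
    settings = {}
    for node in root.findall('set'):
        # node.text is None for an empty element, so fall back to ''.
        settings[node.get('name')] = node.text or ''
    return settings

settings = load_settings(MY_XML)
print(settings)
```

Each value comes back as a plain string rather than a list of text nodes, which is most of what "pythonic" means here.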

> P.S. Just look at the neighbor thread:
> Subject: minidom toxml() not emitting attribute namespace qualifier

I notice that your example code isn't emitting attribute namespace
qualifiers either :-) One more XML feature that you probably never
have to worry about, if you are able to use as simplistic a format as
you describe above. I notice elsewhere in this thread that this bug has
also been fixed by someone already, whereas if there is a bug in your
parsing code *you* would have had to fix it.
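
For comparison, a hand-rolled parser for the name = "value" format quoted at the top might look roughly like the sketch below. It is only one guess at what such code would be: the function name parse_simple, the regular expression, and the decision to treat a backslash as escaping whatever character follows it are all assumptions, since the quoted message does not spell out the escaping rules.

```python
import re

# One setting per line: name = "value". Inside the quotes, a backslash
# escapes the character after it (an assumed rule, covering \" and \\).
LINE_RE = re.compile(r'^\s*(\w+)\s*=\s*"((?:[^"\\]|\\.)*)"\s*$')

def parse_simple(text):
    """Parse name = "value" lines into a dict, raising ValueError on bad lines."""
    settings = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # skip blank lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError('line %d: cannot parse %r' % (lineno, line))
        name, raw = match.groups()
        # Drop the backslash from each escape pair, keeping the escaped character.
        settings[name] = re.sub(r'\\(.)', r'\1', raw)
    return settings

SAMPLE = r'''
foo = "123"
bar = "456"
zoo = "la\"lala"
'''

simple = parse_simple(SAMPLE)
print(simple)
```

Even at this size there are decisions to make about comments, encodings, multi-line values and error reporting, and every one of them is yours to maintain.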

- -- 
Stuart Bishop <zen at shangri-la.dropbear.id.au>
http://shangri-la.dropbear.id.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (Darwin)

iD8DBQE+/mm0h8iUz1x5geARArAZAJ0R/2hXXSLH+2KlohhlrWAEWK+MzACgny5v
Go5NZMMvzWjAADkeb3kr6Cg=
=Lmhu
-----END PGP SIGNATURE-----
