[Tutor] program that processes tokenized words in xml
pan@uchicago.edu
Tue May 6 19:34:01 2003
Quoting Alan Gauld <alan.gauld@blueyonder.co.uk>:
> I haven't checked but how does it handle recursive definitions?
> Like this, say:
>
> <person>
> <name>Jon</name>
> <son><person>
> <name>Fred</name>
> <son>None</son>
> </person>
> </son>
> </person>
>
> That's usually where regex based parsing of XML falls flat.
>
> Alan g.
>
Try these:
>>> from panXmlParser import c_panXmlParser
>>> data = '''<person>
... <name>Jon</name>
... <son><person>
... <name>Fred</name>
... <son>None</son>
... </person>
... </son>
... </person>
... '''
>>> person = c_panXmlParser(data)
>>> person.son[0].person[0].name
['Fred']
>>> person.son[0].person[0].son
['None']
And these:
>>> data = '''<person>
... <name>Jon</name>
... <son>
... <person>
... <name>Fred</name>
... <son>many</son>
... </person>
... <person>
... <name>Pan</name>
... <son>NotYet</son>
... </person>
... </son>
... </person>
... '''
...
>>> person = c_panXmlParser(data)
>>> person.son[0].person[0].name
['Fred']
>>> person.son[0].person[0].son
['many']
>>> person.son[0].person[1].name
['Pan']
>>> person.son[0].person[1].son
['NotYet']
>>> [x.name for x in person.son[0].person]
[['Fred'], ['Pan']]
>>> [x.son for x in person.son[0].person]
[['many'], ['NotYet']]
pan