xml.sax module documentation
Martin von Loewis
loewis at informatik.hu-berlin.de
Mon Nov 13 09:26:34 EST 2000
"S. Hendry" <shendry at usa.capgemini.com> writes:
> Actually I should be more specific. I may be confused what a parser is.
A parser is an algorithm that splits a text into pieces (called
tokens) and combines these pieces according to a grammar. It
eventually decides whether the input text is correct according to the
grammar (i.e. whether it accepts the text); in the process, it tells the
application which parts of the text it has seen.
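To make that concrete, here is a toy sketch of the idea. It is not an XML parser: the grammar is one I made up for illustration (tags without attributes that must nest properly), and the names tokenize, accepts and report are hypothetical, not part of any xml module.

```python
import re

# One token is either a tag like <a> or </a>, or a run of text without "<".
TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")

def tokenize(text):
    """Split text into ('open', name), ('close', name) and ('text', s) tokens."""
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        slash, name, chars = m.groups()
        if chars is not None:
            yield ("text", chars)
        else:
            yield ("close" if slash else "open", name)
        pos = m.end()

def accepts(text, report=print):
    """Return True if every opened tag is closed again in the right order.

    Each token is passed to report() as it is seen, which is the
    "tells the application" part.
    """
    stack = []
    try:
        for kind, value in tokenize(text):
            report(kind, value)
            if kind == "open":
                stack.append(value)
            elif kind == "close":
                if not stack or stack.pop() != value:
                    return False
    except ValueError:
        return False
    return not stack

print(accepts("<a><b>hi</b></a>"))
```

The stack stands in for the grammar here; a real XML parser checks far more than nesting.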
Somebody already explained how a SAX parser works; the things it
reports are start tags, end tags, and the characters in between.
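As a minimal sketch of what those reports look like, assuming a current Python 3: the handler below just records every event the parser delivers. The sample document and the EventLogger class are made up for illustration; ContentHandler and parseString come from xml.sax.

```python
import xml.sax

# Hypothetical sample document, not the one from the original question.
SAMPLE = b'<PO><Approver>John Smith</Approver><Qty unit="kg">10</Qty></PO>'

class EventLogger(xml.sax.ContentHandler):
    """Records each event the SAX parser reports, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # attrs holds the XML attributes of the start tag.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # The parser may deliver one run of text in several chunks.
        self.events.append(("chars", content))

handler = EventLogger()
xml.sax.parseString(SAMPLE, handler)
for event in handler.events:
    print(event)
```

Note that the parser only reports events; keeping track of which element you are currently inside is left to the handler.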
> if PO.Approver = "John Smith":
> special_discount = 10
> ...
>
> Am I far off?
Somewhat off, yes. An XML processor can't work that way. For example,
there may be multiple Approver elements in a PO element; the API you
propose couldn't tell the various instances apart. Likewise, it is not
clear how such an API would take attributes into account, e.g.
<Qty unit="kg">10</Qty>
Since you'd use Python attributes already for the subelements, it'd be
hard to merge the XML attributes into that.
> If not, then how do I use the xml modules to do what I intend to do?
I suggest building a DOM tree, using xml.dom.minidom.parse (or
parseString, if the document is already in a string).
When I parse your example into a variable d, I can do
>>> s = """your document"""
>>> from xml.dom.minidom import parseString
>>> d = parseString(s)
>>> d
<xml.dom.minidom.Document instance at 2afa9c>
>>> d.getElementsByTagName("Approver")
[<DOM Element: Approver at 2832780>]
>>> d.getElementsByTagName("Approver")[0].firstChild
<DOM Text node "John Smith">
>>> d.getElementsByTagName("Approver")[0].firstChild.data
u'John Smith'
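Putting this together for the discount check from the question, here is a rough sketch assuming a current Python 3. The purchase order is a made-up sample with two Approver elements and a Qty attribute, and the discount rule is only my guess at what was intended.

```python
from xml.dom.minidom import parseString

# Hypothetical purchase order; the real document may look different.
PO_XML = """<PO>
  <Approver>John Smith</Approver>
  <Approver>Jane Doe</Approver>
  <Qty unit="kg">10</Qty>
</PO>"""

d = parseString(PO_XML)

# getElementsByTagName returns every match, so repeated elements are no problem.
approvers = [node.firstChild.data for node in d.getElementsByTagName("Approver")]

# XML attributes are read with getAttribute, separately from the element text.
qty = d.getElementsByTagName("Qty")[0]
unit = qty.getAttribute("unit")
amount = int(qty.firstChild.data)

# Assumed rule: the discount applies if John Smith is among the approvers.
special_discount = 10 if "John Smith" in approvers else 0

print(approvers, amount, unit, special_discount)
```

This assumes each Approver and Qty element contains exactly one text node; an empty element would have firstChild set to None.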
Regards,
Martin