xml.sax module documentation

Steve Holden sholden at holdenweb.com
Fri Nov 10 10:54:22 EST 2000


"S. Hendry" wrote:
> 
> >I'm new to programming, but have successfully created a few working python
> >codes.  Now I need to process XML files as an input to my program (my data
> >is in xml format).  However, reading the Python documentation and/or source
> >codes on xml.sax (or any xml module) gives me headache.  It is as-if they
> >were written so newbies cannot understand them (I'm sure it's not
> >intentional).  Could anyone point me to good articles / docs that can help
> >me understand how to use the xml.sax modules please.  Working examples would
> >be a Godsend, if you don't mind sharing.
> >
> >I use ActiveState's Python 2.0 (win95).  Many many thanks in advance.
> 
> Actually I should be more specific.  I may be confused what a parser is.
> Maybe if I understand this, it will help me read the docs and available
> articles better.
> 
> Here's what I thought.  I have a data file, for example:
> ....
> <PO>
>   <POnum>123</POnum>
>   <Approver>John Smith</Approver>
>   <Items>
>     <Num>1</Num>
>       <SKU>34A</SKU>
>       <Qty>10</Qty>
>     <Num>2</Num>
> ... etc, etc ...
> 
> I was under the impression that I can use a parser to process the data file,
> so I can check, for example:
> ...
> if PO.Approver == "John Smith":
>    special_discount = 10
> ...
> 
> Am I far off?  If not, then how do I use the xml modules to do what I intend
> to do?  Thanks in advance.
> 
Well, SAX on its own won't encourage you to do this.  There are a
couple of schools of thought about the "best" way to process XML.
SAX leans towards the event-driven approach, where the Parser interacts
with a DocumentHandler, calling its methods when certain events occur.

A document handler is therefore a bit like a stream-of-consciousness
handler for the Parser, which sort of mumbles to itself by means of
DocumentHandler method calls.  E.g.:

"This is the start of a document"
"Here's the start of an 'A' element with attributes 'HREF' (whose
	value is '...') and 'OnMouseOver' (whose value is '...')"
"Here's some character data, '...'"
"Here's some more character data, '...'"
"Here's a processing instruction whose target and data are '...' and
	'...'"
"Here's the end of the 'A' element"
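
To make the "mumbling" concrete, here is a minimal sketch of a handler
that simply records each event as it arrives.  It assumes the xml.sax
shipped with current Python, where the handler base class is called
ContentHandler rather than DocumentHandler.  The names EventLogger and
log_events are made up for illustration.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler: records every parser event as a tuple."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def startElement(self, name, attrs):
        # attrs behaves like a read-only mapping of attribute names to values
        self.events.append(("startElement", name, dict(attrs.items())))

    def characters(self, content):
        # The parser may deliver one run of text in several pieces
        self.events.append(("characters", content))

    def processingInstruction(self, target, data):
        self.events.append(("processingInstruction", target, data))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def endDocument(self):
        self.events.append(("endDocument",))


def log_events(xml_text):
    """Parse a string and return the list of events the parser reported."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events
```

Calling log_events() on a small document and printing the result shows
the calls in the order the parser made them, which is usually the
quickest way to see what a handler will be asked to deal with.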

The advantage of this approach is that you can process large documents
without the Parser having to build a complete parse tree.  Of course,
the disadvantage is you end up writing special-purpose processing rather
than using XSLT and XPath to transform the XML into the form you want.
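
For the purchase-order check in the question, the special-purpose
processing might look something like the sketch below.  It assumes
current Python's xml.sax (ContentHandler again), and it only looks at
the Approver element shown in the sample; ApproverHandler and
discount_for are hypothetical names, and the discount values are taken
from the pseudocode in the question.

```python
import xml.sax


class ApproverHandler(xml.sax.ContentHandler):
    """Hypothetical handler: remembers the text inside <Approver>."""

    def __init__(self):
        super().__init__()
        self.approver = None
        self._in_approver = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "Approver":
            self._in_approver = True
            self._chunks = []

    def characters(self, content):
        # Text can arrive in several pieces, so collect and join later
        if self._in_approver:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "Approver":
            self.approver = "".join(self._chunks).strip()
            self._in_approver = False


def discount_for(xml_text):
    """Return 10 if the PO was approved by John Smith, otherwise 0."""
    handler = ApproverHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return 10 if handler.approver == "John Smith" else 0
```

Note that the handler has to keep its own state (the _in_approver flag)
to know where it is in the document.  That bookkeeping is the price of
not having a tree to navigate.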

Hope this helps.  I found the SAX documentation a little opaque the first
time I read it (and the second, and the third...) but it does make sense
in the end.

regards
 Steve
> - Slamet

-- 
Helping people meet their information needs with training and technology.
703 967 0887      sholden at bellatlantic.net      http://www.holdenweb.com/




