[XML-SIG] xmllib and easysax

Paul Prescod paul@prescod.net
Tue, 25 May 1999 08:23:08 -0500


Fred asked me to outline an easysax-compliant xmllib extension. Well,
xmllib has only one export: xmllib.XMLParser. So my idea is that all we
need to do is either build on that or deprecate it.

Option 1: Build on it:

We can extend xmllib.XMLParser (and sgmlop!) to be SAX-compliant parsers
just by adding setDocumentHandler, etc. When XMLParser.parse() is called,
it would behave just like a SAX parser.

If setDocumentHandler is never called then the default document handler
would be an undocumented helper class that would redirect the events BACK
to the xmllib.XMLParser (because xmllib.XMLParser plays the roles of both
parser and event handler).
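A rough sketch of how Option 1 could hang together. Everything here is a stand-in: this XMLParser is a toy class, not the real xmllib one, expat does the tokenizing purely for illustration, the helper name _RedirectToParser is made up, and the handler methods use a simplified characters(data) signature. The old-style callback names (unknown_starttag, unknown_endtag, handle_data) are modelled on xmllib's.

```python
import xml.parsers.expat


class _RedirectToParser:
    """Default document handler: sends SAX events back to the parser itself."""

    def __init__(self, parser):
        self._parser = parser

    def startElement(self, name, attrs):
        self._parser.unknown_starttag(name, attrs)

    def endElement(self, name):
        self._parser.unknown_endtag(name)

    def characters(self, data):
        self._parser.handle_data(data)


class XMLParser:
    """Toy stand-in for xmllib.XMLParser (assumption, not the real class)."""

    def __init__(self):
        # Until setDocumentHandler is called, events loop back to self.
        self._handler = _RedirectToParser(self)

    def setDocumentHandler(self, handler):
        self._handler = handler

    def parse(self, data):
        # Events always go to the current handler, never directly to self.
        expat = xml.parsers.expat.ParserCreate()
        expat.StartElementHandler = lambda n, a: self._handler.startElement(n, a)
        expat.EndElementHandler = lambda n: self._handler.endElement(n)
        expat.CharacterDataHandler = lambda d: self._handler.characters(d)
        expat.Parse(data, True)

    # Old-style callbacks that legacy subclasses override.
    def unknown_starttag(self, tag, attributes):
        pass

    def unknown_endtag(self, tag):
        pass

    def handle_data(self, data):
        pass
```

Existing subclasses that override the old callbacks keep working untouched, while new code calls setDocumentHandler and never sees them.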

All of the non-SAX methods of xmllib.XMLParser would be deprecated.

Option 2: Deprecate it:

Maybe it is better to deprecate all of xmllib.XMLParser instead of
deprecating individual methods. If we deprecated it we would replace it
with an xmllib.Parser (note the shorter name) that was SAX-compliant.
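One way Option 2 might look in practice, sketched with stand-in classes. Both class bodies are placeholders, and using the warnings module to flag the old name is an assumption about how the deprecation would be signalled, not something decided here.

```python
import warnings


class Parser:
    """Stand-in for the proposed SAX-compliant xmllib.Parser."""

    def setDocumentHandler(self, handler):
        self.handler = handler


class XMLParser:
    """Stand-in for the old class: unchanged behavior, but warns on creation."""

    def __init__(self):
        warnings.warn(
            "XMLParser is deprecated; use Parser instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
```

Old code keeps running and gets a nudge toward the new name; new code uses Parser and sees no warning.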

Other stuff:

Now that xmllib has a SAX-compliant parser (one way or the other), we can
make a class called xmllib.handler: a base class that implements all of
the SAX methods and redirects events to start_FOO, text_FOO and pi_FOO
methods on a client subclass (if it cares to define them). It also allows
overriding of error, fatalError, warning and so forth.
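A minimal sketch of such a dispatching base class. Since xmllib is not assumed to be available, this builds on the standard library's xml.sax (SAX2 ContentHandler rather than the SAX1 document handler discussed here). The class name Handler, the helper names, and the exact meaning of text_FOO (the character data directly inside a FOO element) are assumptions.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class Handler(ContentHandler):
    """Hypothetical easysax-style base class: SAX events become NAME-specific calls."""

    def __init__(self):
        super().__init__()
        self._open = []  # names of the currently open elements
        self._text = []  # character data buffered for the innermost element

    def _dispatch(self, prefix, name, *args):
        # Call e.g. start_TITLE if the subclass defines it; otherwise do nothing.
        method = getattr(self, "%s_%s" % (prefix, name), None)
        if method is not None:
            method(*args)

    def _flush_text(self):
        # Text can arrive in several chunks, so join it before dispatching.
        if self._text and self._open:
            self._dispatch("text", self._open[-1], "".join(self._text))
        self._text = []

    def startElement(self, name, attrs):
        self._flush_text()
        self._open.append(name)
        self._dispatch("start", name, attrs)

    def endElement(self, name):
        self._flush_text()
        self._open.pop()

    def characters(self, content):
        self._text.append(content)

    def processingInstruction(self, target, data):
        self._dispatch("pi", target, data)


class MyHandler(Handler):
    def __init__(self):
        super().__init__()
        self.titles = []

    def text_TITLE(self, text):
        self.titles.append(text)


h = MyHandler()
xml.sax.parseString(b"<DOC><TITLE>Hello</TITLE><P>ignored</P></DOC>", h)
print(h.titles)  # ['Hello']
```

The subclass only names the elements it cares about; everything else falls through the getattr lookup silently.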

I could live with the default behavior for errors and warnings being to
throw an exception, I guess.
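For comparison, here is what "throw by default" could look like, sketched against the standard library's xml.sax. StrictErrors and parse_strict are made-up names. The stock xml.sax ErrorHandler already raises for error and fatalError but only prints warnings, so the one real change below is that warning raises too.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class StrictErrors(ErrorHandler):
    """Hypothetical default: errors, fatal errors and warnings all raise."""

    def error(self, exception):
        raise exception

    def fatalError(self, exception):
        raise exception

    def warning(self, exception):
        raise exception


def parse_strict(data):
    # Returns None on success, or the exception describing the first problem.
    try:
        xml.sax.parseString(data, ContentHandler(), StrictErrors())
    except xml.sax.SAXParseException as exc:
        return exc
    return None
```

A client that wants to be more forgiving overrides only the method it cares about, for instance a warning that logs instead of raising.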

We wouldn't really need to use the term "easysax" anymore. Easysax was
never really an API in that we didn't expect multiple implementations for
it. It was just a convenient handler base class (or adapter).

I would also like the initialization of the XMLParser and handler classes
to be integrated somehow. "Ordinary" SAX takes too many steps in my
opinion. We need to have a single line of user code that sets ALL of the
SAX handlers, creates the parser and parses. Perhaps:

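class handler:
	def Parse( self, streamOrFile, parser=None ):
		# Fall back to the default parser if the caller did not supply one.
		parser = parser or XMLParser()
		# setThis/setThat stand in for setDocumentHandler and the
		# other SAX set* calls; the handler registers itself.
		parser.setThis( self )
		parser.setThat( self )
		if isFile( streamOrFile ):
			# A file name was given, so open it ourselves.
			parser.parse( open( streamOrFile, "rb" ) )
		else:
			parser.parse( streamOrFile )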

This would be used like so:

class MyHandler( xmllib.handler ):
	def text_TITLE( self, text ):
		pass  # blah

h=MyHandler()
h.Parse( "/myfile.xml" )
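The one-call idea can be tried out with the standard library's xml.sax. This is only a sketch under assumptions: the class name Handler is made up, SAX2's setContentHandler and setErrorHandler stand in for setThis and setThat, and the file-or-stream check is dropped because xml.sax parsers already accept either a file name or an open file object.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class Handler(ContentHandler, ErrorHandler):
    """Hypothetical base class whose Parse() wires everything up in one call."""

    def Parse(self, streamOrFile, parser=None):
        # Use the caller's parser if given, else the platform default.
        parser = parser or xml.sax.make_parser()
        # The handler registers itself for both content and error events.
        parser.setContentHandler(self)
        parser.setErrorHandler(self)
        # parse() takes a file name or a file-like object.
        parser.parse(streamOrFile)


class CountElements(Handler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


h = CountElements()
h.Parse(io.BytesIO(b"<DOC><TITLE>x</TITLE></DOC>"))
print(h.count)  # 2
```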

One neat thing about this is that we could change the Parse()
implementation one day so that it used a parser that knew a lot about
easysax and did not (for instance) report text and elements that we aren't
going to work with *at all*. If you don't specifically ask for a parser
you get the blazingly fast one. But if you want choice you've got it:

h=MyHandler()
h.Parse( "/myfile.xml", MyFavParser() )
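A parser that "knows about easysax" would first need to find out which elements the handler actually wants. One possible way, sketched here with a made-up helper name, is to inspect the handler's method names:

```python
def wanted_names(handler, prefix):
    """Return the element names for which `handler` defines prefix_NAME methods."""
    marker = prefix + "_"
    return {
        attr[len(marker):]
        for attr in dir(handler)
        if attr.startswith(marker) and callable(getattr(handler, attr))
    }


class MyHandler:
    def text_TITLE(self, text):
        pass

    def start_P(self, attrs):
        pass


print(wanted_names(MyHandler(), "text"))  # {'TITLE'}
```

With that set in hand, the parser could skip building text or attribute objects for every element not listed.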

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco
