[XML-SIG] HtmlBuilder - uses sgmllib, can it use sax/pyexpat?

Jeff.Johnson@icn.siemens.com Jeff.Johnson@icn.siemens.com
Thu, 1 Apr 1999 15:38:15 -0500


Now that I've delivered my beta CDs to reproduction, I can take a breath and
try to optimize my conversion programs.  I was reading Greg Stein's quick XML
parser and started to see if I could use it.  That was when I realized that most
of what I do is read HTML files via xml.dom.html_builder.HtmlBuilder, which uses
sgmllib.  Assuming that the pure-Python sgmllib is slower than pyexpat, which
uses C code, I wondered if there was a way to make HtmlBuilder use SAX and the
default pyexpat parser.  After taking a *very* quick look at the SAX and sgmllib
parser interfaces, it seems like a trivial matter to modify HtmlBuilder to use
SAX.  Is this true, and would it be faster?  I know very little about these
parsers, so forgive me if my suggestion is just plain stupid :)
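
A rough sketch of the idea, not the actual HtmlBuilder code: a SAX handler
that builds a DOM tree from parse events, with expat doing the parsing
underneath.  It uses the present-day standard library modules xml.sax and
xml.dom.minidom as stand-ins for the xml package discussed here, and the
names DomBuilderHandler and build_dom are made up for illustration.  It also
assumes the input is well-formed (XHTML-style) markup, since expat is a
strict XML parser and raises an error on the loose HTML that sgmllib accepts.

```python
import xml.sax
from xml.dom.minidom import Document


class DomBuilderHandler(xml.sax.ContentHandler):
    """Hypothetical handler: turns SAX events into a minidom tree."""

    def __init__(self):
        super().__init__()
        self.document = Document()
        # Stack of open nodes; the innermost open node is the last item.
        self._stack = [self.document]

    def startElement(self, name, attrs):
        element = self.document.createElement(name)
        for key, value in attrs.items():
            element.setAttribute(key, value)
        self._stack[-1].appendChild(element)
        self._stack.append(element)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # Text may arrive in several chunks; each becomes its own text node.
        text = self.document.createTextNode(content)
        self._stack[-1].appendChild(text)


def build_dom(markup):
    """Parse well-formed markup (assumption) and return a minidom Document."""
    handler = DomBuilderHandler()
    xml.sax.parseString(markup.encode("utf-8"), handler)
    return handler.document


if __name__ == "__main__":
    doc = build_dom("<html><body><p class='x'>hi</p></body></html>")
    print(doc.documentElement.tagName)
```

Whether this would actually be faster than the sgmllib route depends on how
much time goes to parsing versus building the DOM nodes, which the sketch
does not measure.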

To Greg: Most of my code uses DOM, so I'm not sure if I could use your parser.
Would it be possible to add a DOM interface (or a subset of one) to the objects
it creates?

To Andrew:
I've found a bug in the XML 0.5.1 package:  The xml/CREDITS file lists me (which
I was pleasantly surprised to see) and ONLY me.  I figure the guys who wrote
the library (you included) should be listed in the credits too.  Thanks for
putting me in there though :)

Cheers,
Jeff