[XML-SIG] sgmllib has problems with dots in tag names

Andreas Jung ajung@sz-sb.de
Fri, 16 Jul 1999 15:33:24 +0200

On Fri, Jul 16, 1999 at 09:17:03AM -0400, Fred L. Drake wrote:
>   Ok, I've poked at the standard sgmllib a bit to see what the problem
> is.  The parser is recognizing the start and end tags.  Once
> recognized, it is looking for the handler methods start_*() / end_*()
> or do_*().  Since there's a dot in the name, these methods are not
> defined, and the unknown_*tag() methods are called instead of the
> handle_*tag() methods.

>   It should be easy to override the unknown_*tag() methods to use a
> table-based dispatcher or perform some form of name mangling, then
> pass known tags through to the handle_*tag() methods or whatever.
> This seems to be the easiest way to deal with the situation in the
> short term.
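
A rough sketch of the override described above, using name mangling
(dots replaced by underscores). MiniParser is a hypothetical stand-in
that imitates only the start_*/do_*/end_* lookup step, so the example
runs on its own; it is not sgmllib itself. DottedTagParser, mangle()
and the doc.title handlers are likewise made up for illustration.

```python
class MiniParser:
    """Hypothetical stand-in: imitates only the tag dispatch step."""

    def finish_starttag(self, tag, attrs):
        # Look for start_<tag>, then do_<tag>; fall back to unknown_starttag.
        method = (getattr(self, "start_" + tag, None)
                  or getattr(self, "do_" + tag, None))
        if method is None:
            self.unknown_starttag(tag, attrs)
        else:
            self.handle_starttag(tag, method, attrs)

    def finish_endtag(self, tag):
        # Look for end_<tag>; fall back to unknown_endtag.
        method = getattr(self, "end_" + tag, None)
        if method is None:
            self.unknown_endtag(tag)
        else:
            self.handle_endtag(tag, method)

    def handle_starttag(self, tag, method, attrs):
        method(attrs)

    def handle_endtag(self, tag, method):
        method()

    def unknown_starttag(self, tag, attrs):
        pass

    def unknown_endtag(self, tag):
        pass


class DottedTagParser(MiniParser):
    """Routes dotted tag names to handlers named with underscores."""

    def __init__(self):
        self.events = []

    def mangle(self, tag):
        # Assumption: '.' -> '_' does not collide with other tag names.
        return tag.replace(".", "_")

    def unknown_starttag(self, tag, attrs):
        method = getattr(self, "start_" + self.mangle(tag), None)
        if method is not None:
            # Known tag: pass it on to the regular handler path.
            self.handle_starttag(tag, method, attrs)
        else:
            self.events.append(("unknown-start", tag))

    def unknown_endtag(self, tag):
        method = getattr(self, "end_" + self.mangle(tag), None)
        if method is not None:
            self.handle_endtag(tag, method)
        else:
            self.events.append(("unknown-end", tag))

    # Handlers for the tag <doc.title>, named with the mangled form.
    def start_doc_title(self, attrs):
        self.events.append(("start", "doc.title", attrs))

    def end_doc_title(self):
        self.events.append(("end", "doc.title"))
```

With this in place, a <doc.title> start tag first falls through to
unknown_starttag() because no attribute named "start_doc.title" exists,
and is then routed to start_doc_title(). Tags without a mangled handler
are simply recorded as unknown.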

That solution works with the standard sgmllib from the Python
distribution. However, it does not work with the sgmllib from
xml.parsers. I'm not sure whether dots are valid in XML tag names.
That might explain the different behaviour, but I don't think
that's the reason. Maybe I'll find the real reason over the weekend.


                               (' O-O ')
   Andreas Jung, Saarbrücker Zeitung Verlag und Druckerei GmbH
   Saarbrücker Daten-Innovations-Center
   Gutenbergstr. 11-23, D-66103 Saarbrücken, Germany
   Phone: +49-(0)681-502-1563, Fax: +49-(0)681-502-1509
   Email: ajung@sz-sb.de (PGP key available)