[XML-SIG] How to get SAX to parse not well formed HTML doc?
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Wed, 18 Jul 2001 01:17:14 +0200
> Another possibility would be to use the HTMLParser module, which is
> new in Python 2.2. It was originally developed for another project
> and is stable and well-tested. Feel free to extract the module from
> the Python CVS repository.
Of course, a "true" HTML parser should get the DTD right,
i.e. generate end tags where they are missing, expand entities
(to Unicode strings), etc.
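
As a rough illustration of those two points, here is a minimal sketch
layered on top of HTMLParser. It assumes a current Python 3, where the
module lives at html.parser and convert_charrefs expands entities in
text. The TidyingParser class, its events list, and the OPTIONAL_END and
VOID sets are hypothetical names invented for this example; the sets
hard-code a tiny fragment of what a real parser would read from the DTD.

```python
from html.parser import HTMLParser

# Assumption: a hand-written, incomplete stand-in for the DTD.
OPTIONAL_END = {"p", "li"}    # elements whose end tag may be omitted
VOID = {"br", "hr", "img"}    # elements declared EMPTY (never have content)


class TidyingParser(HTMLParser):
    """Records (kind, value) events, synthesizing missing end tags."""

    def __init__(self):
        # convert_charrefs=True expands entities in text to Unicode.
        super().__init__(convert_charrefs=True)
        self.stack = []    # currently open elements, innermost last
        self.events = []   # output: ("start"|"end"|"data", value)

    def _close(self, tag):
        self.stack.pop()
        self.events.append(("end", tag))

    def handle_starttag(self, tag, attrs):
        # A new <p> or <li> implicitly closes an open sibling of the same name.
        if tag in OPTIONAL_END and self.stack and self.stack[-1] == tag:
            self._close(tag)
        self.events.append(("start", tag))
        if tag in VOID:
            self.events.append(("end", tag))
        else:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag with no matching open element: ignore it
        # Close everything still open inside the element, then the element.
        while self.stack:
            open_tag = self.stack[-1]
            self._close(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.events.append(("data", data))

    def close(self):
        super().close()
        # At end of input, close whatever is still open.
        while self.stack:
            self._close(self.stack[-1])


if __name__ == "__main__":
    parser = TidyingParser()
    parser.feed("<ul><li>caf&eacute;<li>b &amp; c</ul>")
    parser.close()
    for event in parser.events:
        print(event)
```

This only covers the simplest case of omitted end tags (a repeated
sibling, or an enclosing element being closed); a parser driven by the
actual DTD would also know which elements may contain which.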
Regards,
Martin