[Python-Dev] htmllib vs. HTMLParser

amk at amk.ca
Tue Oct 28 07:53:50 EST 2003

On Mon, Oct 27, 2003 at 04:53:32PM -0800, Bill Janssen wrote:
> But IMO simply adding some handler methods won't really do it.  You
> also need to introduce some knowledge about the semantics of the
> syntax.  For example, a new "block"-level element should close all
> "in-line" elements that are currently open.  Etc.

Perhaps, but it might be a mug's game.  I was on the Lynx developer list for
a while, and bad HTML requires many, many hacks to be processed sensibly.
Given that XHTML use is slowly rising, that work may not be necessary, but
I'll keep it in mind.
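
For concreteness, here is a rough sketch of the rule Bill describes (a new
block-level element closes any in-line elements still open). It is only an
illustration, not a proposal for the library: it uses Python 3's html.parser
module (the successor to HTMLParser), and the BLOCK_TAGS / INLINE_TAGS sets,
the InlineClosingParser class and the parse_events() helper are all
hypothetical names invented for this example. The tag sets are far from
complete, and nothing here attempts the other hacks that real-world HTML
needs.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny classification; real HTML has many more tags.
BLOCK_TAGS = {"p", "div", "ul", "li", "h1", "h2", "blockquote"}
INLINE_TAGS = {"b", "i", "em", "strong", "a", "span"}


class InlineClosingParser(HTMLParser):
    """Record parse events, auto-closing open in-line elements at each block."""

    def __init__(self):
        super().__init__()
        self.open_inline = []  # stack of currently open in-line tags
        self.events = []       # (kind, value) tuples in document order

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            # A new block-level element implicitly closes every open in-line one.
            while self.open_inline:
                self.events.append(("end", self.open_inline.pop()))
        elif tag in INLINE_TAGS:
            self.open_inline.append(tag)
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        if tag in INLINE_TAGS:
            if tag not in self.open_inline:
                return  # stray or already auto-closed end tag: ignore it
            # Close anything opened after this tag, then the tag itself.
            while self.open_inline:
                closed = self.open_inline.pop()
                self.events.append(("end", closed))
                if closed == tag:
                    break
            return
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_events(html):
    """Feed a complete document and return the recorded event list."""
    parser = InlineClosingParser()
    parser.feed(html)
    parser.close()
    return parser.events
```

Feeding it "<p><b>bold<i>both<p>next</i></b></p>" records end events for <i>
and <b> just before the second <p> starts, and the later </i> and </b> are
dropped as strays. Block elements themselves are not balanced here, which is
one small taste of how many more special cases a real implementation needs.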

