[Python-Dev] htmllib vs. HTMLParser
amk at amk.ca
Tue Oct 28 07:53:50 EST 2003
On Mon, Oct 27, 2003 at 04:53:32PM -0800, Bill Janssen wrote:
> But IMO simply adding some handler methods won't really do it. You
> also need to introduce some knowledge about the semantics of the
> syntax. For example, a new "block"-level element should close all
> "in-line" elements that are currently open. Etc.
Perhaps, but it might be a mug's game. I was on the Lynx developer list for
a while, and bad HTML requires many, many hacks to be processed sensibly.
Given that XHTML use is slowly rising, that work may not be necessary, but
I'll keep it in mind.
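
For illustration, the rule Bill describes (a new block-level element closes
every in-line element still open) could be sketched roughly as below. This is
only a sketch under stated assumptions, not anything in the standard library:
the BLOCK and INLINE sets, the InlineClosingParser class and the normalize()
helper are all hypothetical names, the element sets are far from complete, and
the code is written against the html.parser.HTMLParser class of current Python
rather than the HTMLParser module as it stood at the time of this thread.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny element sets; real HTML has many more.
BLOCK = {"p", "div", "ul", "li", "h1", "h2", "table"}
INLINE = {"b", "i", "em", "strong", "a", "span", "font"}


class InlineClosingParser(HTMLParser):
    """Record parse events, closing open in-line elements at each block start."""

    def __init__(self):
        super().__init__()
        self.open_inline = []  # stack of currently open in-line tags
        self.events = []       # normalized (kind, value) events

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            # A new block-level element implicitly closes every open in-line one.
            while self.open_inline:
                self.events.append(("end", self.open_inline.pop()))
        elif tag in INLINE:
            self.open_inline.append(tag)
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        if tag in INLINE:
            if tag not in self.open_inline:
                return  # already closed implicitly, or never opened: drop it
            # Close anything opened after this tag, then the tag itself.
            while self.open_inline:
                top = self.open_inline.pop()
                self.events.append(("end", top))
                if top == tag:
                    break
            return
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def normalize(html):
    """Feed an HTML string through the parser and return the event list."""
    parser = InlineClosingParser()
    parser.feed(html)
    parser.close()
    return parser.events
```

Feeding it something like "<b>bold<i>both<p>para</i></b>" closes <i> and <b>
before the <p> starts and silently drops the stray end tags afterwards. Even
this one rule needs a second hack (ignoring the now-orphaned end tags), which
is a small taste of how quickly the special cases pile up.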
--amk