[Tutor] Re: [newbie] sanitizing HTML

A.M. Kuchling amk at amk.ca
Fri Nov 14 15:03:21 EST 2003


On Fri, Nov 14, 2003 at 08:44:21PM +0100, Andrei wrote:
> You can subclass sgmllib.SGMLParser. By providing unknown_starttag,
> unknown_endtag, handle_entityref and handle_data implementations, you can
> "trap" every tag and analyze/modify/delete it.

I think these days you may want to start with HTMLParser.HTMLParser, which
can also handle XHTML-style empty tags such as <br/>.
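
A rough sketch of what that might look like, not a complete sanitizer. It
is written for Python 3, where the HTMLParser module is named html.parser.
The ALLOWED_TAGS whitelist, the TagFilter class and the sanitize() helper
are names made up for this illustration, and dropping every attribute is
just one simple policy among several you could choose.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist, chosen only for illustration.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}


class TagFilter(HTMLParser):
    """Re-emit whitelisted tags without attributes; drop all other tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # attrs is ignored on purpose, so onclick= and friends never survive.
        if tag in ALLOWED_TAGS:
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style empty tags such as <br/>.
        if tag in ALLOWED_TAGS:
            self.out.append("<%s/>" % tag)

    def handle_data(self, data):
        # Character references arrive already decoded, so escape again.
        self.out.append(escape(data, quote=False))


def sanitize(text):
    parser = TagFilter()
    parser.feed(text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <b>there</b><br/><font>x</font></p>'))
# prints: <p>Hi <b>there</b><br/>x</p>
```

Note that the text inside a dropped tag is kept (escaped), only the tag
itself goes away. If you want to discard the contents of something like
<script> as well, you would need to track that state yourself.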

--amk


