[Python-Dev] htmllib vs. HTMLParser

amk at amk.ca amk at amk.ca
Mon Oct 27 13:54:52 EST 2003

On Mon, Oct 27, 2003 at 08:52:53AM -0800, Guido van Rossum wrote:
> I'm unclear on what you plan to do -- repeal sgmllib and rewrite
> htmllib to use HTMLParser internally for a backwards compatible
> interface?

Correct; that's what your initial checkin message for HTMLParser.py suggests
doing, and if I'm touching htmllib.py to add the HTML 4.01 stuff, I may as
well make the other change, too.  

> I'm okay with deprecating sgmllib faster than htmllib.

sgmllib gets deprecated; htmllib never gets deprecated.  HTMLParser is a
barebones HTML parser that provides no default handlers (handle_head,
handle_title, etc.), and htmllib extends it, adding default handlers for the
various things in HTML 4.01.
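
To make the layering concrete, here is a rough sketch of a thin class on
top of the bare parser that routes each tag to a per-tag method. It is
not the actual htmllib code. It assumes the Python 3 module name
(html.parser; the 2003 module was HTMLParser), and the class name
TitleCollector and the start_<tag>/end_<tag> naming are hypothetical
choices for illustration only.

```python
from html.parser import HTMLParser  # Python 3 name; assumption, see note above


class TitleCollector(HTMLParser):
    """Hypothetical htmllib-style layer over the bare parser.

    The base class only reports generic events (start tag, end tag, data).
    This subclass dispatches them to per-tag methods when they exist.
    """

    def __init__(self):
        super().__init__()
        self.title = None
        self._buffer = None  # list of text chunks while inside <title>

    def handle_starttag(self, tag, attrs):
        # Look up an optional per-tag handler such as start_title.
        method = getattr(self, "start_" + tag, None)
        if method is not None:
            method(attrs)

    def handle_endtag(self, tag):
        # Look up an optional per-tag handler such as end_title.
        method = getattr(self, "end_" + tag, None)
        if method is not None:
            method()

    def handle_data(self, data):
        # Collect text only while a per-tag handler has opened a buffer.
        if self._buffer is not None:
            self._buffer.append(data)

    # A "default handler" of the kind the extended module would supply.
    def start_title(self, attrs):
        self._buffer = []

    def end_title(self):
        if self._buffer is not None:
            self.title = "".join(self._buffer).strip()
            self._buffer = None


parser = TitleCollector()
parser.feed("<html><head><title>Hello, world</title></head>"
            "<body><p>text</p></body></html>")
parser.close()
print(parser.title)  # Hello, world
```

Tags without a matching method are simply ignored, so the base parser
stays minimal and the per-tag behaviour lives entirely in the subclass.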

