Parsing complex web pages safely with htmllib.HTMLParser

A.M. Kuchling akuchlin at
Thu Jan 24 10:39:28 EST 2002

In article <mailman.1011873335.21639.python-list at>, 
	montanaro at wrote:
> I'm not sure how XHTML will solve the problem.  Instead of broken HTML we'll
> have to contend with broken XHTML.  Browser manufacturers will still attempt
> to do something reasonable with syntactically incorrect pages, thus making
> it unlikely that people will fix them...

I don't think so.  Mozilla doesn't accept invalid XHTML, and neither
does IE.  For example, when I point either Mozilla or IE 6 at, I get this page:

	XML Parsing Error: mismatched tag. Expected: </to>.
	Line Number 3, Column 13:  <to>Tove</To>

Surprisingly, Opera is the least clear; it reports 'Transmission Stopped',
with no indication that it's actually an XML problem and not a network one.
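
You can see the same strictness from Python with any conforming XML
parser. What follows is only a minimal sketch: the <note> wrapper, the
BROKEN string and the check_well_formed() helper are made up for
illustration (the real page isn't reproduced here), and I'm using the
standard library's xml.etree.ElementTree rather than anything a browser
actually runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical document modelled on the error above: the element is
# opened as <to> but closed as </To>, and XML names are case-sensitive.
BROKEN = """<?xml version="1.0"?>
<note>
  <to>Tove</To>
</note>
"""


def check_well_formed(text):
    """Return None if text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # A conforming parser must stop here instead of guessing a repair.
        return str(err)
    return None


if __name__ == "__main__":
    print("broken:", check_well_formed(BROKEN))
    print("fixed: ", check_well_formed(BROKEN.replace("</To>", "</to>")))
```

The point is that the parser has no "do something reasonable" mode to
fall back on: the broken document yields an error message and the
corrected one yields None.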

--amk
Every time you sound confident nowadays, something terrible seems to
happen.
    -- Peri, in "Vengeance on Varos"
