HTMLParser.HTMLParseError: EOF in middle of construct

John Nagle nagle at
Wed Jun 20 06:19:09 CEST 2007

none wrote:
> Gabriel Genellina wrote:
>> On Mon, 18 Jun 2007 16:38:18 -0300, Sergio Monteiro Basto 
>> <sergio at> wrote:
>>> Can someone explain to me what is wrong with this site?
>>> python > test

> OK, but my problem is that I don't understand what the specific problem
> is at line 1173.
>> HTMLParser expects valid HTML - try a different tool, like 
>> BeautifulSoup, which is specially designed to handle malformed pages.
>> --Gabriel Genellina
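
For the narrower question of what is wrong at line 1173, it can help to
look at the raw source around the line number the parser reports. Below is
a minimal sketch, assuming the page has already been read into a string.
The function name lines_around and its parameters are hypothetical, made
up for illustration; they are not part of HTMLParser or BeautifulSoup.

```python
def lines_around(text, lineno, radius=2):
    """Return (line_number, line) pairs surrounding a 1-based line number.

    Hypothetical helper: 'text' is the page source as a string, 'lineno'
    is the line the parser complained about, and 'radius' is how many
    lines of context to keep on each side.
    """
    lines = text.splitlines()
    first = max(1, lineno - radius)           # clamp at the top of the file
    last = min(len(lines), lineno + radius)   # clamp at the bottom
    return [(n, lines[n - 1]) for n in range(first, last + 1)]


def format_context(text, lineno, radius=2):
    """Render the context as text, marking the reported line with '>'."""
    rows = []
    for n, line in lines_around(text, lineno, radius):
        marker = ">" if n == lineno else " "
        rows.append("%s %5d  %s" % (marker, n, line))
    return "\n".join(rows)
```

Calling print(format_context(page_source, 1173)) on the saved page would
then show the markup the parser was reading when it gave up, which is
usually enough to spot an unclosed tag, comment, or declaration.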

    Yes, you almost have to use BeautifulSoup on real-world web pages.
Even that may not be enough; I have my own even more robust version of
BeautifulSoup.  (I've sent the fixes, which are small, to the author.)

    The usual BeautifulSoup killer is improperly terminated HTML comments.
The default action is to suck up the rest of the entire document into
the comment, which is usually not what you want.  I have a fix for that.

				John Nagle
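
To illustrate the comment problem described above, here is one possible
pre-processing workaround, offered as a sketch only. It is not the fix
mentioned in the post, whose details are not given here. It assumes a
simple heuristic: if a "<!--" has no "-->" anywhere after it, the comment
is ended at the next ">" (or at end of input if there is none). The
function name close_unterminated_comments is hypothetical.

```python
def close_unterminated_comments(html):
    """Close any '<!--' that has no matching '-->' later in the text.

    Heuristic (an assumption, not a standard): an unterminated comment is
    ended at the next '>' after its opener, or at the end of the input.
    Properly terminated comments are copied through unchanged.
    """
    out = []
    pos = 0
    while True:
        start = html.find("<!--", pos)
        if start == -1:
            # No more comment openers: copy the remainder and finish.
            out.append(html[pos:])
            return "".join(out)
        end = html.find("-->", start + 4)
        if end != -1:
            # Well-formed comment: copy it through, terminator included.
            out.append(html[pos:end + 3])
            pos = end + 3
            continue
        # No terminator anywhere after this opener.
        gt = html.find(">", start + 4)
        if gt == -1:
            # Nothing to anchor on: close the comment at end of input.
            out.append(html[pos:] + "-->")
            return "".join(out)
        # Replace the stray '>' with a real comment terminator.
        out.append(html[pos:gt] + "-->")
        pos = gt + 1
```

The cleaned string can then be handed to whatever parser is in use. Note
that this does not help when a broken comment is followed later by a
well-formed one, since the later "-->" is then taken as the terminator;
handling that case needs a stricter rule about where a comment may end.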
