[XML-SIG] HTML parse error

Stefan Behnel stefan_ml at behnel.de
Mon Feb 22 15:46:27 CET 2010


sharifah ummu kulthum, 22.02.2010 14:24:
>   File "grabmy.py", line 63, in get_html
>     return BeautifulSoup(content)
>   File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1499, in __init__
>   File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1230, in __init__
>   File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1263, in _feed
>   File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
>     self.goahead(0)
>   File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
>     k = self.parse_starttag(i)
>   File "/usr/lib/python2.6/HTMLParser.py", line 226, in parse_starttag
>     endpos = self.check_for_whole_start_tag(i)
>   File "/usr/lib/python2.6/HTMLParser.py", line 301, in check_for_whole_start_tag
>     self.error("malformed start tag")
>   File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
>     raise HTMLParseError(message, self.getpos())
> HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36

I only just noticed this: you seem to be using BeautifulSoup, most likely
version 3.1. That version does not cope well with broken HTML, so either
use version 3.0.8 instead or switch to the tools I indicated.

Note that switching tools means that you need to change your code to use
them. Just installing them is not enough.
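
One way to keep such a switch cheap is to confine all parser-specific code
to a single function, so that only that function changes when the parser
does. The following is a minimal sketch of that idea, not code from this
thread. It assumes a current Python 3 and uses the standard library's
html.parser purely as a stand-in, since the tools referred to above are not
named in this message. LinkCollector and get_links are hypothetical names
chosen for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def get_links(content):
    """Hypothetical helper: the only place that knows which parser is used.

    Replacing the parser means rewriting this function (and the collector
    above), while callers keep working with a plain list of strings.
    """
    collector = LinkCollector()
    collector.feed(content)
    collector.close()
    return collector.links


if __name__ == "__main__":
    # The <a> elements are never closed; the links are still collected.
    print(get_links('<p><a href="/one">one<a href="/two">two</p>'))
```

Callers that only ever see the returned list do not depend on any one
library's tree or tag objects, which is the part that usually has to be
rewritten when the parser is replaced.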

Stefan


