HTMLParser chokes on bad end tag in comment
reply.in.the.newsgroup at my.address.is.invalid
Mon May 29 09:05:17 CEST 2006
>[end tag in html comment in script element]
>The end tag it chokes on is in comment, isn't it?
>no. STYLE and SCRIPT elements contain character data, not parsed
>character data, so comments are treated as characters, and the first
>"</" ends the element.
Ah, I see. I'll report the problem to the developers of the application
that's generating this broken code (vBulletin forum)...
>if you have broken documents, you can tweak this by setting the
>CDATA_CONTENT_ELEMENTS parser attribute before you start parsing.
... and in the meantime that's a good workaround.
Thank you very much Fredrik.
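
For anyone finding this thread later, here is a rough sketch of the
CDATA_CONTENT_ELEMENTS tweak mentioned above. It assumes Python 3, where
the module is html.parser rather than the HTMLParser module this thread
was about. The EventCollector class, the parse_events helper and the
SAMPLE string are made up for illustration, and the sketch only shows how
the attribute changes what the parser reports; it does not try to
reproduce the original error.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records parser events for comparison."""

    def __init__(self, cdata_elements=None):
        super().__init__()
        if cdata_elements is not None:
            # The instance attribute shadows the class-level default,
            # which is ("script", "style").
            self.CDATA_CONTENT_ELEMENTS = cdata_elements
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_comment(self, data):
        self.events.append(("comment", data))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_events(html, cdata_elements=None):
    """Parse html and return the list of recorded events."""
    parser = EventCollector(cdata_elements)
    parser.feed(html)
    parser.close()
    return parser.events


# Made-up input: an end tag inside a comment inside a SCRIPT element.
SAMPLE = '<script><!-- document.write("</b>") --></script>'

# Default: SCRIPT content is character data, so the comment is just text.
default_events = parse_events(SAMPLE)

# Tweaked: no element is treated as CDATA, so SCRIPT content is parsed
# like ordinary markup and the comment is reported as a comment.
tweaked_events = parse_events(SAMPLE, cdata_elements=())

print(default_events)
print(tweaked_events)
```

With the empty tuple, the comment inside the SCRIPT element goes through
handle_comment instead of being treated as character data. Setting the
attribute on the instance keeps the change local to that one parser;
assigning to the class attribute would affect every parser in the process.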