<div dir="ltr">Greetings,<div><br></div><div>There's a minor mistake on the <a href="http://docs.python.org/2/library/htmlparser.html">doc page for the HTMLParser module</a>.</div><br>The last snippet, labeled "<i>Parsing invalid HTML (e.g. unquoted attributes) also works:</i>", wrongly assumes that unquoted attributes are invalid.<div>
<br></div><font face="courier new, monospace"> >>> parser.feed('<p><a class=link href=#main>tag soup</p ></a>')</font><div><font face="courier new, monospace"> ...</font><br><div>
<div><br></div><div>According to the <a href="http://www.w3.org/TR/REC-html40/intro/sgmltut.html#h-3.2.2">HTML4</a> and <a href="http://dev.w3.org/html5/markup/syntax.html#syntax-attr-unquoted">HTML5</a> attribute syntax specs, the attributes in the example you provide are actually perfectly valid attribute definitions. To make the example genuinely invalid, you could add a space or another breaking/invalid character to one of the attribute values.</div>
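<div><br></div><div>As a rough illustration, the sketch below feeds the documentation's snippet to the parser and records the attributes it reports for each start tag. It assumes Python 3's <font face="courier new, monospace">html.parser</font> (the Python 2 module is named <font face="courier new, monospace">HTMLParser</font>); <font face="courier new, monospace">AttrCollector</font> and <font face="courier new, monospace">start_tags</font> are hypothetical names introduced only for this example.</div>

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records (tag, attrs) for every start tag seen."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when the
        # attribute has no "=value" part.
        self.seen.append((tag, attrs))


def start_tags(markup):
    """Parse markup and return the list of (tag, attrs) for its start tags."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# The snippet from the documentation: both unquoted values are read intact.
from_docs = start_tags('<p><a class=link href=#main>tag soup</p ></a>')
print(from_docs)

# Variant with a space inside the intended class value ("my link"):
# the parser can no longer treat it as a single unquoted value.
with_space = start_tags('<p><a class=my link href=#main>tag soup</p ></a>')
print(with_space)
```

<div>In the first call the parser reports <font face="courier new, monospace">class</font> and <font face="courier new, monospace">href</font> with the values as written. In the second, the space ends the unquoted value early, so the remainder is reported as a separate attribute with no value.</div>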
<div><br></div><div>Thanks!</div><div>-Eric Higgins</div></div></div></div>