<br><br><div class="gmail_quote">On Fri, Jul 29, 2011 at 13:16, Glyph Lefkowitz <span dir="ltr">&lt;<a href="mailto:glyph@twistedmatrix.com">glyph@twistedmatrix.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div style="word-wrap:break-word"><div class="im"><div><div>On Jul 29, 2011, at 3:00 PM, Matt wrote:</div><br><blockquote type="cite"><span style="border-collapse:separate;font-family:Menlo;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;font-size:medium">I don't see any real reason to drop a decent piece of code (HTMLParser, that is) in favor of a third-party library when only relatively minor updates are needed to bring it up to speed with the latest spec.</span></blockquote>
</div><br></div><div>I am not really one to throw stones here, as Twisted contains a lenient pseudo-XML parser which I still maintain - one which decidedly does <i>not</i> agree with HTML5's requirements for dealing with invalid data, but instead relies on a bunch of ad-hoc guesses of my own.</div>
<div><br></div><div>My impression of HTML5 is that HTMLParser would require significant modifications and possibly a drastic re-architecture in order to really do HTML5 "right"; especially the parts that the html5lib authors claim make HTML5 streaming-unfriendly, i.e. subtree reordering when encountering certain types of invalid data.</div>
</div></blockquote><div><br>We could also have the code live side-by-side for a while (or indefinitely if that was really desired), either by bringing html5lib in as a separate module or by having the relevant classes live in htmllib under different names.<br>
<br>But all of this is just hypothetical until someone decides to do the legwork to actually make a proposal and get the coding done.<br><br>-Brett<br> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<div style="word-wrap: break-word;"><div><br></div><div>But if I'm wrong about that, and there are just a few spec updates and bugfixes that need to be applied, by all means, ignore my comment.</div><div><br></div><font color="#888888"><div>
-glyph</div><div><br></div><div><br></div></font></div><br>_______________________________________________<br>
Python-Dev mailing list<br>
<a href="mailto:Python-Dev@python.org">Python-Dev@python.org</a><br>
<a href="http://mail.python.org/mailman/listinfo/python-dev" target="_blank">http://mail.python.org/mailman/listinfo/python-dev</a><br>
<br></blockquote></div><br>