[pypy-dev] HTMLParser compatibility with cPython 2.7.3
robert.zaremba at zoho.com
Mon Jun 18 13:02:23 CEST 2012
Hi, I would like to import changes from:
The problem is that HTMLParser from 2.7.2 is not lenient and tends to raise
exceptions when an HTML document is not well formed:
This often surfaces as an exception from BeautifulSoup, which gains a great
speedup when run on PyPy with the HTMLParser from the stdlib:
"RuntimeWarning: Python's built-in HTMLParser cannot parse the given
document. This is not a bug in Beautiful Soup. The best solution is to install
an external parser (lxml or html5lib), and use Beautiful Soup with that
parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser
for help."
However, lxml is not compatible with PyPy, and html5lib is slow.
Can I port HTMLParser.py from Python 2.7.3 to PyPy?
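
For reference, a minimal sketch of how the strictness can be checked on a given
interpreter, using only the stdlib. The names TagCollector and try_parse are
made up for this illustration, and the sample markup is hypothetical: whether it
raises depends on which HTMLParser version the interpreter ships.

```python
try:
    from HTMLParser import HTMLParser   # Python 2.x / PyPy (2.7 stdlib)
except ImportError:
    from html.parser import HTMLParser  # Python 3.x


class TagCollector(HTMLParser):
    """Hypothetical helper: records the name of every start tag seen."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def try_parse(markup):
    """Return (tags, None) on success or (None, exception) on failure."""
    parser = TagCollector()
    try:
        parser.feed(markup)
        parser.close()
    except Exception as exc:  # a strict parser may raise on malformed input
        return None, exc
    return parser.tags, None


if __name__ == "__main__":
    samples = [
        "<p>hello <b>world</b></p>",   # well formed
        '<p><a href="x"y>broken',      # assumed example of sloppy markup
    ]
    for markup in samples:
        tags, error = try_parse(markup)
        if error is None:
            print("parsed: %r -> %r" % (markup, tags))
        else:
            print("failed: %r -> %r" % (markup, error))
```

Running the same script under both HTMLParser versions should show whether the
port changes the outcome for the second sample.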