[issue14251] HTMLParser decode issue
Ezio Melotti
report at bugs.python.org
Tue Mar 13 01:11:31 CET 2012
Ezio Melotti <ezio.melotti at gmail.com> added the comment:
I tested this again, and indeed a bare s.decode() is not enough to fix the problem. The attribute might contain non-ASCII characters, and that will result in an error (see for example the "test.py" script attached to #3932). The correct solution is to decode the page before passing it to the parser.
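As a rough illustration of "decode the page before passing it to the parser", here is a minimal sketch. It is written for Python 3, where the module is html.parser (on Python 2 it is HTMLParser). The AttrCollector class, the sample markup, and the UTF-8 encoding are all assumptions made for the example; a real page's encoding would have to come from its HTTP headers or meta tag.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records the attributes of every start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        self.attrs.extend(attrs)


# Raw bytes as they might arrive from the network.
# The UTF-8 encoding is an assumption for this example.
raw_page = '<a title="caf\u00e9 &amp; bar">link</a>'.encode("utf-8")

# Decode the whole page once, up front, with the page's real encoding,
# so the parser only ever sees text.
text_page = raw_page.decode("utf-8")

parser = AttrCollector()
parser.feed(text_page)
parser.close()

collected = parser.attrs
```

Because the parser receives text rather than bytes, the non-ASCII character and the "&amp;" reference in the attribute value are both handled without any decoding step inside the parser callbacks.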
----------
resolution: -> duplicate
stage: -> committed/rejected
status: open -> closed
superseder: -> HTMLParser cannot handle '&' and non-ascii characters in attribute names
versions: -Python 3.2
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14251>
_______________________________________