[New-bugs-announce] [issue15156] Refactor HTMLParser.unescape to use html.entities.html5

Ezio Melotti report at bugs.python.org
Sun Jun 24 04:45:43 CEST 2012

New submission from Ezio Melotti <ezio.melotti at gmail.com>:

HTMLParser has an internal method called unescape [0] that converts named character references to the equivalent characters. It does so by using html.entities.name2codepoint to rebuild the equivalent of html.entities.entitydefs, with &apos; added.
Now that the HTML5 entities have been added to html.entities (as html.entities.html5), the parser should use them instead of name2codepoint.

[0]: see Lib/html/parser.py:500
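
As a rough illustration of the proposed direction, here is a minimal sketch that resolves named character references through html.entities.html5. The helper name unescape_named and the regular expression are assumptions made up for this example; this is not the code in Lib/html/parser.py, and it ignores numeric references and references written without a trailing semicolon.

```python
import re
from html.entities import html5

# Hypothetical pattern for this sketch: "&name;" with an alphanumeric name.
_NAMED_REF = re.compile(r"&([a-zA-Z][a-zA-Z0-9]*);")


def unescape_named(text):
    """Replace named references such as &amp; using the html5 mapping.

    Hypothetical helper for illustration only.
    """
    def replace(match):
        # Keys in html5 omit the leading "&" and, for the full set of
        # names, include the trailing ";" (for example "amp;").
        key = match.group(1) + ";"
        # Unknown names are left untouched.
        return html5.get(key, match.group(0))

    return _NAMED_REF.sub(replace, text)


print(unescape_named("&lt;b&gt; &amp; &apos;"))
```

Because the html5 mapping already has an entry for apos;, no special case for &apos; is needed in this sketch, unlike a table built from name2codepoint.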

assignee: ezio.melotti
components: Library (Lib)
messages: 163702
nosy: eric.araujo, ezio.melotti, r.david.murray
priority: normal
severity: normal
stage: needs patch
status: open
title: Refactor HTMLParser.unescape to use html.entities.html5
type: enhancement
versions: Python 3.3

Python tracker <report at bugs.python.org>
