html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text (issue 23144)

Sun Mar 8 00:11:54 CET 2015

File Lib/html/parser.py (right):

Lib/html/parser.py:146: # this is the case before proceding by looking for an
Typo: "proceding" should be "proceeding" [double E].

Lib/html/parser.py:148: amppos = rawdata.rfind('&', max(i, n-34))
Where does the -34 come from? I guess you are trying to optimize how far
rfind() searches (“looking for an ampersand _near_ the end”), based on
some maximum character reference length, but this at least needs a
comment explaining the magic number.
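
To illustrate the kind of comment or named constant I have in mind, here is a minimal sketch, not the patch itself. It assumes the bound is meant to cover the longest named character reference in html.entities.html5 plus the leading '&'. I believe the longest key is 'CounterClockwiseContourIntegral;' (32 characters including the semicolon), which would give 33 rather than 34, so the extra character would still need explaining. The names MAX_CHARREF_LEN and has_unterminated_charref are hypothetical, and the whitespace-or-semicolon termination check is my guess at what the surrounding code does with amppos.

```python
import re
from html.entities import html5

# Keys in html5 omit the leading '&'; most include the trailing ';'.
LONGEST_NAME = max(html5, key=len)

# Hypothetical named constant replacing the bare 34:
# the longest named reference plus one character for the '&'.
MAX_CHARREF_LEN = len(LONGEST_NAME) + 1


def has_unterminated_charref(rawdata, i=0, window=MAX_CHARREF_LEN):
    """Return True if an '&' near the end of rawdata has no terminator yet.

    Only the last `window` characters are searched, since a reference
    starting earlier than that cannot still be incomplete.
    Assumption: a reference counts as terminated once whitespace or ';'
    follows the '&'.
    """
    n = len(rawdata)
    amppos = rawdata.rfind('&', max(i, n - window))
    if amppos < 0:
        return False
    return re.search(r'[\s;]', rawdata[amppos:]) is None


if __name__ == '__main__':
    print(LONGEST_NAME, MAX_CHARREF_LEN)
    print(has_unterminated_charref('foo &am'))        # reference cut in half
    print(has_unterminated_charref('foo &amp; bar'))  # complete reference
```

Deriving the bound from the entity table (or at least stating the derivation in a comment) would also keep it correct if the table ever grows.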


