[docs] html.parser.HTMLParser: setting 'convert_charrefs = True' leads to dropped text (issue 23144)

vadmium+py at gmail.com
Sun Mar 8 00:11:54 CET 2015


https://bugs.python.org/review/23144/diff/14120/Lib/html/parser.py
File Lib/html/parser.py (right):

https://bugs.python.org/review/23144/diff/14120/Lib/html/parser.py#newcode146
Lib/html/parser.py:146: # this is the case before proceding by looking for an
Typo: "proceding" should be "proceeding" (double E).

https://bugs.python.org/review/23144/diff/14120/Lib/html/parser.py#newcode148
Lib/html/parser.py:148: amppos = rawdata.rfind('&', max(i, n-34))
Where does the -34 come from? I guess you are trying to optimize how far
rfind() searches (“looking for an ampersand _near_ the end”), based on
some maximum character reference length, but this at least needs a
comment explaining the magic number.
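
One possible way to avoid the bare number, as a rough sketch only: derive the
search window from the html.entities.html5 table and give it a name. The
names MAX_NAMED_CHARREF_LEN, AMP_SEARCH_WINDOW and find_trailing_ampersand
are hypothetical and not part of the patch, and I have not checked whether
the derived value matches the 34 used there.

```python
from html.entities import html5

# Hypothetical constant: length of the longest named character reference in
# the html5 table. The keys omit the leading '&'; many include the trailing ';'.
MAX_NAMED_CHARREF_LEN = max(len(name) for name in html5)

# Allow one extra character for the leading '&' itself.
AMP_SEARCH_WINDOW = MAX_NAMED_CHARREF_LEN + 1


def find_trailing_ampersand(rawdata, i):
    """Return the index of the last '&' near the end of rawdata, or -1.

    Only the final AMP_SEARCH_WINDOW characters (and nothing before
    index i) are searched, since an '&' further back cannot start a
    named reference that is still incomplete at the end of the buffer.
    """
    n = len(rawdata)
    return rawdata.rfind('&', max(i, n - AMP_SEARCH_WINDOW))
```

Even if the literal stays in the code, a comment along these lines (longest
reference name plus the ampersand) would tell the reader where it comes from.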

https://bugs.python.org/review/23144/
