[Python-ideas] Add "htmlcharrefreplace" error handler
Serhiy Storchaka
storchaka at gmail.com
Fri Jun 14 17:09:16 CEST 2013
14.06.13 11:49, Antoine Pitrou wrote:
> I'd like to know which good reasons there are to not use utf-8 for HTML
> pages in 2013.
Russian text requires 2 bytes per character in UTF-8 (not counting
spaces, punctuation and markup) and only 1 byte per character in any
of the dedicated 8-bit encodings (cp1251/cp866/koi8-r). The same holds
for other European non-Latin alphabets. Some old databases contain data
in one of these 8-bit encodings, and generating an HTML page in the same
encoding does not require any encoding/decoding at all.
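
A quick way to check the size claim is the sketch below. The sample
word and the variable names are illustrative only; the codec names are
the ones mentioned above.

```python
# Illustrative sample: six Cyrillic letters, no spaces or punctuation.
text = "Привет"

# Encoded size in bytes for UTF-8 and for each 8-bit Cyrillic codec.
sizes = {
    enc: len(text.encode(enc))
    for enc in ("utf-8", "cp1251", "cp866", "koi8-r")
}

for enc, size in sizes.items():
    print(f"{enc}: {size} bytes for {len(text)} characters")
```

For real pages the difference is smaller than 2x, since markup, spaces
and punctuation take 1 byte per character in all of these encodings.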
> "Keeping the HTML source ASCII-only" is just silly IMO, and it doesn't
> warrant special support in Python's codec error handlers.
"xmlcharrefreplace" is as good as "htmlentityreplace", and even better
for this purpose.
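
As a rough illustration of what the existing handler already does
(the sample string and variable names are made up for this sketch):
characters that the target 8-bit encoding can represent are written as
single bytes, and anything else becomes a numeric character reference
that an HTML parser reads back unchanged.

```python
import html

# Illustrative sample: Cyrillic fits in cp1251, the CJK character does not.
text = "Привет 日"

# Unencodable characters are replaced with &#NNNN; references.
encoded = text.encode("cp1251", errors="xmlcharrefreplace")
print(encoded)

# Decoding the bytes and resolving the references restores the original text.
restored = html.unescape(encoded.decode("cp1251"))
print(restored == text)
```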