On 14.06.2013 10:49, Antoine Pitrou wrote:
> On Fri, 14 Jun 2013 09:44:09 +0200 "M.-A. Lemburg" <mal@egenix.com> wrote:
>>> IMHO character references (named or numerical) should never be used in HTML (with the exception of " > and <). They exist mainly for three reasons: 1) provide a way to include characters that are not available in the used encoding (e.g. if you are using an obsolete encoding like windows-1252 but still want to use "fancy" characters); 2) to keep the HTML source ASCII-only;
>> This is the main reason for using them. HTML's default encoding is Latin-1, unlike XML.
> I'd like to know which good reasons there are to not use utf-8 for HTML pages in 2013. "Keeping the HTML source ASCII-only" is just silly IMO, and it doesn't warrant special support in Python's codec error handlers.
Ezio and I gave reasons, but you've cut them away ;-)
Note that error handlers can be registered in the codec registry. You don't need to add support for them to each and every codec, so the added code is minimal.
-- 
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Jun 14 2013)
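To illustrate the point above that error handlers live in the codec registry rather than in each codec, here is a minimal sketch, assuming Python 3. The handler name `htmlentityreplace` and the function itself are hypothetical and not part of the standard library; only `codecs.register_error` and `html.entities.codepoint2name` are existing APIs. It falls back to numeric references where no named entity is known, much like the built-in `xmlcharrefreplace` handler does for every character.

```python
import codecs
from html.entities import codepoint2name


def htmlentityreplace(exc):
    """Hypothetical error handler: replace unencodable characters with
    named HTML character references, or numeric ones as a fallback."""
    if not isinstance(exc, UnicodeEncodeError):
        # This sketch only handles encoding errors.
        raise exc
    parts = []
    for ch in exc.object[exc.start:exc.end]:
        name = codepoint2name.get(ord(ch))
        if name is not None:
            parts.append('&%s;' % name)
        else:
            parts.append('&#%d;' % ord(ch))
    # Return the replacement text and the position at which to resume.
    return ''.join(parts), exc.end


# Register once in the codec registry; any codec can then use the
# handler by name, without changes to the codec itself.
codecs.register_error('htmlentityreplace', htmlentityreplace)

if __name__ == '__main__':
    text = 'caf\u00e9 \u2013 na\u00efve'
    print(text.encode('ascii', 'htmlentityreplace'))
    print(text.encode('latin-1', 'htmlentityreplace'))
```

The same registered name works with `ascii`, `latin-1` or any other codec that cannot represent a given character, which is why the amount of added code stays small.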