On 2016-01-07 13:59, Steven D'Aprano wrote:
On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
It makes sense, but I disagree with the suggestion. Having "Latin-1 or UTF-8" as the effective default encoding is not a good idea, IMO;
I'm curious what your reasoning is. That seems to be fairly common behaviour with some email clients. For example, I seem to recall that Thunderbird will try encoding emails as US-ASCII; if that fails, Latin-1; and only send UTF-8 if the other two don't work.
I'm not defending this tactic, but wondering what you have against it.
I'm fine with either tactic, either defaulting to utf-8 or trying them one after the other. The important thing for me is that the API works the way many people expect.

My main reason for not changing the default was that it would break backwards compatibility, but only for the case where people sent latin-1 strings as if they were unicode strings. If the reading of the spec that led to using latin-1 is incorrect, that really makes me question whether having latin-1 there was a good idea from the start.

So I'm definitely pro switching to utf-8 as the default, as it would make the API work like many (including me) would expect.

/Emil
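
For reference, the "try them one after the other" tactic discussed above (US-ASCII, then Latin-1, then UTF-8) could be sketched roughly as follows. This is only an illustration: the function name `encode_with_fallback` and its return shape are made up for this example and are not part of any existing API.

```python
def encode_with_fallback(text, encodings=("us-ascii", "latin-1", "utf-8")):
    """Encode text with the first candidate encoding that can represent it.

    Hypothetical helper for illustration only. Returns a tuple of
    (encoded_bytes, encoding_name) so the caller knows which one was used.
    """
    for name in encodings:
        try:
            return text.encode(name), name
        except UnicodeEncodeError:
            # This encoding cannot represent the text; try the next one.
            continue
    raise ValueError("none of the candidate encodings can represent the text")


if __name__ == "__main__":
    for sample in ("hello", "caf\u00e9", "\u20ac"):
        data, used = encode_with_fallback(sample)
        print(repr(sample), used, data)
```

Note that with this tactic the encoding depends on the content of the string, so the receiver has to be told which one was picked. With a fixed utf-8 default, the same input always produces bytes in the same encoding.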