On Thu, Jan 7, 2016, at 07:59, Steven D'Aprano wrote:
On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
It makes sense, but I disagree with the suggestion. Having "Latin-1 or UTF-8" as the effective default encoding is not a good idea, IMO;
I'm curious what your reasoning is. That seems to be fairly common behaviour with some email clients; for example, I seem to recall that Thunderbird will try encoding emails as US-ASCII, then Latin-1 if that fails, and only send UTF-8 if the other two don't work.
Sure, but it includes a content-type header with a charset parameter. I think the behavior of encoding text but not including a charset parameter is fundamentally broken. If the user supplies a charset parameter, it should try to use the matching encoding, otherwise it should pick an encoding (whether that is "always UTF-8" or some other rule) and add the charset parameter.
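To make that rule concrete, here is a rough sketch in Python. The names pick_charset and encode_text are made up for illustration and are not any existing library API, and the US-ASCII, Latin-1, UTF-8 fallback order is only one possible rule (the one recalled for Thunderbird above); "always UTF-8" would work just as well. The point is only that the chosen charset always ends up in the content-type value.

```python
def pick_charset(text):
    """Hypothetical rule: the narrowest of US-ASCII, Latin-1, UTF-8 that fits."""
    for candidate in ("us-ascii", "iso-8859-1"):
        try:
            text.encode(candidate)
            return candidate
        except UnicodeEncodeError:
            pass
    return "utf-8"


def encode_text(text, mime_type="text/plain", charset=None):
    """Return (payload_bytes, content_type_value).

    If the caller supplies a charset, use the matching encoding and let
    UnicodeEncodeError propagate if the text does not fit. Otherwise pick
    an encoding by rule. Either way the charset parameter is included, so
    the receiver never has to guess.
    """
    if charset is None:
        charset = pick_charset(text)
    payload = text.encode(charset)
    return payload, f"{mime_type}; charset={charset}"


if __name__ == "__main__":
    for sample in ("hello", "caf\u00e9", "\u20ac5"):
        print(encode_text(sample))
    # Caller-supplied charset is honoured rather than second-guessed.
    print(encode_text("caf\u00e9", charset="utf-8"))
```

With this shape, passing charset="us-ascii" for text that does not fit raises an error instead of silently switching to a different encoding.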