[Python-ideas] Fall back to encoding unicode strings in utf-8 if latin-1 fails in http.client
Emil Stenström
em at kth.se
Thu Jan 7 08:11:01 EST 2016
On 2016-01-07 13:59, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
>
>> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
>> UTF-8" as the effective default encoding is not a good idea, IMO;
>
> I'm curious what your reasoning is. That seems to be fairly common
> behaviour with some email clients, for example I seem to recall that
> Thunderbird will try encoding emails as US-ASCII, if that fails,
> Latin-1, and only send UTF-8 if the other two don't work.
>
> I'm not defending this tactic, but wondering what you have against it.
I'm fine with either tactic: defaulting to utf-8, or trying the encodings
one after the other. The important thing for me is that the API works the
way many people expect.
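To make the "one after the other" tactic concrete, here is a minimal sketch. The helper name encode_body and its encodings parameter are made up for illustration; this is not how http.client is implemented, just the fallback idea in isolation.

```python
def encode_body(text, encodings=("latin-1", "utf-8")):
    """Encode text with the first encoding that can represent it.

    Hypothetical helper for illustration only; not part of http.client.
    Returns a (data, encoding) tuple so the caller knows which one was used.
    """
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for encoding in encodings:
        try:
            return text.encode(encoding), encoding
        except UnicodeEncodeError as exc:
            # This encoding cannot represent the text; try the next one.
            last_error = exc
    # Every candidate failed (e.g. a lone surrogate); re-raise the last error.
    raise last_error


# Text that fits in latin-1 is left as latin-1; anything else falls back.
data, used = encode_body("\u20ac100")  # euro sign is not in latin-1
```

Returning the chosen encoding alongside the bytes matters here, since the caller would otherwise have no way to declare the right charset to the server.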
My main reason for not changing the default was that it would break
backwards compatibility, but only for the case where people sent latin-1
strings as if they were unicode strings.
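A small sketch of what that compatibility break would look like on the wire, using only str.encode and bytes.decode. The sample string is arbitrary; the point is that text which fits in latin-1 produces different bytes once the default becomes utf-8.

```python
# Arbitrary sample text whose characters all exist in latin-1.
text = "caf\u00e9"

as_latin1 = text.encode("latin-1")  # one byte for the accented character
as_utf8 = text.encode("utf-8")      # two bytes for the accented character

# A server that still assumes latin-1 would misread the utf-8 bytes.
misread = as_utf8.decode("latin-1")

print(as_latin1, as_utf8, misread)
```

Pure ASCII text is unaffected, since both encodings produce identical bytes for it.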
If the reading of the spec that led to using latin-1 is incorrect, that
really makes me question whether having latin-1 there was a good idea from
the start.
So I'm definitely pro switching to utf-8 as default as it would make the
API work like many (including me) would expect.
/Emil