[Python-Dev] Bytes path related questions for Guido
walter at livinglogic.de
Fri Aug 29 12:09:54 CEST 2014
On 28 Aug 2014, at 19:54, Glenn Linderman wrote:
> On 8/28/2014 10:41 AM, R. David Murray wrote:
>> On Thu, 28 Aug 2014 10:15:40 -0700, Glenn Linderman
>> <v+python at g.nevcal.com> wrote:
>> Also for
>> cases where the data stream is *supposed* to be in a given encoding,
>> but contains undecodable bytes. Showing the stuff that incorrectly
>> decodes as whatever it decodes to is generally what you want in that case.
> Sure, people can learn to recognize mojibake for what it is, and maybe
> even learn to recognize it for what it was intended to be, in limited
> domains. But suppressing/replacing the surrogates doesn't help with
> that... would it not be better to replace the surrogates with an
> escape sequence that shows the original, undecodable, byte value?
> Like \xNN ?
For that we could extend the "backslashreplace" codec error callback, so
that it can be used for decoding too, not just for encoding. I.e.
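A minimal sketch of the idea, not the actual codec change: it registers a separate error callback under the hypothetical name "backslashreplace_decode" (both the name and the helper function are assumptions made for illustration) and uses it to turn undecodable bytes, or surrogates produced by "surrogateescape", into \xNN escapes.

```python
import codecs


def backslashreplace_decode(exc):
    """Hypothetical decode-side counterpart of "backslashreplace"."""
    if not isinstance(exc, UnicodeDecodeError):
        # This sketch only covers decoding errors.
        raise exc
    # exc.object is the input bytes; start/end delimit the undecodable run.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("\\x%02x" % byte for byte in bad)
    # Return the replacement text and the position at which to resume.
    return replacement, exc.end


codecs.register_error("backslashreplace_decode", backslashreplace_decode)

# Undecodable byte in a stream that is supposed to be UTF-8.
text = b"a\xffb".decode("utf-8", "backslashreplace_decode")

# A str carrying a surrogate from "surrogateescape": round-trip it to
# bytes, then decode again so the original byte value is shown.
escaped = b"caf\xe9".decode("utf-8", "surrogateescape")
shown = escaped.encode("utf-8", "surrogateescape").decode(
    "utf-8", "backslashreplace_decode"
)

print(text)
print(shown)
```

With this callback the original byte value stays visible in the output instead of being dropped or replaced by a placeholder character.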