[Python-ideas] Python 3000 TIOBE -3%

MRAB python at mrabarnett.plus.com
Wed Feb 15 01:44:19 CET 2012

On 14/02/2012 23:35, Steven D'Aprano wrote:
> MRAB wrote:
>>  On 14/02/2012 21:43, Jim Jewett wrote:
>>>  On Tue, Feb 14, 2012 at 6:39 AM, Carl M. Johnson
>>>  <cmjohnson.mailinglist at gmail.com>   wrote:
>>>>   OK, so concrete proposals: update the docs and maybe make a
>>>>   synonym for Latin-1 that makes it more semantically obvious that
>>>>   you're not really using it as Latin-1, just as a easy to pass through
>>>>   encoding. Anything else? Any bike shedding on the synonym?
>>>  encoding="ascii-ish"  # gets the sloppyness right
>>>  encoding="passthrough"  # I would like "ignore", if it wouldn't cause
>>>  confusion with the errorhandler
> "Ignore" won't do. Ignore what? Everything? Don't actually run an encoder?
> That doesn't even make sense!
> "Passthrough" is bad too, because it perpetrates the idea that ASCII
> characters are "plain text" which are bytes. Unicode strings, even those that
> are purely ASCII, are not strings of bytes (except in the sense that every
> data structure is a string of bytes). You can't just "pass bytes through" to
> turn them into Unicode.
>>>  encoding="binpass"
>>>  encoding="rawbytes"
>>  encoding="mojibake" # :-)
> You have a smiley, but I think that's the best name I've seen yet. It's
> explicit in what you get -- mojibake.
> The only downside is that it's a little obscure. Not everyone knows what
> mojibake is called, or calls it mojibake, although I suppose we could add
> aliases to other terms such as Buchstabensalat and Krähenfüße if German users
> complain<wink>
Alternatively, "vreemdetekens" or "alfabetsoep"...

> But remind me again, why are we doing this? If you have to teach people the
> recipe
>       open(filename, encoding='mojibake')
> why not just teach them the very slightly more complex recipe
>       open(filename, encoding='ascii', errors='surrogateescape')
> which captures the user's intent ("I want ASCII, with some way of escaping
> errors so I don't have to deal with them") much more accurately. Sometimes
> brevity is *not* a virtue.

More information about the Python-ideas mailing list