[Python-ideas] Python 3000 TIOBE -3%
steve at pearwood.info
Wed Feb 15 00:35:11 CET 2012
> On 14/02/2012 21:43, Jim Jewett wrote:
>> On Tue, Feb 14, 2012 at 6:39 AM, Carl M. Johnson
>> <cmjohnson.mailinglist at gmail.com> wrote:
>>> OK, so concrete proposals: update the docs and maybe make a
>>> synonym for Latin-1 that makes it more semantically obvious that
>>> you're not really using it as Latin-1, just as an easy to pass through
>>> encoding. Anything else? Any bike shedding on the synonym?
>> encoding="ascii-ish" # gets the sloppiness right
>> encoding="passthrough" # I would like "ignore", if it wouldn't cause
>> confusion with the errorhandler
"Ignore" won't do. Ignore what? Everything? Don't actually run an encoder?
That doesn't even make sense!
"Passthrough" is bad too, because it perpetuates the idea that ASCII
characters are "plain text" which are bytes. Unicode strings, even those that
are purely ASCII, are not strings of bytes (except in the sense that every
data structure is a string of bytes). You can't just "pass bytes through" to
turn them into Unicode.
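To make that concrete, here is a minimal sketch (the sample word and the variable names are only illustrative). It shows that bytes and str are different types, and that decoding UTF-8 bytes as Latin-1 never fails but maps each byte to one code point, so you get mojibake rather than the original text:

```python
# Illustrative sample: a word with three non-ASCII letters, stored as UTF-8.
original = "Krähenfüße"
data = original.encode("utf-8")

# Bytes never compare equal to str, even when the content is pure ASCII.
ascii_bytes_equal_str = (b"abc" == "abc")  # False

# Latin-1 maps every byte 0x00..0xFF to the code point with the same number,
# so the decode cannot raise, but the non-ASCII letters come out garbled.
as_latin1 = data.decode("latin-1")
print(ascii(as_latin1))

# The garbling is reversible only if you encode with Latin-1 again.
round_tripped = as_latin1.encode("latin-1")
```

So the "passthrough" only works if every step treats the result as opaque; anything that looks at the characters sees garbage.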
> encoding="mojibake" # :-)
You have a smiley, but I think that's the best name I've seen yet. It's
explicit in what you get -- mojibake.
The only downside is that it's a little obscure. Not everyone knows what
mojibake is called, or calls it mojibake, although I suppose we could add
aliases to other terms such as Buchstabensalat and Krähenfüße if German users
But remind me again, why are we doing this? If you have to teach people the
why not just teach them the very slightly more complex recipe
open(filename, encoding='ascii', errors='surrogateescape')
which captures the user's intent ("I want ASCII, with some way of escaping
errors so I don't have to deal with them") much more accurately. Sometimes
brevity is *not* a virtue.
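As a rough sketch of how that recipe behaves (the file names and sample bytes below are made up for illustration), undecodable bytes are turned into lone surrogates on read and turned back into the same bytes on write, provided the same error handler is used both ways:

```python
import os
import tempfile

# Mostly-ASCII data containing two non-ASCII bytes (UTF-8 for "é").
raw = b"caf\xc3\xa9 menu\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")    # hypothetical file name
    out_path = os.path.join(tmp, "copy.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)

    # Each byte >= 0x80 is decoded to a lone surrogate in U+DC80..U+DCFF.
    # newline="" disables newline translation so the bytes round-trip exactly.
    with open(path, encoding="ascii", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Writing with the same settings restores the original bytes.
    with open(out_path, "w", encoding="ascii", errors="surrogateescape",
              newline="") as f:
        f.write(text)
    with open(out_path, "rb") as f:
        round_tripped = f.read()
```

Unlike the Latin-1 trick, the escaped bytes are visibly not real characters, so you can't mistake them for text that was decoded correctly.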