[Python-ideas] Py3 unicode impositions

Carl M. Johnson cmjohnson.mailinglist at gmail.com
Sun Feb 12 03:27:27 CET 2012

On Feb 11, 2012, at 12:40 AM, Paul Moore wrote:

> In Python 2, I can ignore the issue. Sure, I can end up with mojibake,
> but for my uses, that's not a disaster. Mostly-readable works. But in
> Python 3, I get an error and can't process the file.
> I can just use latin-1, or surrogateescape. But that doesn't come
> naturally to me yet. Maybe it will in time... Or maybe there's a
> better solution I don't know about yet.

I'm confused about what you're asking for. Setting errors to surrogateescape or the encoding to Latin-1 makes Python 3 behave exactly the same way as Python 2: it does the "wrong" thing and may produce mojibake, but at least it doesn't corrupt anything new, so long as whatever you add to the file is ASCII. The only way to make Python 3 slightly more like Python 2 would be to make errors="surrogateescape" the default instead of expecting the programmer to know to use it. I think that would be going too far, but it could be done. I think it would be simpler, though, just to publicize errors="surrogateescape" more.
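For anyone who hasn't tried it, here is a minimal sketch of what that looks like in practice. The file name and sample bytes are made up for illustration, and newline="" is passed only so the byte-for-byte comparison at the end isn't affected by newline translation.

```python
import os
import tempfile

# Hypothetical input: "café" encoded as Latin-1, which is not valid UTF-8.
raw = b"caf\xe9\n"

path = os.path.join(tempfile.mkdtemp(), "legacy.txt")
with open(path, "wb") as f:
    f.write(raw)

# With the default (strict) error handler, decoding as UTF-8 fails.
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# surrogateescape maps each undecodable byte to a lone surrogate
# (0xE9 becomes U+DCE9) instead of raising.
with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
    text = f.read()

# Add some ASCII, then write back with the same handler: the escaped
# surrogate is turned back into the original byte.
text += "added in ASCII\n"
with open(path, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
    f.write(text)

with open(path, "rb") as f:
    round_tripped = f.read()

# Latin-1 round-trips arbitrary bytes too, since every byte value
# maps to exactly one code point.
latin1_ok = raw.decode("latin-1").encode("latin-1") == raw
```

In both cases the original bytes survive untouched. The difference is that Latin-1 silently gives you "wrong" characters for non-Latin-1 data, while surrogateescape leaves lone surrogates in the string, which will raise if you later encode them without the same handler.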

"Dear people who don't care about encodings and don't want to take the time to get them right, just put errors='surrogateescape' into your open commands and Python 3 will behave almost exactly like Python 2. The end." 

Is that really so hard? I'm confused about what else people want. 
