[Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add bytes...)

Dan Stromberg drsalists at gmail.com
Thu Jan 9 05:15:54 CET 2014


On Wed, Jan 8, 2014 at 2:04 PM, Kristján Valur Jónsson
<kristjan at ccpgames.com> wrote:
>
> Believe it or not, sometimes you really don't care about encodings.
> Sometimes you just want to parse text files.  Python 3 forces you to think about abstract concepts like encodings when all you want is to open that .txt file on the drive and extract some phone numbers and merge in some email addresses.  What encoding does the file have?  Do I care?  Must I care?

If computers had taken off in China before the USA, you'd probably be
wondering why some Chinese refuse to care about encodings, when the
rest of the world clearly needs them.

Yes, you really should care about encodings.  No, it's not quite as
simple for English speakers as it once was.  It was formerly simple
(for us) because we were effectively pressing everyone else to read
and write English.

If you want to keep things close to what you're used to, use latin-1
as your encoding.  It's still a choice, and not a great one for
user-facing text, but if you want to be simplistic about it, that's a
way to do it.
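
To make that concrete, here is a rough sketch (the sample bytes and the
phone number pattern are made up for illustration).  latin-1 maps each
byte 0..255 to the code point with the same value, so decoding never
raises and re-encoding gives back the original bytes:

```python
import re

# Every possible byte value decodes under latin-1, and the round trip
# back to bytes is lossless.
raw = bytes(range(256))
text = raw.decode("latin-1")
roundtrip_ok = text.encode("latin-1") == raw

# Hypothetical file contents with one non-ASCII byte (0xF3, "ó" in latin-1).
sample = b"Call J\xf3nsson at 555-0100 or 555-0199\n"

# Made-up pattern for illustration: three digits, a dash, four digits.
numbers = re.findall(r"\d{3}-\d{4}", sample.decode("latin-1"))
print(roundtrip_ok, numbers)
```

The catch is that if the file was really written in some other encoding
(UTF-8, say), the non-ASCII characters will come out as the wrong
characters, even though the ASCII parts such as the digits survive.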

That said, there will be some text that isn't user-facing, e.g. in a
network protocol.  This is probably what all the fuss is about.  But
like I said, this can be done with latin-1.
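
For the protocol case, something along these lines might do.  This is
only a sketch: parse_header_line is a hypothetical helper, not anything
in the stdlib, and the "Name: value" line format is an assumption
loosely modeled on HTTP-style headers:

```python
def parse_header_line(line):
    """Split one 'Name: value' line (bytes) into a (name, value) str pair.

    Hypothetical helper for illustration.  Decoding as latin-1 cannot
    fail, whatever bytes the peer sent.
    """
    text = line.decode("latin-1")
    name, _, value = text.partition(":")
    return name.strip().lower(), value.strip()


name, value = parse_header_line(b"Content-Type: text/plain\r\n")
print(name, value)

# Bytes outside ASCII pass through unchanged and can be recovered exactly.
_, opaque = parse_header_line(b"X-Token: \xff\xfe\x80")
original = opaque.encode("latin-1")
```

The str you get back is really just bytes in disguise here, so it's
fine for matching ASCII keywords but shouldn't be shown to a user as
if it were properly decoded text.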

