[Python-ideas] Fix default encodings on Windows

Chris Barker chris.barker at noaa.gov
Tue Aug 16 12:12:32 EDT 2016


Thanks for the clarity, Steve. A couple of questions/thoughts:

> The choices are:
>
> * don't represent them at all (remove bytes API)
>

Would the bytes API be removed on *nix also?


> * convert and drop characters not in the (legacy) active code page
> * convert and fail on characters not in the (legacy) active code page
>

"Failure is not an option" -- These two seem like a plain old bad idea.
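
To make the objection concrete, here is a minimal sketch of what those two
options amount to. It is only an illustration: cp1252 stands in for the
"active code page" (an assumption; the real one depends on the machine), and
Python's codec error handlers stand in for whatever the Windows conversion
would actually do.

```python
# Illustration only: cp1252 is an assumed stand-in for the active code page.
# The name is "naive" with a diaeresis, then two CJK characters, then ".txt".
name = "na\u00efve_\u65e5\u672c.txt"

# "convert and drop": characters outside the code page are lost.
replaced = name.encode("cp1252", errors="replace")  # each becomes b"?"
ignored = name.encode("cp1252", errors="ignore")    # each silently vanishes

# "convert and fail": the same conversion raises instead.
try:
    name.encode("cp1252", errors="strict")
    failed = False
except UnicodeEncodeError:
    failed = True

print(replaced)  # b'na\xefve_??.txt'
print(ignored)   # b'na\xefve_.txt'
print(failed)    # True
```

Either way the original name can no longer be recovered from the bytes, which
is the problem with both options.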

> * convert and fail on invalid surrogate pairs
>

Where would an invalid surrogate pair come from? Never from a file system
API call, yes?
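
For reference, this is what I mean by an invalid surrogate pair, as a small
sketch in Python. The string below is made up for illustration and does not
come from any real API call; it only shows how a lone surrogate behaves under
strict encoding versus the `surrogatepass` error handler.

```python
# Hypothetical name containing an unpaired high surrogate (U+D800).
lone = "file_\ud800.txt"

# Strict UTF-8 refuses to encode a lone surrogate.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With surrogatepass, the 16-bit code unit is passed through unchanged,
# so the value survives a round trip through bytes.
raw = lone.encode("utf-16-le", errors="surrogatepass")
round_tripped = raw.decode("utf-16-le", errors="surrogatepass")

print(strict_ok)              # False
print(round_tripped == lone)  # True
```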

> * represent them as UTF-16-LE in bytes (with embedded '\0' everywhere)
>

Would this be doing anything, or just keeping whatever the Windows API
takes/returns, i.e. exactly what is done on *nix?
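
As a quick sketch of what "embedded '\0' everywhere" looks like in practice
(the path is a made-up example, and this only shows the byte layout, not what
any proposed API would return):

```python
# Hypothetical ASCII-only path, encoded as UTF-16-LE.
path = "C:\\tmp"
as_utf16 = path.encode("utf-16-le")

# Each ASCII character becomes two bytes, the second of which is zero.
nul_count = as_utf16.count(b"\x00")

print(as_utf16)       # b'C\x00:\x00\\\x00t\x00m\x00p\x00'
print(len(as_utf16))  # 12
print(nul_count)      # 6
```

Any code that treats such bytes as a NUL-terminated string would stop after
the first character.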


> The fifth option is the best for round-tripping within Windows APIs.
>

How is it better? Is it only performance (i.e. no encoding/decoding
required), or would it be more reliable as well?

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov