[Python-ideas] PEP 540: Add a new UTF-8 mode
Victor Stinner
victor.stinner at gmail.com
Thu Jan 5 21:42:26 EST 2017
2017-01-06 3:10 GMT+01:00 Stephen J. Turnbull
<turnbull.stephen.fw at u.tsukuba.ac.jp>:
> The point of this, I suppose, is that piping to xargs works by
> default.
Please read the second (latest) version of my PEP 540, which
contains a new "Use Cases" section that helps to define the issues and
the behaviour of the different modes.
> I haven't read the PEPs (don't have time, mea culpa), but my ideal
> would be three options:
>
> --transparent -> errors=surrogateescape on input and output
> --postel -> errors=surrogateescape on input, =strict on output
> --unicode-me-harder -> errors=strict on input and output
PEP 540:
--postel is the default
--transparent is the UTF-8 mode
--unicode-me-harder is the UTF-8 mode configured to strict
The POSIX locale enables --transparent.
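
To make the mapping concrete, here is a minimal sketch of the three
options as (input, output) error handler pairs, using the definitions
quoted above. The names MODES, roundtrip and outcome are hypothetical
helpers for illustration only; they are not part of the PEP, and the
PEP's per-stream behaviour may differ in detail.

```python
# Hypothetical table: mode name -> (input error handler, output error handler),
# taken from the three options quoted in this thread.
MODES = {
    "postel": ("surrogateescape", "strict"),
    "transparent": ("surrogateescape", "surrogateescape"),
    "unicode-me-harder": ("strict", "strict"),
}


def roundtrip(raw: bytes, mode: str) -> bytes:
    """Decode raw as UTF-8 on input, then re-encode it on output."""
    in_errors, out_errors = MODES[mode]
    text = raw.decode("utf-8", errors=in_errors)
    return text.encode("utf-8", errors=out_errors)


def outcome(raw: bytes, mode: str):
    """Return the round-tripped bytes, or the name of the error raised."""
    try:
        return roundtrip(raw, mode)
    except UnicodeError as exc:
        return type(exc).__name__


# A file name containing a Latin-1 byte that is not valid UTF-8.
bad = b"caf\xe9.txt"

for name in MODES:
    print(name, outcome(bad, name))
```

With this input, "transparent" passes the bytes through unchanged,
"postel" accepts the input but fails on output, and
"unicode-me-harder" already fails on input.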
> with --postel being default. Unix aficionados with lots of xargs use
> can use --transparent. Since people have different preferences, I
> guess there should be an envvar for this.
The PEP adds a new -X utf8 command-line option and a PYTHONUTF8
environment variable to configure the UTF-8 mode.
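
As a rough sketch of how the two switches can be checked, the snippet
below starts a child interpreter with each of them and reports whether
the UTF-8 mode is on. It assumes CPython 3.7 or later, where PEP 540
was implemented and the flag is exposed as sys.flags.utf8_mode; the
probe helper is a hypothetical name used only for this example.

```python
import os
import subprocess
import sys

# One-liner run in the child: print 1 if the UTF-8 mode is enabled, else 0.
PROBE = "import sys; print(sys.flags.utf8_mode)"


def probe(extra_args=(), extra_env=None):
    """Run a child interpreter and return its sys.flags.utf8_mode as a string."""
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # start from a known state
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print("-X utf8      ->", probe(extra_args=("-X", "utf8")))
print("PYTHONUTF8=1 ->", probe(extra_env={"PYTHONUTF8": "1"}))
```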
> Others probably should configure open() by open().
My PEP 540 does change the encoding used by open() by default:
https://www.python.org/dev/peps/pep-0540/#encoding-and-error-handler
Obviously, you can still explicitly set the encoding when calling open().
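
For example, an explicit encoding and error handler passed to open()
is used as given, regardless of the default. This is a small
sketch; read_text_explicit and the sample file are hypothetical and
only illustrate the call.

```python
import os
import tempfile


def read_text_explicit(path, encoding="utf-8", errors="strict"):
    """Read a text file with an explicitly chosen encoding and error handler."""
    with open(path, encoding=encoding, errors=errors) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    # Write Latin-1 bytes, which are not valid UTF-8.
    with open(path, "wb") as f:
        f.write("h\u00e9llo\n".encode("latin-1"))

    # The caller decides how the bytes are decoded.
    as_latin1 = read_text_explicit(path, encoding="latin-1")
    as_utf8_escaped = read_text_explicit(path, errors="surrogateescape")

print(repr(as_latin1))
print(repr(as_utf8_escaped))
```

Reading the same file as UTF-8 with errors="strict" would raise
UnicodeDecodeError instead.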
> I'll try to get to the PEPs over the weekend but can't promise.
Please read at least the abstract of my PEP 540 ;-)
Victor