[Python-3000] PEP 3138- String representation in Python 3000

Guido van Rossum guido at python.org
Fri May 23 16:22:00 CEST 2008


On Fri, May 23, 2008 at 12:28 AM, Atsuo Ishimoto <ishimoto at gembook.org> wrote:
> On Fri, May 23, 2008 at 1:30 PM, Guido van Rossum <guido at python.org> wrote:
>
>>> One point that still remains is the default error handler for
>>> sys.stdout. I can live with the 'strict' error handler, but I think
>>> raising exceptions for every unsupported character by default is too
>>> exacting.
>>
>> I think to avoid exceptions you should arrange for the encoding to be
>> capable of encoding all characters (e.g. utf8 or utf16).
>
> A utf-8 console is fine for my personal development style, but I'm
> afraid it won't work for you. Whether or not your console is capable
> of displaying Japanese characters, you will want to see Japanese
> characters as hex escapes, won't you?

Personally, I can live with it. I rarely generate Japanese text so I
doubt it'll be a problem. I can also change the console encoding and
error handler.

>> IMO it's important to trust that you didn't write garbage, unless you
>> specifically asked for it.
>
> Is this requested by users? With Python 2, we can always print strings
> containing garbage without exceptions. Python 3 is much stricter in
> this respect. To get meaningful information instead of tracebacks, we
> need to know the encoding of the output device and the characters to
> be printed whenever we print strings. This is hard to accomplish in
> practice.

Tracebacks should always go to stderr.

What I meant by "not writing garbage" was for some app that e.g. acts
like a filter or otherwise produces output (on stdout) for another
program to consume. The other program might not understand \u escapes.
I'd rather trap this when writing, not when reading the garbage
several stages later.
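
A small sketch of the difference, assuming an ASCII-only pipe between the
two programs (the encoding and the variable names are only for
illustration): with backslashreplace the consumer silently receives the
escape text instead of the characters, while strict fails at the point of
writing.

```python
original = "\u30d1\u30a4"  # two katakana characters

# With backslashreplace the write succeeds, but what travels down the
# pipe is the literal escape text.
escaped = original.encode("ascii", "backslashreplace")
seen_by_consumer = escaped.decode("ascii")

# The consumer now holds 12 ASCII characters, not the 2 originals.
round_trips = (seen_by_consumer == original)

# With strict the producer is stopped immediately instead.
try:
    original.encode("ascii", "strict")
    trapped_at_write = False
except UnicodeEncodeError:
    trapped_at_write = True
```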

IOW:

- stderr (and probably also interactive stdout): set backslashreplace
- stdout (if not interactive): strict

Default encoding taken from environment in all cases.
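
As a rough sketch of that policy (not how the interpreter sets up its
streams; pick_errors and wrap are hypothetical helpers, and "ascii" stands
in for whatever encoding the environment would supply):

```python
import io


def pick_errors(stream_name, interactive):
    """Choose an error handler following the proposal above."""
    if stream_name == "stderr" or interactive:
        return "backslashreplace"
    return "strict"


def wrap(raw, stream_name, interactive, encoding="ascii"):
    # The encoding would normally come from the environment; "ascii" is
    # an assumption that makes the two handlers behave visibly differently.
    return io.TextIOWrapper(raw, encoding=encoding,
                            errors=pick_errors(stream_name, interactive),
                            write_through=True)


# stderr: unencodable characters are written as escapes.
err_buf = io.BytesIO()
err = wrap(err_buf, "stderr", interactive=False)
err.write("\u30d1")
err.flush()

# Non-interactive stdout: the same write raises instead.
out_buf = io.BytesIO()
out = wrap(out_buf, "stdout", interactive=False)
try:
    out.write("\u30d1")
    out.flush()
    stdout_failed = False
except UnicodeEncodeError:
    stdout_failed = True
```

Whether stdout is interactive could be decided with something like
sys.stdout.isatty(), though that choice is also an assumption here.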

>> PS> I couldn't get backslashescape to work -- is this just a proposal?
>
> No. Works for me without any modifications. I tried with latest source from svn.
>
> Python 3.0a5+ (py3k:63546, May 23 2008, 13:42:06) [MSC v.1500 32 bit (Intel)] on win32
>>>> "パイソン".encode("ascii", "backslashreplace")
> b'\\u30d1\\u30a4\\u30bd\\u30f3'
> [39364 refs]

Ah, backslashreplace, not backslashescape. :-)
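
The name mix-up is easy to check with codecs.lookup_error, which raises
LookupError for a handler name that is not registered. A minimal sketch
(handler_exists is a hypothetical helper, not part of the codecs module):

```python
import codecs

text = "\u30d1\u30a4\u30bd\u30f3"  # the katakana word from the example above
escaped = text.encode("ascii", "backslashreplace")


def handler_exists(name):
    """Return True if `name` is a registered codec error handler."""
    try:
        codecs.lookup_error(name)
        return True
    except LookupError:
        return False


have_replace = handler_exists("backslashreplace")
have_escape = handler_exists("backslashescape")
```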


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

