[Python-Dev] [Python-3000] PEP 3138- String representation in Python 3000

Paul Moore p.f.moore at gmail.com
Fri May 16 09:49:22 CEST 2008


2008/5/15 Atsuo Ishimoto <ishimoto at gembook.org>:

> With my proposal, print("Hello\u03C8") prints "Hello\u03C8" instead of
> raising an exception. And print(repr("Hello\u03C8")) prints
> "'Hello\u03C8'", so no garbage is printed.
>
> Now, let's say you are Greek and working on the Greek version of XP.
> print("Hello\u03C8") prints "Hello" + the correct Greek character
> (GREEK SMALL LETTER PSI). And print(repr("Hello\u03C8")) prints
> "'Hello" + the correct Greek character + "'". If you have a Greek
> font, you can try this by switching your command prompt with
> "chcp 1253" (change codepage to 1253).
>
[...]
> Python has detected the user's capabilities since Python 2.x (or 1.6?
> I forgot). On Windows, Python detects the user's encoding from the
> codepage. On Unix, the locale is used to detect the encoding.

Ah, thanks. I hadn't realised this - I've had trouble printing Unicode
in the past, and assumed it was a result of Windows' strange console
handling (OEM code pages vs Windows code pages confuse me). I use
Unicode so rarely that it wasn't worth worrying about it, though.

I guess the problem was my understanding, rather than code page
detection not working. Sorry for the confusion.
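
For anyone who wants to see what Python has detected, here is a minimal
sketch. The function name describe_output_encoding is made up for this
illustration; sys.stdout.encoding and locale.getpreferredencoding() are
standard library facilities, and the values they report depend entirely
on the local console or locale settings.

```python
import locale
import sys


def describe_output_encoding(stream=None):
    """Report the encoding Python associates with an output stream.

    Hypothetical helper for illustration only. Some replacement streams
    have no 'encoding' or 'errors' attribute, so getattr() falls back
    to None instead of raising.
    """
    if stream is None:
        stream = sys.stdout
    return {
        # Encoding chosen for the stream (from the code page on Windows,
        # from the locale on Unix, unless overridden).
        "stream_encoding": getattr(stream, "encoding", None),
        # Error handler used when a character cannot be encoded.
        "stream_errors": getattr(stream, "errors", None),
        # Encoding the locale module reports for text data.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for key, value in describe_output_encoding().items():
        print(key, "=", value)
```

Running it before and after "chcp 1253" on a Windows console should show
whether the reported stream encoding follows the code page.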

>> Like it or not, a large proportion of Python's users still work in
>> environments where much of the Unicode character space is not
>> displayed readably.
>>
>
> I agree. So rejecting my proposal as "Not common use-case" might be
> reasonable. But I should argue to get sympathy, anyway:).

As Oleg pointed out, my comment "a large proportion" was a guess, and
an unfounded one at that. And regardless, you definitely have my
sympathy; this is an issue that needs solving :-) (Heck, just the fact
that you have to write your emails to this group in a foreign language
is enough to get you my sympathy!!!)

> I can understand your concern. Perhaps you don't want to see your
> terminal flash from escape sequences, beep, show endless graphic
> characters, etc. For legacy byte-string applications (whether written
> in C or Python), printing an arbitrary string can cause such a mess.
> But this is unlikely to happen when printing a Unicode string, since
> the characters your terminal cannot understand will be escaped or
> converted to a character such as '?'.

Ah, that's what the switching of the error mode is for. I understand
more clearly now.
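
The effect of the error mode can be sketched with the codec error
handlers that str.encode() already accepts. This is only an
illustration of the idea, not the PEP's implementation; the helper name
render_for_terminal is hypothetical, and the "ascii" encoding stands in
for a terminal that cannot display Greek.

```python
def render_for_terminal(text, encoding, errors="backslashreplace"):
    """Return the text as a terminal using 'encoding' would receive it.

    Hypothetical helper: encode with the chosen error handler, then
    decode again so the result can be compared as a str.
    """
    return text.encode(encoding, errors).decode(encoding)


greeting = "Hello\u03C8"  # ends with GREEK SMALL LETTER PSI

# A terminal limited to ASCII: the character is escaped, not dropped.
print(render_for_terminal(greeting, "ascii"))             # Hello\u03c8

# Same terminal with the "replace" handler: the character becomes '?'.
print(render_for_terminal(greeting, "ascii", "replace"))  # Hello?

# The Greek code page can represent the character, so nothing changes.
print(render_for_terminal(greeting, "cp1253") == greeting)  # True

# The "strict" handler is the case that raises an exception.
try:
    render_for_terminal(greeting, "ascii", "strict")
except UnicodeEncodeError as exc:
    print("strict mode failed:", exc.reason)
```

With "backslashreplace" or "replace", printing never raises for an
unencodable character, which is the behaviour described in the quoted
paragraph.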

> Hope this helps.

It does - thanks for being patient with me.

Paul.

