Re: [Python-Dev] [Python-3000] PEP 3138- String representation in Python 3000
2008/5/15 Atsuo Ishimoto <ishimoto@gembook.org>:
With my proposal, print("Hello\u03C8") prints "Hello\u03C8" instead of raising an exception. And print(repr("Hello\u03C8")) prints "'Hello\u03C8'", so no garbage is printed.
Now, let's say you are Greek and working on the Greek version of XP. print("Hello\u03C8") prints "Hello"+correct Greek character (GREEK SMALL LETTER PSI). And print(repr("Hello\u03C8")) prints "'Hello"+correct Greek character+"'". If you have a Greek font, you can try this by switching your command prompt with "chcp 1253" (change the code page to 1253).
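To illustrate the two cases described above, here is a minimal sketch in present-day Python 3. It assumes the console can be simulated by an in-memory stream with a fixed encoding, and it uses the standard "backslashreplace" error handler to stand in for the error mode the proposal describes. The helper name printed_bytes is hypothetical, not something from the PEP.

```python
import io

def printed_bytes(text, encoding, errors="backslashreplace"):
    """Return the bytes print() would send to a console using `encoding`.

    Hypothetical helper: the console is simulated with an in-memory buffer.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()

s = "Hello\u03c8"

# ASCII-only console: the unencodable character is escaped, no exception.
ascii_out = printed_bytes(s, "ascii")

# Greek console (code page 1253): the character itself is written.
greek_out = printed_bytes(s, "cp1253")
greek_repr = printed_bytes(repr(s), "cp1253")

print(ascii_out)
print(greek_out.decode("cp1253"), greek_repr.decode("cp1253"))
```

Note that the built-in handler writes the escape with lowercase hex digits (\u03c8).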
[...]
Python has detected the user's capabilities since Python 2.x (or 1.6? I forget). On Windows, Python detects the user's encoding from the code page. On Unix, the locale is used to detect the encoding.
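The detected values can be inspected directly. The following is a small sketch for current Python 3 versions; exactly which encoding is reported depends on the platform, the Python version, and whether output is redirected, so no particular result is assumed here.

```python
import locale
import sys

# Encoding Python derives from the locale (Unix) or the code page (Windows).
preferred = locale.getpreferredencoding(False)

# Encoding and error mode attached to standard output; either may be
# missing if sys.stdout has been replaced by an object without them.
stdout_encoding = getattr(sys.stdout, "encoding", None)
stdout_errors = getattr(sys.stdout, "errors", None)

print(preferred, stdout_encoding, stdout_errors)
```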
Ah, thanks. I hadn't realised this - I've had trouble printing Unicode in the past, and assumed it was a result of Windows' strange console handling (OEM code pages vs Windows code pages confuse me). I use Unicode so rarely that it wasn't worth worrying about it, though. I guess the problem was my understanding, rather than code page detection not working. Sorry for the confusion.
Like it or not, a large proportion of Python's users still work in environments where much of the Unicode character space is not displayed readably.
I agree. So rejecting my proposal as "not a common use case" might be reasonable. But I should argue to get sympathy, anyway :).
As Oleg pointed out, my comment "a large proportion" was a guess, and an unfounded one at that. And regardless, you definitely have my sympathy, this is an issue that needs solving :-) (Heck, just the fact that you have to write your emails to this group in a foreign language is enough to get you my sympathy!!!)
I can understand your concern. Perhaps you don't want to see your terminal flash from escape sequences, beep, print endless graphic characters, etc. For legacy byte-string applications (whether written in C or Python), printing an arbitrary string can cause such a mess. But this is unlikely to happen when printing a Unicode string, since the characters your terminal cannot understand will be escaped or converted to a character such as '?'.
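As an illustration of the escaping and '?' substitution mentioned above, here is a short sketch using the standard codec error handlers. It assumes a Western European terminal encoding (cp1252), which has no Greek letters; the variable names are only for illustration.

```python
text = "Hello\u03c8"

# Default 'strict' mode: an unencodable character raises an exception.
try:
    text.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' mode: the character is converted to '?'.
replaced = text.encode("cp1252", errors="replace")

# 'backslashreplace' mode: the character is written as an escape sequence.
escaped = text.encode("cp1252", errors="backslashreplace")

print(strict_failed, replaced, escaped)
```

In every mode the terminal only receives bytes that are valid in its own encoding, so it never sees stray control sequences from the unencodable character.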
Ah, that's what the switching of the error mode is for. I understand more clearly now.
Hope this helps.
It does - thanks for being patient with me.
Paul.