[New-bugs-announce] [issue5110] Printing Unicode chars from the interpreter in a non-UTF8 terminal (Py3)

Ezio Melotti report at bugs.python.org
Fri Jan 30 16:26:49 CET 2009


New submission from Ezio Melotti <ezio.melotti at gmail.com>:

In Py2.x
>>> u'\u2620'
outputs u'\u2620' whereas
>>> print u'\u2620'
raises an error.

Instead, in Py3.x, both
>>> '\u2620'
and
>>> print('\u2620')
raise an error if the terminal doesn't use an encoding able to display
the character (e.g. the windows terminal used for these examples).

This is caused by the new string representation defined in PEP 3138[1].
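
The effect can be seen without a Windows console. The following is only
a sketch: it compares repr() and ascii() for the same string and then
encodes both results with cp437, which is assumed here as a stand-in for
the terminal's encoding (the actual code page may differ).

```python
s = '\u2620'

# Under PEP 3138, repr() keeps printable non-ASCII characters unescaped.
r = repr(s)
# ascii() gives the escaped, ASCII-only form instead.
a = ascii(s)

# cp437 is an assumption standing in for a legacy console encoding.
try:
    r.encode('cp437')
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# The escaped form is pure ASCII, so encoding it succeeds.
a_bytes = a.encode('cp437')
```

The interpreter hits the same failure when it writes the repr() of the
result to a stdout that uses such an encoding.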

Consider also the following example:
Py2:
>>> [u'\u2620']
[u'\u2620']
Py3:
>>> ['\u2620']
UnicodeEncodeError: 'charmap' codec can't encode character '\u2620' in
position 9: character maps to <undefined>

This means that there is no way to print lists (or other objects) that
contain characters that can't be encoded.
Two possible workarounds are:
1) encode all the elements of the list, but this is not practical;
2) use ascii(), but it adds extra "" around the output and escapes
backslashes and apostrophes (and it won't be possible to use _[0] in the
next line).
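
Both workarounds and their drawbacks can be sketched as follows. The
variable names are only illustrative, and 'ascii' with
'backslashreplace' is just one possible choice for the manual encoding.

```python
items = ['\u2620']

# Workaround 1: encode every element by hand with an explicit error handler.
encoded = [x.encode('ascii', 'backslashreplace') for x in items]

# Workaround 2: ascii() returns a str, so the prompt echoes repr() of it.
result = ascii(items)
echoed = repr(result)   # what ">>> ascii(items)" would display

# _ would then hold that string, so _[0] is its first character,
# not the first element of the list.
first = result[0]
```

Here echoed carries the extra double quotes and the doubled backslash,
and first is '[' rather than the original string.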
 
Also note that in Py3
>>> ['\ud800']
['\ud800']
>>> _[0]
'\ud800'
works, because U+D800 belongs to the category "Cs (Other, Surrogate)"
and it is escaped[2].
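
A short check of the difference between the two characters, using only
the standard unicodedata module and str.isprintable(); the variable
names are illustrative.

```python
import unicodedata

lone_surrogate = '\ud800'
skull = '\u2620'

# General category of each character in the Unicode database.
cat_surrogate = unicodedata.category(lone_surrogate)
cat_skull = unicodedata.category(skull)

# repr() escapes characters that str.isprintable() reports as
# non-printable and leaves printable ones as they are.
surrogate_repr = repr(lone_surrogate)
skull_repr = repr(skull)
```

The surrogate's repr() is pure ASCII, so it can be written to any
terminal, while the repr() of U+2620 still contains the character
itself.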

The best solution is probably to change the default error-handler of the
Python3 interactive interpreter to 'backslashreplace' in order to avoid
this behavior, but I don't know if it's possible only for ">>> foo" and
not for ">>> print(foo)" (print() should still raise an error as it does
in Py2).
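
One way this could look, as a rough sketch only: a replacement for
sys.displayhook that escapes whatever the output stream cannot encode.
The function name escaping_displayhook and the fallback to 'ascii' are
assumptions made for this example, not an existing API.

```python
import builtins
import sys

def escaping_displayhook(value):
    """Echo repr(value), escaping what the stream cannot encode."""
    if value is None:
        return
    # Clear _ first so a failing repr() does not leave a stale value.
    builtins._ = None
    text = repr(value)
    stream = sys.stdout
    # Fall back to ASCII when the stream does not report an encoding.
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    # Round-trip through the stream's encoding with backslashreplace,
    # so only the unencodable characters are turned into escapes.
    safe = text.encode(encoding, 'backslashreplace').decode(encoding)
    stream.write(safe + '\n')
    builtins._ = value
```

Installing it with sys.displayhook = escaping_displayhook would affect
only ">>> foo"; print(foo) writes to sys.stdout directly and would
still raise the error.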

This proposal was already rejected in PEP 3138[3], but there are no
links to the discussion that led to this decision.

I think this should be rediscussed and possibly changed, because, even
if I can't see the "listOfJapaneseStrings"[4], I would still rather see
a sequence of escaped chars than a UnicodeEncodeError.

[1]: http://www.python.org/dev/peps/pep-3138/
[2]: http://www.python.org/dev/peps/pep-3138/#specification
[3]: http://www.python.org/dev/peps/pep-3138/#rejected-proposals
[4]: http://www.python.org/dev/peps/pep-3138/#motivation

----------
components: Unicode
messages: 80820
nosy: ezio.melotti
severity: normal
status: open
title: Printing Unicode chars from the interpreter in a non-UTF8 terminal (Py3)
type: behavior
versions: Python 3.0

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5110>
_______________________________________

