[Python-Dev] Shouldn't I be able to print Unicode objects?

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Tue, 5 Jun 2001 22:30:18 +0200

> How about if print calls the .encode("latin1") method for me when it
> gets an ASCII encoding error?  If "latin1" isn't a reasonable default
> choice, it could pick an encoding based on the current locale.

These are both bad ideas. First, there is no guarantee that your
terminal is capable of displaying the degree sign at all. Maybe the
typewriter connected to your computer doesn't even have a degree type.

Further, maybe it does support displaying the degree sign, but then it
likely fails for

>>> print u"\N{EURO SIGN}"

Or, worse, instead of displaying the EURO SIGN, it may just display
the CURRENCY SIGN (since Python may choose to use ISO-8859-15, while
the terminal assumes ISO-8859-1).
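
To make the mismatch concrete, here is a minimal sketch. It uses
Python 3 syntax, which is an assumption on my part (the print statement
above is Python 2), and it simulates the terminal by decoding the bytes
with the encoding the terminal is assumed to use.

```python
import unicodedata

euro = "\N{EURO SIGN}"

# ISO-8859-1 (Latin-1) has no euro sign, so a strict encode fails.
try:
    euro.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# ISO-8859-15 does have one, at byte 0xA4.
raw = euro.encode("iso-8859-15")

# A terminal that assumes ISO-8859-1 interprets 0xA4 differently.
shown = raw.decode("iso-8859-1")

print(latin1_ok, raw, unicodedata.name(shown))
```

No exception is raised on the second path, which is exactly why the
wrong character would go unnoticed.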

So unless you can come up with a really good way to find out what the
terminal is capable of displaying (plus finding out how to make it
display these things), I think Python is better off raising an
exception than producing garbage output.

In addition, what you see is the "default encoding", i.e. it doesn't
just apply to print; it also applies to all places where Unicode
objects are converted into byte strings.

Assuming any default other than ASCII was considered a bad idea by
the authors of the Unicode support. IMO, the next-most reasonable
default would have been UTF-8, *not* Latin-1, since UTF-8 can
represent the EURO SIGN and every other character in Unicode. Most
likely, your terminal will have difficulties producing a circle symbol
when it gets the UTF-8 representation of the DEGREE SIGN, though.
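
As a rough illustration of that last point (again in Python 3 syntax,
and again assuming a terminal that interprets bytes as ISO-8859-1), the
UTF-8 form of the DEGREE SIGN is two bytes, and such a terminal would
show two characters instead of one:

```python
degree = "\N{DEGREE SIGN}"
euro = "\N{EURO SIGN}"

# UTF-8 can encode both characters; neither call raises.
degree_bytes = degree.encode("utf-8")
euro_bytes = euro.encode("utf-8")

# Simulate a Latin-1 terminal receiving the UTF-8 bytes for the degree sign.
shown = degree_bytes.decode("iso-8859-1")

print(degree_bytes, euro_bytes, len(shown))
```

The first of the two characters displayed is LATIN CAPITAL LETTER A
WITH CIRCUMFLEX; only the second is the intended degree sign.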

So the best thing is still to leave the choice in the hands of the
application author. As MAL points out, the administrator can set a
different default encoding in site.py. Since the default default is
ASCII, applications assuming that the default is ASCII won't break on
your system. OTOH, applications developed on your system may then
break elsewhere, since the default in site.py might be different there.
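
What "in the hands of the application author" could look like is
sketched below. encode_for_output is a hypothetical helper, not
anything in the standard library, and the code is written in Python 3
syntax as an assumption. The point is only that the application names
the encoding and the error policy itself instead of relying on a
site-wide default.

```python
def encode_for_output(text, encoding="ascii", errors="strict"):
    """Hypothetical helper: encode text with an explicitly chosen codec."""
    return text.encode(encoding, errors)

temperature = "25\N{DEGREE SIGN}C"

# With the strict ASCII default, the caller gets an exception, not garbage.
try:
    encode_for_output(temperature)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The application may decide that escaped output is acceptable ...
escaped = encode_for_output(temperature, errors="backslashreplace")

# ... or that it knows the output device really uses ISO-8859-1.
latin1 = encode_for_output(temperature, encoding="iso-8859-1")

print(strict_failed, escaped, latin1)
```

Code written this way behaves the same on every system, whatever
site.py says.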