print() and unicode strings (python 3.1)
Ned Deily
nad at acm.org
Tue Aug 25 00:09:53 EDT 2009
In article
<e5e2ec2e-2b4a-4ca8-8c0f-109e5f4eb542 at v23g2000pro.googlegroups.com>,
7stud <bbxx789_05ss at yahoo.com> wrote:
> On Aug 24, 2:41 pm, "Martin v. Löwis" <mar... at v.loewis.de> wrote:
> > > I can't figure out a way to programmatically set the encoding for
> > > sys.stdout. So where does that leave me?
> >
> > You should be setting the terminal encoding administratively, not
> > programmatically.
> >
>
> The terminal encoding has always been utf-8. It was not set
> programmatically.
>
> It seems to me that python 3.1's string handling is broken.
> Apparently, in python 3.1 I am unable to explicitly set the encoding
> of a string and print() it out with the result being human readable
> text. On the other hand, if I let python do the encoding implicitly,
> python uses a codec I don't want it to.
If you are running on a Unix-y system, check your locale settings (LANG,
LC_*, et al). I think you'll likely find that your locale is not really
UTF-8. The following was run with Python 3.1 on OS X 10.5; results were
similar on Debian Linux:
$ cat t3.py
import sys
print(sys.stdout.encoding)
s = "€"
print(s.encode("utf-8"))
print(s)
$ export LANG=en_US.UTF-8
$ python3.1 t3.py
UTF-8
b'\xe2\x82\xac'
€
$ export LANG=C
$ python3.1 t3.py
US-ASCII
b'\xe2\x82\xac'
Traceback (most recent call last):
  File "t3.py", line 7, in <module>
    print(s)
UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in position 0: ordinal not in range(128)
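
The same behaviour can be reproduced without changing the shell
environment. What follows is only a minimal sketch, not part of the test
above: the helper names describe_output_encoding and write_with_encoding
are made up for illustration, and the in-memory stream merely stands in for
a sys.stdout whose encoding was derived from the locale.

```python
import io
import locale
import sys


def describe_output_encoding():
    """Report the encodings Python picked up from the environment.

    Hypothetical helper: it only reads values, it changes nothing.
    """
    return {
        "stdout": getattr(sys.stdout, "encoding", None),
        "preferred": locale.getpreferredencoding(False),
    }


def write_with_encoding(text, encoding):
    """Print text through a text stream that uses the given encoding.

    Hypothetical helper: the TextIOWrapper plays the role of sys.stdout,
    and the bytes that would have reached the terminal are returned.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    print(text, file=stream, end="")
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    print(describe_output_encoding())

    euro = "\u20ac"  # the euro sign used in t3.py

    # A UTF-8 stream accepts the character.
    print(write_with_encoding(euro, "utf-8"))

    # An ASCII stream, as with LANG=C, cannot represent it.
    try:
        write_with_encoding(euro, "ascii")
    except UnicodeEncodeError as exc:
        print("ascii stream failed:", exc)
```

The point of the sketch is that print() always encodes through the stream
it writes to, so the string itself is fine; it is the stream's encoding,
taken from the locale, that decides whether the character can be written.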
--
Ned Deily,
nad at acm.org