Getting the encoding of sys.stdout and sys.stdin, and changing it properly

velle at velle.dk
Tue Jan 3 10:08:46 EST 2006


My headache is growing while playing around with Unicode in Python.
Please help this novice. I have divided my problem into a few
questions.

Python 2.3.4 (#1, Feb  2 2005, 12:11:53)
[GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2

1)
Does ">>> print 'hello'" simply write to sys.stdout?
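
One way to check this empirically is to swap sys.stdout for an in-memory
buffer for a moment and see whether the printed text lands there. The
sketch below is written for a current Python, where print is a function;
on the 2.3 interpreter shown above the statement form would be used
instead. The helper name capture_print is hypothetical.

```python
import io
import sys


def capture_print(text):
    """Print `text` while sys.stdout is swapped for an in-memory buffer.

    Returns whatever ended up in the buffer. Hypothetical helper, for
    illustration only.
    """
    original = sys.stdout
    buffer = io.StringIO()
    sys.stdout = buffer
    try:
        print(text)
    finally:
        # Always restore the real stream, even if print fails.
        sys.stdout = original
    return buffer.getvalue()


captured = capture_print('hello')
```

If captured holds the text plus a newline, print went through whatever
object sys.stdout was bound to at that moment.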

2)
Exactly what does the following line return?

>>> sys.stdout.encoding
'ISO-8859-1'

Is it the encoding of the terminal? I think not, because the value stays
the same when I change the encoding in my terminal.

Is it the encoding of the strings Python hands over to the terminal? I
think not either. In the following session I am fairly confident that the
assignment to sys.stdout changes that encoding, and yet
sys.stdout.encoding still reports the same value.

>>> import sys,codecs
>>> sys.stdout.encoding
'ISO-8859-1'
>>> sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>>> sys.stdout.encoding
'ISO-8859-1'

If it is neither of these, what does the value represent?
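
One possible explanation for the unchanged value, offered as an
assumption rather than a confirmed answer: the writer returned by
codecs.getwriter forwards attributes it does not define itself to the
stream it wraps, so "encoding" would still be looked up on the original
stream. The sketch below, written for a current Python, uses a
hypothetical FakeTerminal class in place of the real sys.stdout.

```python
import codecs
import io


class FakeTerminal(io.BytesIO):
    """Stand-in for a byte stream that advertises an encoding (hypothetical)."""
    encoding = 'ISO-8859-1'


raw = FakeTerminal()
writer = codecs.getwriter('utf-8')(raw)

writer.write(u'\xe7')          # U+00E7, the letter used in question 4

reported = writer.encoding     # not defined on the writer, so taken from raw
written = raw.getvalue()       # the bytes that actually reached the stream
```

Under that assumption, reported keeps the wrapped stream's value while
written holds the UTF-8 form of the character, which would match the
session above.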

3)
Does raw_input() come from sys.stdin?

4)
The following session fails. Can you please tell me how to do it
correctly?

>>> import codecs,sys
>>> sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>>> sys.stdin = codecs.getreader('utf-8')(sys.stdin)
>>> x = raw_input('write this unicode letter, Turkish che, unicode 0x00E7\t')
write this unicode letter, Turkish che, unicode 0x00E7  ç
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/codecs.py", line 295, in readline
    return self.decode(line, self.errors)[0]
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1:
unexpected end of data

When prompted, I simply enter the che with my Turkish keyboard layout.
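
A guess at what the traceback might mean, assuming the terminal really
sends ISO-8859-1 as sys.stdout.encoding suggests: the letter would arrive
as the single byte 0xE7, which a UTF-8 decoder reads as the start of a
three-byte sequence that never completes. The sketch below only
manipulates byte strings, so it does not depend on any terminal; it is
written for a current Python, where the error message is worded
differently from the one above.

```python
# The same character as bytes in the two encodings involved.
latin1_bytes = u'\xe7'.encode('iso-8859-1')   # one byte: 0xE7
utf8_bytes = u'\xe7'.encode('utf-8')          # two bytes: 0xC3 0xA7

# Assumption: the terminal sends the ISO-8859-1 form, followed by a newline.
line = latin1_bytes + b'\n'

try:
    line.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    # 0xE7 announces a three-byte UTF-8 sequence that never arrives.
    utf8_ok = False

# Decoding with the encoding the bytes were produced in succeeds.
as_latin1 = line.decode('iso-8859-1')
```

If that assumption holds, either the terminal would have to send UTF-8
or the reader would have to be created for the encoding the terminal
actually uses.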

velle, Denmark



