the stupid encoding problem to stdout

Laurent Claessens moky.math at gmail.com
Fri Jun 10 01:47:46 EDT 2011


On 09/06/2011 04:18, Sérgio Monteiro Basto wrote:
 > hi,
 > cat test.py
 > #!/usr/bin/env python
 > #-*- coding: utf-8 -*-
 > u = u'moçambique'
 > print u.encode("utf-8")
 > print u
 >
 > chmod +x test.py
 > ./test.py
 > moçambique
 > moçambique


The following script tries to encode the text before printing it. If you 
pass an object that is already a UTF-8 byte string, it just prints it; 
if not (a unicode object), it encodes it first. Every "print" statement 
goes through MyPrint.write.

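#!/usr/bin/env python
#-*- coding: utf-8 -*-

import sys

class MyPrint(object):
    """Replace sys.stdout and make sure only UTF-8 bytes reach it."""
    def __init__(self):
        # Keep the real stdout, then install this object in its place.
        self.old_stdout = sys.stdout
        sys.stdout = self

    def write(self, text):
        try:
            # A unicode object is encoded to UTF-8 here.
            encoded = text.encode("utf8")
        except UnicodeDecodeError:
            # A non-ASCII byte string (Python 2 str) is first decoded as
            # ASCII by encode(), which fails: assume it is already UTF-8.
            encoded = text
        self.old_stdout.write(encoded)


MyPrint()

u = u'moçambique'
print u.encode("utf-8")
print u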

Test:

$ ./test.py
moçambique
moçambique

$ ./test.py > test.txt
$ cat test.txt
moçambique
moçambique
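
The wrapper relies on the difference between text (a unicode object) and 
bytes that are already UTF-8 encoded. A small sketch of that difference, 
written with \u escapes so that it does not depend on the source file 
encoding; the variable names are only illustrative:

```python
# -*- coding: utf-8 -*-
text = u"mo\u00e7ambique"        # text: 10 characters
encoded = text.encode("utf-8")   # bytes: the c-cedilla becomes 2 bytes

print(len(text))      # number of characters
print(len(encoded))   # number of bytes, one more than the characters

# Decoding the bytes as UTF-8 gives the original text back.
assert encoded.decode("utf-8") == text
```

Printing the encoded form sends those bytes to the terminal unchanged, 
while printing the text form needs an encoding to be chosen first.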


By the way, my code will not help with error messages. I think that 
errors are printed through sys.stderr.write. So if you want to do
raise Exception("moçambique")
you should think about adding stderr handling to the class MyPrint.
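
A minimal sketch of how one wrapper could serve both stdout and stderr. 
This is not the code from the message above: the class name Utf8Writer 
and the helper demo() are hypothetical, and it uses an isinstance check 
instead of the try/except. It assumes the wrapped stream accepts byte 
strings.

```python
# -*- coding: utf-8 -*-
import io

# Text type: `unicode` on Python 2, `str` on Python 3.
TEXT_TYPE = type(u"")


class Utf8Writer(object):
    """Hypothetical wrapper: encode text to UTF-8, pass byte strings through."""

    def __init__(self, stream):
        # `stream` must accept byte strings (a Python 2 file object,
        # io.BytesIO, or sys.stdout.buffer on Python 3).
        self.stream = stream

    def write(self, text):
        if isinstance(text, TEXT_TYPE):
            text = text.encode("utf-8")
        self.stream.write(text)

    def flush(self):
        self.stream.flush()


def demo():
    """Write the same word as text and as UTF-8 bytes; return what was written."""
    buf = io.BytesIO()
    out = Utf8Writer(buf)
    out.write(u"mo\u00e7ambique")                  # text, gets encoded
    out.write(u"mo\u00e7ambique".encode("utf-8"))  # bytes, passed through
    return buf.getvalue()


# On Python 2 the wrapper could be installed for both streams (assumption:
# nothing else has replaced them already):
#     sys.stdout = Utf8Writer(sys.stdout)
#     sys.stderr = Utf8Writer(sys.stderr)
```

Taking the stream as a parameter means the same class can wrap either 
stream, and it can be tried against an in-memory buffer first, as demo() 
does.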


If you know French, I strongly recommend "Comprendre les erreurs 
unicode" by Victor Stinner:
http://dl.afpy.org/pycon-fr-09/Comprendre_les_erreurs_unicode.pdf

Have a nice day
Laurent


