[Tutor] Python and unicode

Ferry Dave Jäckel dave.jaeckel at arcor.de
Sat Mar 11 18:06:16 CET 2006


Hi Michael and Kent,

thanks to your tips I was able to solve my problems! It was quite easy in 
the end.

For those interested and struggling with utf-8, ascii and unicode:

Once I understood the right way of
   - string.decode() upon input (if in question)
   - string.encode() upon output (more often than not)
   where input and output mean reading from and writing to files, file-like 
   objects, databases... and functions of some modules that are not 
   unicode-proof,
I got rid of all the calls to encode() and decode() that I had made by trial 
and error and that messed it all up. Now I have just a few calls to encode() 
and voilà! xml.sax seems to read and decode the utf-8 encoded xml file 
perfectly well, and ZipFile.read() and file.write() work too, with no 
encoding or decoding.
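
As a minimal sketch of the rule above (decode once on input, encode once on 
output), assuming Python 3, where the byte string / unicode string pair of 
python2.4 corresponds to bytes / str. The helper names read_text and 
write_text and the sample data are made up for illustration; they are not 
part of any module mentioned here.

```python
import io


def read_text(stream, encoding="utf-8"):
    """Decode once, at the input boundary: bytes in, unicode text out."""
    return stream.read().decode(encoding)


def write_text(stream, text, encoding="utf-8"):
    """Encode once, at the output boundary: unicode text in, bytes out."""
    stream.write(text.encode(encoding))


# A BytesIO stands in for a file holding utf-8 encoded bytes (sample data).
source = io.BytesIO("J\u00e4ckel voil\u00e0".encode("utf-8"))
text = read_text(source)

# Inside the program only unicode text is handled, no encode()/decode().
shouted = text.upper()

# Encode exactly once, when the text leaves the program.
sink = io.BytesIO()
write_text(sink, shouted)
result = sink.getvalue()
```

The point of the design is that encode() and decode() appear only at the 
edges of the program, never in the middle of the processing.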

For me it was very important to have it stressed that utf-8 is *not* unicode, 
although I had already read about this topic (and you can read this advice 
often here on this list).
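
A small illustration of that difference, again assuming Python 3 (str is 
unicode text, bytes is encoded data). The sample word is arbitrary; latin-1 
is used only as a second encoding for comparison.

```python
# Unicode text: a sequence of code points, independent of any encoding.
text = "voil\u00e0"

# utf-8 is one way of turning those code points into bytes...
utf8_bytes = text.encode("utf-8")

# ...and latin-1 is another, giving different bytes for the same text.
latin1_bytes = text.encode("latin-1")

# The text has 5 characters; the last one takes 2 bytes in utf-8.
text_length = len(text)
utf8_length = len(utf8_bytes)
latin1_length = len(latin1_bytes)

# Decoding with the matching codec recovers the same unicode text.
same_text = utf8_bytes.decode("utf-8") == latin1_bytes.decode("latin-1")
```

The same unicode string has several byte representations, so utf-8 is only 
one encoding of unicode, not unicode itself.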

On my system sys.stdout and sys.stderr seem to have a utf-8 and a None 
encoding, respectively (Kubuntu Linux, python2.4, ipython and konsole as 
terminal).

The wrapper suggested by Kent
  import codecs, sys
  sys.stdout = codecs.getwriter('utf-8')(sys.stdout, 'backslashreplace')
  sys.stderr = codecs.getwriter('ascii')(sys.stderr, 'backslashreplace')
solves all my output problems regarding debugging.
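
To see what such a wrapper does without touching the real streams, here is a 
rough sketch that wraps in-memory byte buffers instead. It assumes Python 3, 
where the objects to wrap would be byte streams such as sys.stdout.buffer 
rather than sys.stdout itself; the buffer names are made up for the example.

```python
import codecs
import io

# BytesIO objects stand in for byte-oriented stdout and stderr.
fake_stdout = io.BytesIO()
fake_stderr = io.BytesIO()

# Same construction as the wrapper above: codec writer plus error handler.
out = codecs.getwriter("utf-8")(fake_stdout, "backslashreplace")
err = codecs.getwriter("ascii")(fake_stderr, "backslashreplace")

# Writing unicode text encodes it on the way out.
out.write("voil\u00e0")
err.write("voil\u00e0")

# utf-8 can represent the accented letter directly.
stdout_bytes = fake_stdout.getvalue()

# ascii cannot, so 'backslashreplace' writes an escape instead of raising
# UnicodeEncodeError.
stderr_bytes = fake_stderr.getvalue()
```

With the 'ascii' writer the debugging output stays readable on any terminal, 
at the cost of showing escapes for non-ASCII characters.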

Thank you for your help!
  Dave

P.S.: The quotation in my signature is there by chance, really. Normally I'm 
not the kind of guy who believes in premonitions... ;)

-- 
I never realized it before, but having looked that over I'm certain I'd
rather have my eyes burned out by zombies with flaming dung sticks than
work on a conscientious Unicode regex engine.
      -- Tim Peters, 3 Dec 1998