q: how to output a unicode string?
Frank Stajano
usenet423.4.fms at neverbox.com
Tue Apr 24 12:32:01 EDT 2007
A simple unicode question: how do I print a unicode string?
Sample code:
# -*- coding: utf-8 -*-
s1 = u"héllô wórld"
print s1
# Gives UnicodeEncodeError: 'ascii' codec can't encode character
# u'\xe9' in position 1: ordinal not in range(128)
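The error above is what one would expect when print has to turn the unicode string into bytes and falls back to the 'ascii' codec. A minimal sketch of one way around it, assuming the output should be UTF-8: wrap the byte stream in a writer from codecs.getwriter, so the encoding is chosen explicitly. The helper name utf8_writer and the io.BytesIO stand-in for the terminal are illustrative assumptions, not part of the original code.

```python
# -*- coding: utf-8 -*-
import codecs
import io


def utf8_writer(byte_stream):
    """Return a writer that encodes text as UTF-8 before passing it on."""
    return codecs.getwriter("utf-8")(byte_stream)


s1 = u"héllô wórld"

# Stand-in for the terminal so the result can be inspected; for real output,
# pass sys.stdout (Python 2) or sys.stdout.buffer (Python 3) instead.
fake_stdout = io.BytesIO()
out = utf8_writer(fake_stdout)
out.write(s1 + u"\n")

# The bytes that would have reached the terminal.
encoded = fake_stdout.getvalue()
```

Whether the accented characters then display correctly still depends on the terminal itself expecting UTF-8.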
What I actually want to do is slightly more elaborate: read from a text
file which is in utf-8, do some manipulations of the text and print the
result on stdout. I understand I must open the file with
f = codecs.open("input.txt", "r", "utf-8")
but then I get stuck as above.
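A hedged sketch of the read-and-manipulate half of that plan, assuming the file really is UTF-8: codecs.open returns decoded text, the manipulation works on text, and encoding happens once at the point of output. The function name read_and_transform, the upper-casing step, and the temporary sample file are placeholders invented for illustration.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile


def read_and_transform(path):
    """Read a UTF-8 file as text and return it upper-cased."""
    f = codecs.open(path, "r", "utf-8")
    try:
        text = f.read()   # decoded text, not bytes
    finally:
        f.close()
    return text.upper()   # hypothetical stand-in for "some manipulations"


# Build a small sample input so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "input.txt")
sample = codecs.open(sample_path, "w", "utf-8")
sample.write(u"héllô wórld\n")
sample.close()

result = read_and_transform(sample_path)

# Encode once, at the output boundary.
output_bytes = result.encode("utf-8")
```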
I tried
s2 = s1.encode("utf-8")
print s2
but got
héllô wórld
Then, in the hope of being able to write the string to a file if not to
stdout, I also tried
import codecs
f = codecs.open("out.txt", "w", "utf-8")
f.write(s2)
but got
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
ordinal not in range(128)
So I seem to be stuck.
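One likely reading of the UnicodeDecodeError: the file object from codecs.open expects text and does the encoding itself, so handing it the already encoded s2 forces an implicit 'ascii' decode first, which fails at byte 0xc3. A minimal sketch of two ways to avoid that, under that assumption: write s1 to the codecs file, or write s2 to a plain binary file. The output file names, the scratch directory, and the helper file_bytes are made up for the example.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

s1 = u"héllô wórld"        # text (unicode)
s2 = s1.encode("utf-8")    # bytes, already UTF-8 encoded

out_dir = tempfile.mkdtemp()  # scratch directory, hypothetical location

# Option 1: give the codecs writer the text and let it encode.
text_path = os.path.join(out_dir, "out_text.txt")
f = codecs.open(text_path, "w", "utf-8")
f.write(s1)
f.close()

# Option 2: the bytes are already encoded, so use a plain binary file.
bytes_path = os.path.join(out_dir, "out_bytes.txt")
g = open(bytes_path, "wb")
g.write(s2)
g.close()


def file_bytes(path):
    """Return the raw bytes stored at path."""
    h = open(path, "rb")
    try:
        return h.read()
    finally:
        h.close()
```

Both files should end up with identical contents, since each route encodes the text exactly once.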
I have checked several online python+unicode pages, including
http://boodebr.org/main/python/all-about-python-and-unicode#WHYNOPRINT
http://evanjones.ca/python-utf8.html
http://www.reportlab.com/i18n/python_unicode_tutorial.html
http://www.amk.ca/python/howto/unicode
http://www.example-code.com/python/python-charset.asp
http://docs.python.org/lib/csv-examples.html
but none of them was sufficient to make me understand how to deal with
this simple problem. I'm sure it's easy, maybe too easy to be worth
explaining in a tutorial...
Help gratefully received.