WTF? Printing unicode strings
Robert Kern
robert.kern at gmail.com
Thu May 18 19:35:07 EDT 2006
Ron Garret wrote:
> I'm using an OS X terminal to ssh to a Linux machine.
Click on the "Terminal" menu, then "Window Settings...". Choose "Display" from
the combobox. At the bottom you will see a combobox titled "Character Set
Encoding". Choose "Unicode (UTF-8)".
> But what about this:
>
> >>> f2=open('foo','w')
> >>> f2.write(u'\xFF')
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in
> position 0: ordinal not in range(128)
>
> That should have nothing to do with my terminal, right?
Correct, that is a different problem. f.write() expects a string of bytes, not a
unicode string. To convert a unicode string to a byte string without an explicit
.encode() method call, Python falls back on the default encoding, which is
'ascii'. u'\xff' is outside ASCII's range, hence the UnicodeEncodeError. The
default is deliberately hard to change: if you hack that setting, your modules
won't work on anyone else's machine.
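For illustration, here is a minimal sketch of the explicit route, assuming UTF-8
is the encoding you actually want in the file. The temporary path is a stand-in
for your 'foo' and is not from the original session.

```python
import os
import tempfile

# Hypothetical scratch location standing in for 'foo'.
path = os.path.join(tempfile.mkdtemp(), 'foo')

text = u'\xff'               # one unicode character, U+00FF
data = text.encode('utf-8')  # explicit conversion to a byte string

# Write the already-encoded bytes; no implicit 'ascii' conversion happens.
f2 = open(path, 'wb')
f2.write(data)
f2.close()

# Read the raw bytes back to see what landed on disk.
f2 = open(path, 'rb')
raw = f2.read()
f2.close()
```

Because the encoding is named at the call site, the result does not depend on
any interpreter-wide setting.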
> I just found http://www.amk.ca/python/howto/unicode, which seems to be
> enlightening. The answer seems to be something like:
>
> import codecs
> f = codecs.open('foo','w','utf-8')
>
> but that seems pretty awkward.
<shrug> About as clean as it gets when dealing with text encodings.
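A rough sketch of how the codecs.open() approach plays out end to end, again
assuming UTF-8. The temporary path is a placeholder of mine, not something from
your session.

```python
import codecs
import os
import tempfile

# Hypothetical scratch location standing in for 'foo'.
path = os.path.join(tempfile.mkdtemp(), 'foo')

# The wrapper encodes unicode strings to UTF-8 as they are written.
f = codecs.open(path, 'w', 'utf-8')
f.write(u'\xff')
f.close()

# Reading through the same kind of wrapper decodes back to unicode.
f = codecs.open(path, 'r', 'utf-8')
text = f.read()
f.close()

# The bytes on disk are the UTF-8 encoding of U+00FF.
g = open(path, 'rb')
raw = g.read()
g.close()
```

The encoding is stated once, when the file is opened, and every write() after
that accepts unicode strings directly.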
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco