'ascii' codec can't encode character u'\xf3'
"Martin v. Löwis"
martin at v.loewis.de
Tue Aug 17 14:17:41 EDT 2004
Martin Slouf wrote:
>>- print a repr() of the unicode object instead of
>> the unicode object itself. This will work on all
>> terminals, and show hex escapes of non-ASCII characters.
>
>
> just to make sure:
>
> override the object's __repr__(self) method to something like:
>
> class my_string(string):
>     def __repr__(self):
>         tmp = unicode(self.attribute1 + " " + self.attribute2)
>         return tmp
>
> and use 'my_string' class without any worries instead of classical
> string?
No. Assume yyy is a Unicode object which potentially contains
non-printable characters. Instead of doing

    print yyy

do

    print repr(yyy)
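
A minimal sketch of the difference, assuming the output stream can only handle ASCII. The name yyy follows the text above; the explicit encode() calls are an illustration standing in for what print has to do implicitly, and the "backslashreplace" handler is used here as a stand-in for the hex escapes that repr() shows. It is written so that it runs unchanged on Python 2 and Python 3.

```python
# A Unicode object with one non-ASCII character (o-umlaut, U+00F6).
yyy = u"Martin v. L\xf6wis"

# Encoding to ASCII is what an ASCII-only stream needs, and it fails
# as soon as a non-ASCII character is present.
try:
    yyy.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# Escaping the non-ASCII characters always yields pure ASCII bytes,
# so this form is safe on any terminal.
escaped = yyy.encode("ascii", "backslashreplace")

print(failed)
print(escaped)
```

The escaped form is not pretty, but it never raises, which is why it is useful for debugging output.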
> my system is Debian GNU/Linux stable; I have been using it for a very, very
> long time, though I have not changed any terminal settings beyond the very
> basics. My locales are properly set; I'm using LC_* environment
> variables to set the default locale to a Czech environment with the
> ISO-8859-2 charset. The terminal is capable of displaying 8-bit charsets;
> I'm not sure about Unicode charsets -- never tried, never needed.
I see. Could it be that you are using Python 2.1, then? Because in
Python 2.3, printing Czech characters to the terminal should work
just fine. Please do
Python 2.3.4 (#2, Aug 5 2004, 09:33:45)
[GCC 3.3.4 (Debian 1:3.3.4-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'ISO-8859-15'
> if 0:
>     # Enable to support locale aware default string encodings.
>     import locale
>     loc = locale.getdefaultlocale()
>     if loc[1]:
>         encoding = loc[1]
>
> so i guess it is never done :(
You don't need to change the default encoding. Instead,
sys.stdout.encoding is used for printing to the terminal (in 2.3 and
later).
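
As a rough sketch of that idea, the following encodes text with the stream's own encoding instead of touching the default encoding. The helper names terminal_encoding and encode_for_terminal and the two stand-in stream classes are hypothetical, made up for illustration; the fallback to ASCII with "replace" is likewise an assumption, chosen because the encoding attribute can be None (in Python 2, for example, when output is redirected to a file or pipe).

```python
import sys


def terminal_encoding(stream=None, fallback="ascii"):
    """Return the stream's encoding, or a fallback if it has none."""
    if stream is None:
        stream = sys.stdout
    # The attribute may be missing or None, e.g. for redirected output.
    return getattr(stream, "encoding", None) or fallback


def encode_for_terminal(text, stream=None):
    """Encode text for the stream, replacing unencodable characters."""
    return text.encode(terminal_encoding(stream), "replace")


class Latin2Stream(object):
    """Hypothetical stand-in for a terminal using ISO-8859-2."""
    encoding = "ISO-8859-2"


class PipedStream(object):
    """Hypothetical stand-in for a stream without a known encoding."""
    encoding = None


text = u"\xf3"  # the character from the subject line
as_latin2 = encode_for_terminal(text, Latin2Stream())
as_piped = encode_for_terminal(text, PipedStream())
```

With a real terminal you would simply pass nothing and let the helper consult sys.stdout.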
> did you change it yourself?
No. It will work out of the box.
> well, if a piece of information like the one you gave me were contained in
> the standard Python documentation, there would probably be less
> misunderstanding about this issue.
What piece specifically are you referring to? It is all mentioned
in the standard Python documentation.
> #! /usr/bin/env python
> # -*- coding: UTF-8 -*-
> at the beginning of every one of my scripts, the example above still has to
> be converted -- because of the iso-8859-1 you use in "Löwis"?
Yes, and no. Yes, it still has to be converted. UTF-8 is *not*
Unicode; it is a byte encoding, and you cannot mix Unicode
strings and byte strings. No, if I use UTF-8 in my source code,
then "Löwis" will be encoded in UTF-8, not in ISO-8859-1.
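
To make the distinction concrete, here is a small sketch (variable names are illustrative only) showing that the same Unicode string yields different bytes under the two encodings, and that a byte string has to be decoded with the matching codec before it is combined with a Unicode string. It runs on both Python 2 and Python 3.

```python
# The Unicode string itself; \xf6 is the code point of o-umlaut.
name = u"L\xf6wis"

# Two different byte encodings of the same text.
utf8_bytes = name.encode("utf-8")         # o-umlaut becomes two bytes
latin1_bytes = name.encode("iso-8859-1")  # o-umlaut is a single byte

# Decoding with the codec that produced the bytes restores the text.
from_utf8 = utf8_bytes.decode("utf-8")
from_latin1 = latin1_bytes.decode("iso-8859-1")

# Only after decoding is it safe to concatenate with Unicode.
greeting = u"Martin v. " + from_utf8

print(len(utf8_bytes))
print(len(latin1_bytes))
```

Decoding UTF-8 bytes as ISO-8859-1 would not raise an error, but it would silently produce two wrong characters in place of the one intended, which is why the codec has to be stated explicitly.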
> can I omit the conversion (i.e. is it done automatically for me, as if
> I write
> u"Martin v. " + unicode("Löwis", "ISO-8859-1")
> )?
You can, but you shouldn't. So I won't tell you how you could do that.
> I don't understand -- which library?
The ODBC library, for example, or PyQt.
Regards,
Martin