handling unicode data

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Wed Jun 28 16:46:05 EDT 2006


In <1151516742.596423.157450 at b68g2000cwa.googlegroups.com>, Filipe wrote:

> The error I'm getting is being thrown when I print the value to the
> console. If I just convert it to unicode all seems ok (except for not
> being able to show it in the console, that is... :).
> 
> For example, when I try this:
> print unicode("Fran\xd8a", "iso-8859-1")
> 
> I get the error:
> Traceback (most recent call last):
>   File "a.py", line 1, in ?
>     print unicode("Fran\xd8a", "iso-8859-1")
>   File "c:\Program Files\Python24\lib\encodings\cp437.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\xd8' in position 4: character maps to <undefined>

The `unicode()` call does not fail here; the ``print`` does.  Printing a
unicode string means it has to be encoded into a byte string again, and
the encoding used by the target of the print (your console) has no
mapping for the unicode character u'\xd8'.  Judging from the traceback,
your terminal uses `cp437` as its encoding.
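
To make the two steps visible separately, here is a minimal sketch that
does by hand what ``print`` does implicitly.  It assumes a current Python
(the ``b"..."`` literal and ``except ... as`` do not exist in the Python
2.4 shown in the traceback), and the names `raw`, `text` and `failed` are
only illustrative:

```python
# Bytes as they appear in the quoted example (assumed to be Latin-1 data).
raw = b"Fran\xd8a"

# Step 1: decoding succeeds, because every byte value is valid ISO-8859-1.
text = raw.decode("iso-8859-1")

# Step 2: encoding for a cp437 console, which is what print has to do.
try:
    text.encode("cp437")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("encode step failed: %s" % exc)
```

Only the second step raises, which matches the traceback pointing into
`cp437.py` rather than at the `unicode()` call.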

As you can see here: http://www.wordiq.com/definition/CP437 there's no Ø
in that character set.
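
If you want to check this from Python instead of a table, something along
these lines should work.  It is a sketch, not the only way to do it: the
helper `encodable` is a made-up name, and replacing unencodable characters
with ``?`` via the ``"replace"`` error handler is just one possible
fallback for console output:

```python
text = b"Fran\xd8a".decode("iso-8859-1")

def encodable(char, encoding):
    """Return True if `char` can be encoded strictly with `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Characters of the string that cp437 has no mapping for.
missing = [c for c in text if not encodable(c, "cp437")]

# Fallback: let the codec substitute '?' for anything it cannot encode.
safe_bytes = text.encode("cp437", "replace")
```

The result in `safe_bytes` can be written to a cp437 console without
raising, at the cost of losing the character.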

Ciao,
	Marc 'BlackJack' Rintsch


