unicode issue

Piet van Oostrum piet at cs.uu.nl
Wed Sep 30 10:35:11 EDT 2009


>>>>> Dave Angel <davea at dejaviewphoto.com> (DA) wrote:

>DA> Works for me:

>DA> rrr = downcode(u"Žabovitá zmiešaná kaša")
>DA> print repr(rrr)
>DA> print rrr

>DA> prints out:

>DA> u'Zabovita zmiesana kasa'
>DA> Zabovita zmiesana kasa

>DA> I did have to add an encoding declaration as line 2 of the file:

>DA> #-*- coding: latin-1 -*-

>DA> and I had to convince my editor (Komodo) to save the file in utf-8.

It only *seems* to work.
If you save the file in utf-8, the coding declaration also has to say
utf-8. Besides, many of these characters are not representable in
latin-1 at all. The reason it worked is that, with the latin-1
declaration, each of these characters was read as a sequence of two or
more bytes, and the replace matched those sequences. But that is
dangerous, as they are then no longer the unicode characters they were
intended to be.
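
A small sketch of the effect, not the original downcode script: it
assumes the file was saved as utf-8 and then read as latin-1, and it
simulates that with an explicit encode/decode instead of a mismatched
declaration. The names original, misread and representable_in_latin1
are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Simulate "saved as utf-8, declared as latin-1" with explicit codecs.

# u"Žabovitá", written with escapes so the example does not depend on
# how this file itself is saved.
original = u"\u017dabovit\u00e1"

utf8_bytes = original.encode("utf-8")    # the bytes the editor writes
misread = utf8_bytes.decode("latin-1")   # what a latin-1 reading yields

# Every non-ASCII character has turned into two latin-1 characters,
# so the misread string is longer than the original.
print(len(original))
print(len(misread))
print(repr(misread))


def representable_in_latin1(text):
    """Return True if every character of text exists in latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical check: Z with caron is outside latin-1, a with acute is not.
print(representable_in_latin1(u"\u017d"))
print(representable_in_latin1(u"\u00e1"))
```

With a matching declaration (utf-8 on both sides) the literal stays the
eight characters it was meant to be, and no such check is needed.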
-- 
Piet van Oostrum <piet at vanoostrum.org>
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]


