Problem with national characters
Leif B. Kristensen
abuse at solumslekt.org
Thu Mar 31 18:02:30 EST 2005
Leif B. Kristensen wrote:
>Is there something else I have to do?
Please forgive me for talking to myself here :-) I should have looked
up Unicode in "Learning Python" before I asked. This seems to work:
>>> u'før'.upper()
u'F\xd8R'
>>> u'FØR'
u'F\xd8R'
>>> 'FØR'
'F\xd8R'
So far, so good. Note that the Unicode representation of the uppercase
version shows the same \xd8 escape as the plain string. But when I try
the builtin function unicode(), weird things happen:
>>> s='FØR'
>>> s
'F\xd8R'
>>> unicode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-2: invalid data
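For what it's worth, here is a minimal sketch of what probably goes on, assuming the terminal delivers the string as Latin-1 bytes (which the 'F\xd8R' repr suggests) while the default codec is UTF-8 (which the error message suggests). Naming the codec explicitly avoids relying on the default. The variable names are only for illustration, and the code is written so that it should run on both Python 2.6+ and Python 3.3+; on the Python 2 versions discussed here, unicode(s, 'latin-1') would be the equivalent call.

```python
# Assumption: the terminal produced Latin-1 bytes for the uppercase word.
raw = b'F\xd8R'

# Decode with an explicitly named codec instead of the default one.
text = raw.decode('latin-1')
print(repr(text))

# Latin-1 maps every byte to the code point with the same number,
# which is why both reprs show the same \xd8 escape.
assert ord(text[1]) == 0xD8

# The same bytes are not valid UTF-8: 0xD8 opens a two-byte sequence,
# and the following 'R' is not a continuation byte.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Case conversion works on the decoded text.
lowered = text.lower()
```

If the bytes really are Latin-1, the decode succeeds and only the UTF-8 attempt fails, which would match the traceback above.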
ActivePython 2.3.2 doesn't even seem to understand the 'u' prefix.
So even if I can get this to work on my own Linux machine, it hardly
looks like a portable solution.
Seems like the "solution" is to keep away from letters above ASCII-127,
like we've done since the dawn of computing ...
--
Leif Biberg Kristensen
http://solumslekt.org/