[Tutor] encodings
Denis Dzyubenko
shad@mail.kubtelecom.ru
Mon Jun 16 08:19:02 2003
On Sat, 14 Jun 2003 23:30:20 +0200,
Magnus Lyck(ML) wrote to me:
ML> Clearer now?
Yes, now it is clear.
ML> Use type() to check what type you have in each situation.
>> >>> s =3D u"abc=C1=C2=D7"
>> >>> s.encode('cp1251')
>>Traceback (most recent call last):
>> File "<stdin>", line 1, in ?
>> File "/usr/lib/python2.1/encodings/cp1251.py", line 18, in encode
>> return codecs.charmap_encode(input,errors,encoding_map)
>>UnicodeError: charmap encoding error: character maps to <undefined>
ML> That means that your unicode string contains values that CP1251
ML> can't present. Does "print s" produce the output you would expect?
no, 'print s' produces an error:
'UnicodeError: ASCII encoding error: ordinal not in range(128)'
ML> How does it look if you do "print s"? Does it look like cyrillic?
ML> what about "print repr(s)". Are all values in the correct range?
no, the values are not among those listed in the link you gave me
(http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-cyr1.ent)
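One possible explanation for the two errors above, sketched in present-day Python 3 rather than the Python 2.1 used in this thread. It assumes the last three characters of the literal reached the interpreter as the KOI8-R bytes 0xC1 0xC2 0xD7 and were taken as Latin-1 code points, which appears to be how interpreters of that era treated non-ASCII bytes inside a unicode literal. The names raw, misread and intended are made up for the illustration.

```python
# Assumption: the terminal sent these KOI8-R bytes for the three Cyrillic letters.
raw = b"\xc1\xc2\xd7"

# Taking each byte as a Latin-1 code point gives U+00C1, U+00C2, U+00D7,
# which are accented Latin letters and a multiplication sign, not Cyrillic.
misread = "abc" + raw.decode("latin-1")
print([hex(ord(c)) for c in misread])

# CP1251 has no position for those code points, so encoding fails.
try:
    misread.encode("cp1251")
    cp1251_failed = False
except UnicodeEncodeError:
    cp1251_failed = True
print("cp1251 failed:", cp1251_failed)

# Decoding the same bytes as KOI8-R gives Cyrillic code points (U+0430 and up),
# and those encode to CP1251 without trouble.
intended = "abc" + raw.decode("koi8-r")
print([hex(ord(c)) for c in intended])
converted = intended.encode("cp1251")
print(converted)
```

If the code points printed for a string fall in the U+00C0 to U+00FF range instead of U+0400 to U+04FF, the bytes were most likely interpreted with the wrong encoding before any conversion to CP1251 was attempted.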
>>ML> txt.decode('koi8-r').encode('cp1251')
>>
>> >>> txt.decode('koi8-r')
>>Traceback (most recent call last):
>> File "<stdin>", line 1, in ?
>>AttributeError: decode
ML> That means that txt is not an object of type string. If it's
>>> txt = "абв"
>>> type(txt)
<type 'string'>
>>> txt.decode("koi8-r")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: decode
and dir(txt) doesn't contain attribute 'decode'
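The AttributeError fits the interpreter version shown in the earlier traceback (python2.1): as far as I recall, 8-bit strings only gained a decode() method in Python 2.2, and before that the conversion was spelled unicode(txt, 'koi8-r'). Below is a sketch in present-day Python 3 of the method form next to two function forms that do the same job. The name raw is a hypothetical stand-in for txt and is assumed to hold KOI8-R bytes.

```python
import codecs

raw = b"\xc1\xc2\xd7"  # assumed KOI8-R bytes, standing in for txt

# Method form, as suggested in the thread.
via_method = raw.decode("koi8-r")

# Constructor form, analogous to unicode(txt, 'koi8-r') in old Python 2.
via_constructor = str(raw, "koi8-r")

# Module-level function form from the codecs module.
via_codecs = codecs.decode(raw, "koi8-r")

print(via_method == via_constructor == via_codecs)

# Once decoded, the text can be re-encoded for a CP1251 consumer.
print(via_method.encode("cp1251"))
```

All three forms produce the same text object, so the function forms are a reasonable fallback wherever the method is missing.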
ML> Look here:
>>>> u = u'\u042F\u042B\u042C'
ML> Now we have a unicode representation with three
ML> cyrillic letters. You should be able to do
ML> "print u" and see something reasonable. I start
no, I can't see anything reasonable:
>>> u = u'\u042F\u042B\u042C'
>>> u
u'\u042f\u042b\u042c'
>>> print u
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)
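The traceback above suggests that print tried to convert the unicode object with the default ASCII codec. Below is a sketch in present-day Python 3 of that implicit conversion, an explicit one, and the errors argument that the encode traceback earlier in this message passes along. The choice of KOI8-R as the terminal encoding is an assumption.

```python
u = "\u042F\u042B\u042C"

# What the failing print effectively attempted: an ASCII conversion.
try:
    u.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Naming an encoding that contains the letters works.
# Assumption: the terminal expects KOI8-R.
koi8_bytes = u.encode("koi8-r")

# The errors argument substitutes a placeholder instead of raising.
replaced = u.encode("ascii", "replace")

print(ascii_ok, koi8_bytes, replaced)
```

In the Python 2.1 session the explicit step would presumably be written as print u.encode('koi8-r'), so that print receives an ordinary 8-bit string and no implicit ASCII conversion takes place.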
ML> If you still have problems, look at the error handling issues
ML> I wrote about.
Sorry, I still can't understand the source of my problems :(
--
Denis.