Strange problems with encoding

Peter Otten __peter__ at web.de
Thu Nov 6 09:03:08 EST 2003


Sebastian Meyer wrote:

> Hi newsgroup,
> 
> I am trying to replace German special characters in strings like
>     str = re.sub('ö', 'oe', str)
> 
> When I work with this, I always get the message
> UniCode Error: ASCII decoding error : ordinal not in range(128)
> 
> Yes, I have googled; I searched the FAQ, the manual and the Python library
> and all known sources of information. I played with the Python
> builtin function encode to enforce the right encoding, but the error
> stays the same. I've read a lot about Unicode and the internal conversion
> of strings done by Python, but somehow I've missed the clue.
> Nope, Python says Huuups... ordinal not in range(128), ;-(
> 
> Does anyone have any idea? Seems like I am too stupid to read the
> documentation carefully; perhaps I misunderstand something...
> 
> thanks for your help in advance
> 
> Sebastian

Works here, even with my older snake:

Python 2.2.1 (#1, Sep 10 2002, 17:49:17)
[GCC 3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.sub("ö", "oe", "Döspaddel")
'Doespaddel'
>>> re.sub("ö", "oe", u"Döspaddel")
u'Doespaddel'
>>> re.sub("ö", u"oe", u"Döspaddel")
u'Doespaddel'
>>> re.sub(u"ö", u"oe", u"Döspaddel")
u'Doespaddel'

To provoke a UnicodeError, I have to convert a unicode string with umlauts
to str without providing the encoding:

>>> str(u"Döspaddel")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)

I suspect that you have something similar hidden in your code (i.e.
characters >= 128 that are not converted explicitly). The remedy is to
explicitly encode with the appropriate encoding:

>>> u"Döspaddel".encode("latin-1")
'D\xf6spaddel'
>>>
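
The same idea as a small script, for the case where the input arrives as
bytes. This is only a sketch: it assumes Python 2.6 or later (or Python 3),
because the b"..." literal is not available in 2.2, and it assumes the data
is Latin-1. The sample bytes and all variable names are made up for
illustration.

```python
import re

# Hypothetical input: Latin-1 encoded bytes, e.g. as read from a file.
raw = b"D\xf6spaddel"

# An ASCII conversion is what fails; doing it explicitly shows why.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

text = raw.decode("latin-1")             # bytes -> unicode, encoding stated
replaced = re.sub(u"\xf6", u"oe", text)  # pattern and subject both unicode
result = replaced.encode("latin-1")      # unicode -> bytes, encoding stated

print(ascii_ok)
print(result)
```

Keeping pattern, replacement and subject all unicode, and converting only
at the edges, means no implicit ASCII conversion is left to go wrong.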

Try to build a minimal script that shows the reported behaviour and fix it
or post it for more detailed advice. By the way, don't use str as a
variable name; it's the type of "ordinary" strings.
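
To see why the name matters, here is a small sketch (function names are
invented for the example; the "except ... as" syntax assumes Python 2.6 or
later). Once str is rebound, the built-in type is hidden in that scope and
can no longer be called:

```python
def shadowing_demo():
    str = "Doespaddel"      # the local name now hides the built-in type
    try:
        return str(42)      # tries to call a string object, not the type
    except TypeError as exc:
        return exc


def fixed_demo():
    text = "Doespaddel"     # any other name leaves str usable
    return text + str(42)


print(repr(shadowing_demo()))
print(fixed_demo())
```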

Peter




