Unicode Hell
Fredrik Lundh
fredrik at pythonware.com
Fri Nov 7 04:20:39 EST 2003
Stuart Forsyth wrote:
> The replace string in this case is actually the contents of a file. I
> have simplified it for the purposes of the example. The file I'm doing
> the replace on is a web archive (.mht) file. Within that file are a
> number of different replace fields e.g. #name# #organisation# etc..
> Everything was working fine until the replace function tried to replace
> the #name# replace field with a posting variable that had a tilde in it.
> The script then moaned about it being non-ascii and crashed. The exact
> error is:
>
> Error Type:
> Python ActiveX Scripting Engine (0x80020009)
> Traceback (most recent call last):
>   File "<Script Block >", line 80, in ?
>     FileContents = FileContents.replace('Repl_learner',str(Request("learner")))
>   File "C:\Python23\lib\site-packages\win32com\client\dynamic.py", line 169, in __str__
>     return str(self.__call__())
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-9: ordinal not in range(128)
if you're replacing parts of a Unicode string with the contents of a non-
Unicode string, Python assumes that the second string contains only
plain ASCII.
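
a minimal sketch of what that ASCII assumption amounts to. it is written for current Python 3, where a byte string is a `bytes` object and is never decoded silently (Python 2.3 attempted the ASCII decode for you). the sample value `raw` is made up for illustration and is not taken from the original script:

```python
# hypothetical form value: "Joao" with a tilde on the a, encoded as iso-8859-1
raw = b"Jo\xe3o"

try:
    # the step Python 2 performed implicitly when mixing the two string types
    raw.decode("ascii")
    failed = False
    reason = None
except UnicodeDecodeError as exc:
    failed = True
    reason = exc.reason  # explains why the ascii codec gave up

print(failed, reason)
```

the byte 0xe3 is outside the 0-127 range, so the ASCII codec cannot map it to a character.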
if it doesn't, you have to tell Python what encoding you're using in the
second string; there's no way Python can figure that out by itself. here's
how to do that:
    myunicodestring.replace(tag, replacestring.decode("iso-8859-1"))
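
a slightly fuller sketch of the same idea, again for Python 3 and under assumptions: the names `template`, `raw_name` and `result` are hypothetical, and iso-8859-1 is only correct if that is really the encoding the bytes arrived in:

```python
# Unicode text containing a replace field, as in the .mht template
template = u"Dear #name#, welcome."

# hypothetical posted value, received as iso-8859-1 encoded bytes
raw_name = b"Jo\xe3o"

# tell Python which encoding the bytes use, then replace Unicode with Unicode
name = raw_name.decode("iso-8859-1")
result = template.replace(u"#name#", name)

print(result)
```

note that `replace` returns a new string rather than changing `template` in place, so the result has to be assigned to something.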
also see item 5 on this page:
http://effbot.org/zone/unicode-objects.htm
</F>