Unicode -> String problem
Jay Parlar
jparlar at home.com
Sun Jul 8 16:42:09 EDT 2001
I'm having a problem converting unicode text to string type with str().
The code snippet causing the problem is
    if type(pageText) == UnicodeType:
        newText = str(pageText)
and the error message I receive is
    Traceback (most recent call last):
      File "<interactive input>", line 1, in ?
      File "D:\MyData\HOME\PWA\Scripts\filter.py", line 60, in parser
        newText = str(pageText)
    UnicodeError: ASCII encoding error: ordinal not in range(128)
Now, I know there is a lot of precedent for these "...ordinal not in range(128)" questions, but I've looked around and
haven't found anything that does exactly what I want: completely remove any unconvertible unicode characters. I
have to be able to parse this text afterwards, using a lot of Python's string functions, so I need 'newText' to be a string, but I'd
really prefer not to have escape sequences for the non-ASCII characters (e.g. \xa0) showing up. Is there a simple way to convert the
unicode text to StringType, dropping the characters that can't be represented in ASCII?
Thanks in advance to anyone who's looking at this,
Jay P.
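
For reference, one way this kind of conversion is commonly handled is to call the unicode object's encode() method with the 'ignore' error handler instead of calling str(). This is only a minimal sketch, not something taken from the post: the helper name to_ascii and the sample text are hypothetical, and it assumes that silently dropping every non-ASCII character is acceptable.

```python
def to_ascii(text):
    """Return text encoded as ASCII, dropping characters that cannot be encoded."""
    # The 'ignore' error handler discards unencodable characters
    # instead of raising UnicodeError, which is what str() does here.
    return text.encode('ascii', 'ignore')


# Hypothetical sample: \xa0 is a non-breaking space, which is outside ASCII.
page_text = u'price:\xa0100'
new_text = to_ascii(page_text)
```

On the Python 2 interpreters this post is about, the result is a plain string (StringType), so the usual string functions apply to it. Passing 'replace' instead of 'ignore' would substitute a '?' for each dropped character rather than removing it.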