string.replace non-ascii characters
Steven Bethard
steven.bethard at gmail.com
Mon Feb 12 00:23:59 EST 2007
Samuel Karl Peterson wrote:
> Greetings Pythonistas. I have recently discovered a strange anomaly
> with string.replace. Seemingly at random, it does not deal with
> characters of ordinal value > 127. I ran into this problem while
> downloading auction web pages from ebay and trying to replace the
> "\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
> urllib2. Yet today, all is fine, no problems whatsoever. Sadly, I
> did not save the exact error message, but I believe it was a
> ValueError thrown on string.replace and the message was something to
> the effect "character value not within range(128)".
Was it something like this?
>>> u'\xa0'.replace('\xa0', '')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0: ordinal not in range(128)
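You might get that if you're mixing str and unicode. When a unicode
method is handed a str argument, Python first decodes the str with the
default ASCII codec, and the byte 0xa0 is outside that range. If both
strings are of one type or the other, you should be okay: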
>>> u'\xa0'.replace(u'\xa0', '')
u''
>>> '\xa0'.replace('\xa0', '')
''
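
For data coming off the network, one way to stay in a single type is to
decode the bytes once and do all the replacing on unicode. Here is a
minimal sketch of that idea. The sample bytes and the strip_nbsp helper
are made up for illustration, and iso-8859-1 is only assumed from your
description of the page:

```python
# Made-up sample standing in for the bytes returned by the download;
# the iso-8859-1 encoding is an assumption taken from the original post.
raw = b"Price:\xa0100"

# The implicit str -> unicode coercion uses the ASCII codec, so it
# fails on byte 0xa0. Doing the same decode explicitly shows the failure.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False


def strip_nbsp(data, encoding="iso-8859-1"):
    """Decode bytes to text once, then replace using a text pattern."""
    text = data.decode(encoding)
    # Both the target and the pattern are now the same (text) type.
    return text.replace(u"\xa0", u" ")


print(ascii_ok)
print(strip_nbsp(raw))
```

If the page turns out to be in some other encoding, only the encoding
argument should need to change.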
STeVe