string.replace non-ascii characters

Steven Bethard steven.bethard at gmail.com
Mon Feb 12 00:23:59 EST 2007


Samuel Karl Peterson wrote:
> Greetings Pythonistas.  I have recently discovered a strange anomaly
> with string.replace.  It seemingly randomly fails to deal with
> characters of ordinal value > 127.  I ran into this problem while
> downloading auction web pages from ebay and trying to replace the
> "\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
> urllib2.  Yet today, all is fine, no problems whatsoever.  Sadly, I
> did not save the exact error message, but I believe it was a
> ValueError thrown on string.replace and the message was something to
> the effect of "character value not within range(128)".

Was it something like this?

 >>> u'\xa0'.replace('\xa0', '')
Traceback (most recent call last):
   File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0: ordinal not in range(128)

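You might get that if you're mixing str and unicode. When a unicode
method is handed a str argument, Python first decodes the str with the
default 'ascii' codec, and that fails on any byte above 127, such as
0xa0. If the string and the search pattern are both of one type or the
other, you should be okay: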

 >>> u'\xa0'.replace(u'\xa0', '')
u''
 >>> '\xa0'.replace('\xa0', '')
''
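
If the page data arrives as raw bytes, one way to stay out of trouble is
to decode it once and do all the replacing on text afterwards. This is
only a rough sketch: the sample bytes and the strip_nbsp helper are made
up for illustration, the iso-8859-1 default is an assumption taken from
the description above, and it assumes Python 2.6+ or 3.3+ so that the
b'' and u'' literals both work.

```python
# Made-up sample bytes standing in for what the download returned.
raw = b'Price:\xa0$10'

def strip_nbsp(data, encoding='iso-8859-1'):
    """Return text with each no-break space replaced by a plain space.

    Hypothetical helper: decodes byte strings with the given encoding
    (assumed, not detected) so that replace() only ever sees text.
    """
    text = data.decode(encoding) if isinstance(data, bytes) else data
    # Both arguments are text, so no implicit ASCII decoding happens.
    return text.replace(u'\xa0', u' ')

cleaned = strip_nbsp(raw)
```

If the real page declares a different charset, pass that as encoding
instead of relying on the default.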

STeVe


