[Tutor] Trouble in dealing with special characters.

Alan Gauld alan.gauld at yahoo.co.uk
Fri Dec 7 03:57:35 EST 2018


On 07/12/2018 08:36, Sunil Tech wrote:

> I am using Python 2.7.8
>>>> tx = "MOUNTAIN VIEW WOMEN’S HEALTH CLINIC"
>>>> tx.decode()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 19:
> ordinal not in range(128)
> 
> How to know whether in a given string(sentence) is there any that is not
> ASCII character and how to replace?

To detect them, do what you just did but wrap a try/except around it:

try:
    tx.decode('ascii')   # raises if any byte is outside the ASCII range
except UnicodeError:
    print "There were non-ASCII characters in the data"
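
If you also want to know where every offending byte sits, rather than
just whether one exists, something like the sketch below might help.
The function name non_ascii_positions is made up for illustration, and
the byte string is my assumption of what your data looks like (the
curly apostrophe stored as three UTF-8 bytes). It should behave the
same on Python 2.7 and Python 3.

```python
def non_ascii_positions(data):
    """Return the indexes of all bytes above 127 in a byte string.

    Hypothetical helper, not part of the standard library.
    bytearray() yields integers on both Python 2.7 and Python 3.
    """
    return [i for i, b in enumerate(bytearray(data)) if b > 127]


# Assumed sample data: the curly apostrophe encoded as UTF-8.
tx = b"MOUNTAIN VIEW WOMEN\xe2\x80\x99S HEALTH CLINIC"
positions = non_ascii_positions(tx)
```

An empty list means the data is pure ASCII. For this sample the first
entry matches the position reported in your traceback.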

Now, how you replace the characters is up to you.
The location of the offending character is given in the error.
(Although there may be more, once you deal with that one!)
What would you like to replace it with from the ASCII subset?
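
As a rough sketch of both steps, assuming the bytes are UTF-8 (byte
0xe2 at position 19 is consistent with that, but only you know where
the data came from): the exception object carries the position in its
start attribute, and once the bytes are decoded you can either let the
codec substitute "?" or swap in a replacement of your own choosing.
The variable names here are just for illustration.

```python
# Assumed sample data: the curly apostrophe encoded as UTF-8.
tx = b"MOUNTAIN VIEW WOMEN\xe2\x80\x99S HEALTH CLINIC"

# Find the first offending byte from the exception itself.
first_bad = None
try:
    tx.decode("ascii")
except UnicodeDecodeError as exc:
    first_bad = exc.start

# Decode with the encoding the data is assumed to use.
text = tx.decode("utf-8")

# Option 1: let the codec put "?" in place of each non-ASCII character.
ascii_with_marks = text.encode("ascii", "replace")

# Option 2: choose the substitute yourself, here a plain apostrophe
# for U+2019 (RIGHT SINGLE QUOTATION MARK).
ascii_with_apostrophe = text.replace(u"\u2019", u"'").encode("ascii")
```

Option 2 only covers the one character you name; any other non-ASCII
character would still raise an error in the final encode() call.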

But are you really sure you want to replace it with
an ASCII character? Most display devices these days
can cope with at least the UTF-8 encoding of Unicode.
Maybe what you really want is to change your default
character set so it can handle those extra characters?
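
If you go that route, one common pattern is to decode the bytes once
when they arrive, work with Unicode text inside the program, and encode
again only when writing out. A minimal sketch, again assuming the data
is UTF-8 and using an in-memory buffer as a stand-in for whatever file
or device you actually write to:

```python
import io

# Assumed sample data: the curly apostrophe encoded as UTF-8.
tx = b"MOUNTAIN VIEW WOMEN\xe2\x80\x99S HEALTH CLINIC"

# Decode once at the input boundary; the apostrophe is now one character.
text = tx.decode("utf-8")

# Encode once at the output boundary (BytesIO stands in for a real file).
out = io.BytesIO()
out.write(text.encode("utf-8"))
```

Nothing is lost this way: the bytes written out are the same ones that
came in.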

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



