Unicode error

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Jul 23 06:42:26 EDT 2010


On Fri, 23 Jul 2010 03:14:11 -0700, dirknbr wrote:

> I am having some problems with unicode from json.
> 
> This is the error I get
> 
> UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in
> position 61: ordinal not in range(128)
> 
> I have kind of developed this but obviously it's not nice, any better
> ideas?
> 
>         try:
>             text=texts[i]
>             text=text.encode('latin-1')
>             text=text.encode('utf-8')
>         except:
>             text=' '

Don't write bare excepts; always catch the error you want and nothing 
else. As you've written it, the result of encoding with latin-1 is thrown 
away, even if it succeeds.


text = texts[i]  # Don't hide errors here.
try:
    text = text.encode('latin-1')
except UnicodeEncodeError:
    try:
        text = text.encode('utf-8')
    except UnicodeEncodeError:
        text = ' '
do_something_with(text)
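
If you need the same fallback in more than one place, it could be wrapped
in a small helper along these lines. This is only a sketch: the name
encode_with_fallback and the idea of returning the codec name alongside
the bytes are my own assumptions, not anything from the json module or
from your code. Returning the codec name matters because a consumer
otherwise can't tell latin-1 bytes from utf-8 bytes.

```python
def encode_with_fallback(text, encodings=('latin-1', 'utf-8')):
    """Try each codec in turn; return (encoded_bytes, codec_name).

    Hypothetical helper for illustration. Only UnicodeEncodeError is
    caught, so unrelated bugs still surface.
    """
    for name in encodings:
        try:
            return text.encode(name), name
        except UnicodeEncodeError:
            continue  # this codec can't represent the text, try the next
    # Nothing worked: fall back to a single space, as in the code above.
    return b' ', None


# U+0093 fits in latin-1, so the first codec is used.
fits_latin1 = encode_with_fallback(u'\x93')
# U+201C (left double quotation mark) does not, so utf-8 is used.
needs_utf8 = encode_with_fallback(u'\u201c')
```
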


Another thing you might consider is setting the error handler:

text = text.encode('utf-8', 'ignore')  # positional; errors= keyword needs 2.7+

Other error handlers are 'strict' (the default), 'replace' and 
'xmlcharrefreplace'.
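
To see what each handler actually does, here is a small sketch. The
sample string is made up for illustration, and I've used the 'ascii'
codec as the target because that is the codec in your traceback; with
'utf-8' the handler will rarely come into play, since UTF-8 can
represent essentially every character. I've also included
'backslashreplace', which is one more standard handler.

```python
# Made-up sample: an accented letter plus curly quotes (U+201C, U+201D).
sample = u'caf\xe9 \u201cquoted\u201d'

handlers = ('ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace')

# Encode the same text to ASCII once per handler and keep the results.
results = dict((h, sample.encode('ascii', h)) for h in handlers)

# results['ignore']            -> b'caf quoted'
# results['replace']           -> b'caf? ?quoted?'
# results['xmlcharrefreplace'] -> b'caf&#233; &#8220;quoted&#8221;'
# results['backslashreplace']  -> b'caf\\xe9 \\u201cquoted\\u201d'

# 'strict' (the default) raises instead of substituting anything.
try:
    sample.encode('ascii')
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True
```

Note that 'ignore' silently drops data, so 'replace' or
'xmlcharrefreplace' is usually the safer choice if the loss needs to be
visible later.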


-- 
Steven


