UnicodeEncodeError -- 'character maps to <undefined>'

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Aug 27 11:00:20 EDT 2011


J wrote:

> Hi there,
> 
> I'm attempting to print a dictionary entry of some twitter data to screen
> but every now and then I get the following error:
> 
> (<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('charmap',
> u'RT @ciaraluvsjb26: BIEBER FEVER \u2665', 32, 33, 'character maps to
> <undefined>'), <traceback object at 0x01B323C8>)

Showing the actual traceback will help far more than a raw exception tuple.


> I have googled this but haven't really found any way to overcome the
> error. Any ideas?

I can only guess what you are doing, since you haven't shown any code or a
traceback, but you are probably trying to encode a Unicode string into
bytes using an encoding that cannot represent one of its characters.

I can almost replicate the error: the exception type is the same, and the
message is similar but not identical.

>>> s = u'BIEBER FEVER \u2665'
>>> print s  # Printing Unicode is fine.
BIEBER FEVER ♥
>>> s.encode()  # but encoding defaults to ASCII
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2665' in
position 13: ordinal not in range(128)
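
The 'charmap' wording in the original report is what Python's code-page
codecs (cp1252, cp437 and similar) use in their error messages. The sketch
below reproduces it under the assumption that the encoding involved was
cp1252; the post does not say which encoding was actually in use, so treat
that choice as a guess.

```python
# Sketch only: 'cp1252' is an assumed code page, not taken from the post.
s = u'RT @ciaraluvsjb26: BIEBER FEVER \u2665'

try:
    s.encode('cp1252')  # cp1252 has no mapping for U+2665 (the heart)
except UnicodeEncodeError as err:
    # Collect the same fields that appear in the raw exception tuple.
    details = (err.encoding, err.start, err.end, err.reason)
    print(details)
```

The start and end offsets point at the heart character, the same place the
32, 33 in the original exception refers to.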



The right way is to specify an encoding that includes all the characters you
need. Unless you have some reason to choose another encoding, the best
thing to do is to just use UTF-8, which can encode every Unicode character.

>>> s.encode('utf-8')
'BIEBER FEVER \xe2\x99\xa5'
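
As a rough sketch of the options: UTF-8 round-trips the string without loss,
and if the output really must be in a narrower encoding, an error handler
can substitute for the characters that do not fit. The error handlers are
not mentioned in the post itself, and 'cp1252' is again only an assumed
target encoding.

```python
s = u'BIEBER FEVER \u2665'

# UTF-8 can represent the heart, and decoding gives back the original.
utf8_bytes = s.encode('utf-8')
assert utf8_bytes.decode('utf-8') == s

# Fallbacks for a narrower encoding (assumed here): lossy, but no exception.
replaced = s.encode('cp1252', 'replace')          # heart becomes '?'
escaped = s.encode('ascii', 'backslashreplace')   # heart becomes \u2665 text

print(repr(utf8_bytes))
print(repr(replaced))
print(repr(escaped))
```

Both fallbacks discard or disguise the original character, so they suit
display or logging rather than data you need to read back.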



-- 
Steven



