Python 3.0 crashes displaying Unicode at interactive prompt

Chris Rebert clp at rebertia.com
Sat Dec 13 22:07:22 CET 2008


On Sat, Dec 13, 2008 at 12:28 PM, John Machin <sjmachin at lexicon.net> wrote:
>
> Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = u'\u9876'
> >>> x
> u'\u9876'
>
> # As expected
>
> Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = '\u9876'
> >>> x
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "C:\python30\lib\io.py", line 1491, in write
>    b = encoder.encode(s)
>  File "C:\python30\lib\encodings\cp850.py", line 19, in encode
>    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\u9876' in
> position 1: character maps to <undefined>
>
> # *NOT* as expected (by me, that is)
>
> Is this the intended outcome?

When Python tries to display the character, it must first encode it,
because I/O is done in bytes, not Unicode code points. When it tries to
encode it in CP850 (apparently your console's encoding, judging by the
traceback), it unsurprisingly fails: CP850 is an old Western European
code page, which can't represent a CJK character like the one in
question. To signal that failure, it raises an exception, hence the
error you see. The 2.6 session doesn't hit this because repr() of a
unicode string there escapes every non-ASCII character, whereas repr()
in 3.0 leaves printable ones as they are.

This is intended behavior. Either change your terminal encoding to one
that can handle such characters, or explicitly encode the string and
use one of the provided error handlers (e.g. 'replace' or
'backslashreplace') for dealing with unencodable characters.
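
For instance, here is a minimal sketch of the second option under
Python 3. The variable names are mine, and 'cp850' is hard-coded only
because it is the codec named in your traceback; substitute whatever
encoding your console actually uses.

```python
# Sketch only: 'cp850' is assumed from the traceback, not detected.
text = '\u9876'

# Strict encoding (the default) is what the interactive prompt effectively
# does, and it raises for characters the code page cannot represent.
try:
    text.encode('cp850')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes '?' for each unencodable character.
replaced = text.encode('cp850', errors='replace')

# 'backslashreplace' keeps the code point visible as an escape sequence.
escaped = text.encode('cp850', errors='backslashreplace')

# 'ignore' silently drops the character.
ignored = text.encode('cp850', errors='ignore')

# ascii() returns an ASCII-only repr, close to what Python 2 displayed.
safe_repr = ascii(text)

# Everything printed here is plain ASCII, so any console can show it.
print(strict_failed, replaced, escaped, ignored, safe_repr)
```

Which handler to use depends on whether you would rather see a
placeholder, the escape sequence, or nothing at all. At the interactive
prompt, ascii(x) is probably the quickest way to inspect such a string.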

Also, please don't call it a "crash", as that's very misleading. The
Python interpreter didn't dump core; an exception was merely raised.
There's a world of difference.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


