Encoding conundrum

Dave Angel d at davea.name
Tue Nov 20 23:46:55 CET 2012

On 11/20/2012 04:49 PM, Daniel Klein wrote:
> With the assistance of this group I am understanding unicode encoding
> issues much better; especially when handling special characters that are
> outside of the ASCII range. I've got my application working perfectly now
> :-)
> However, I am still confused as to why I can only use one specific encoding.

Who says you can only use one?  You need to use the right encoding for
the device or file you're talking with, and if different devices want
different encodings, then you must use multiple ones.  Only one can be
the default, however, and that's where some problems come about.
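
A minimal sketch of using more than one encoding in the same program. The two in-memory streams are hypothetical stand-ins for two files or devices that expect different encodings; the sample text is also just an illustration.

```python
import io

text = "f\xfcr"  # 'für', contains U+00FC

# Hypothetical "devices": two in-memory byte streams standing in for
# one file that expects Latin-1 and another that expects UTF-8.
latin1_raw = io.BytesIO()
utf8_raw = io.BytesIO()

# Each text wrapper is given the encoding its target expects.
latin1_out = io.TextIOWrapper(latin1_raw, encoding="latin-1")
utf8_out = io.TextIOWrapper(utf8_raw, encoding="utf-8")

for stream in (latin1_out, utf8_out):
    stream.write(text)
    stream.flush()

latin1_bytes = latin1_raw.getvalue()
utf8_bytes = utf8_raw.getvalue()
print(latin1_bytes)  # b'f\xfcr'
print(utf8_bytes)    # b'f\xc3\xbcr'
```

The same string produces different bytes on each stream, which is the point: the encoding belongs to the destination, not to the text.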

> I've done some research and it appears that I should be able to use any of
> the following codecs with codepoints '\xfc' (chr(252)) '\xfd' (chr(253))
> and '\xfe' (chr(254)) :
> ISO-8859-1   [ note that I'm using this codec on my Linux box ]
> cp1252
> cp437
> latin1
> utf-8
> If I'm not mistaken, all of these codecs can handle the complete 8bit
> character set.

What 8-bit character set?  This is a nonsense statement.  If you mean
all of them can convert an 8-bit byte to SOME Unicode character, then
fine for the single-byte codecs (utf-8 is the exception: a lone byte
such as 0xfc is never valid there).  But they won't convert each such
byte to the SAME Unicode character, or they'd be the same encoding.
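
To illustrate, here is a small sketch that decodes the three bytes in question with several of the codecs listed. The codec names are the standard Python ones; the dictionary name is just for this example.

```python
raw = bytes([0xFC, 0xFD, 0xFE])

decoded = {}
for codec in ("latin-1", "cp1252", "cp437", "utf-8"):
    try:
        decoded[codec] = raw.decode(codec)
    except UnicodeDecodeError as exc:
        # The bytes are not valid input for this codec at all.
        decoded[codec] = None
        print(codec, "cannot decode:", exc.reason)
    else:
        # Show the code points so the result does not depend on the
        # terminal being able to display the characters.
        print(codec, [hex(ord(ch)) for ch in decoded[codec]])
```

latin-1 and cp1252 happen to agree on these three bytes (U+00FC, U+00FD, U+00FE), cp437 maps them to U+207F, U+00B2 and U+25A0, and utf-8 refuses them outright.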

> However, on Windows 7, I am only able to use 'cp437' to display (print)
> data with those characters in Python. If I use any other encoding, Windows
> laughs at me with this error message:
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfd' in
> position 3: character maps to <undefined>
> Furthermore I get this from IDLE:
> >>> import locale
> >>> locale.getdefaultlocale()
> ('en_US', 'cp1252')
> I also get 'cp1252' when running the same script from a Windows command
> prompt.
> So there is a contradiction between the error message and the default
> encoding.
> Why am I restricted from using just that one codec? Is this a Windows or
> Python restriction? Please enlighten me.

I don't know much about Windows quirks anymore.  I haven't had to use it
much for years.
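
That said, one thing worth checking is which encoding print() actually uses, since the console's encoding and the locale's encoding are separate settings and need not match. The following is only a sketch; the sample string is hypothetical, chosen so that '\xfd' sits at position 3 as in the traceback.

```python
import locale
import sys

# Encoding used when print() writes to the terminal (may be None if
# stdout has been replaced by an object without that attribute).
print("stdout encoding:", getattr(sys.stdout, "encoding", None))
# Encoding reported by the locale settings.
print("preferred encoding:", locale.getpreferredencoding(False))

s = "abc\xfd"  # hypothetical sample text

# cp437 has no slot for U+00FD, so a strict encode fails.
try:
    s.encode("cp437")
    cp437_ok = True
except UnicodeEncodeError:
    cp437_ok = False

# cp1252 does have U+00FD.
cp1252_bytes = s.encode("cp1252")

# One possible workaround: substitute characters the target lacks.
fallback = s.encode("cp437", errors="replace")
print(cp437_ok, cp1252_bytes, fallback)
```

If the first two lines print different names, that would explain why the error mentions cp437 while the locale reports cp1252.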


