Character encoding & the copyright symbol
Philip Semanchuk
philip at semanchuk.com
Thu Aug 6 12:31:27 EDT 2009
On Aug 6, 2009, at 12:14 PM, Robert Dailey wrote:
> Hello,
>
> I'm loading a file via open() in Python 3.1 and I'm getting the
> following error when I try to print the contents of the file that I
> obtained through a call to read():
>
> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
>
> The file is defined as ASCII and the copyright symbol shows up just
> fine in Notepad++. However, Python will not print this symbol. How can
> I get this to work? And no, I won't replace it with "(c)". Thanks!
If the file is defined as ASCII and it contains 0xa9, then the file
was written incorrectly or you were told the wrong encoding. ASCII
runs only from 0x00 to 0x7f, so it has no such character.
The copyright symbol is the byte 0xa9 if the encoding is ISO-8859-1 or
windows-1252, and since you're on Windows the latter is a likely bet.
http://en.wikipedia.org/wiki/Ascii
http://en.wikipedia.org/wiki/Iso-8859-1
http://en.wikipedia.org/wiki/Windows-1252
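To make that concrete, here is a minimal sketch of how the single byte 0xa9 decodes under each of the three encodings mentioned above. The variable names are only illustrative, and ascii() is used for printing so the output stays printable even on a console that cannot display the symbol itself.

```python
# The byte in question, on its own.
raw = b"\xa9"

# Both ISO-8859-1 and windows-1252 map 0xa9 to U+00A9 (the copyright sign).
decoded = {}
for name in ("iso-8859-1", "windows-1252"):
    decoded[name] = raw.decode(name)
    print(name, ascii(decoded[name]))

# ASCII stops at 0x7f, so decoding 0xa9 as ASCII is an error.
ascii_failed = False
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_failed = True
    print("ascii:", exc.reason)
```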
Bottom line is that your file is not in ASCII. Try specifying
windows-1252 as the encoding. Without seeing your code I can't tell
you where you need to specify the encoding, but the Python docs should
help you out.
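As a rough sketch of what that might look like, assuming the file really is windows-1252: open() in Python 3 accepts an encoding argument. The file name and contents below are made up for the example, since I haven't seen your code.

```python
import os
import tempfile

# Create a small sample file containing the byte 0xa9.
# (Hypothetical stand-in for the real file.)
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(b"Copyright \xa9 2009\n")

# Name the encoding explicitly instead of relying on the platform default.
with open(path, encoding="windows-1252") as f:
    text = f.read()

# ascii() escapes non-ASCII characters, so this prints on any console.
print(ascii(text))
```

Note that the traceback you quoted is a UnicodeEncodeError, which Python raises when writing text out rather than when reading it in, so the encoding of whatever you are printing to may need attention as well.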
HTH
Philip