Character encoding & the copyright symbol
Albert Hopkins
marduk at letterboxes.org
Thu Aug 6 12:45:43 EDT 2009
On Thu, 2009-08-06 at 09:14 -0700, Robert Dailey wrote:
> Hello,
>
> I'm loading a file via open() in Python 3.1 and I'm getting the
> following error when I try to print the contents of the file that I
> obtained through a call to read():
>
> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
>
> The file is defined as ASCII and the copyright symbol shows up just
> fine in Notepad++. However, Python will not print this symbol. How can
> I get this to work? And no, I won't replace it with "(c)". Thanks!
The file isn't actually ASCII, which has no copyright symbol. It's
Windows-1252, an 8-bit extension of ASCII. With that information you can
do either of two things. You can open it in text mode and specify the
encoding:
>>> fp = open(filename, 'r', encoding='windows-1252')
>>> s = fp.read()
>>> print(s)
or you can open it in binary mode and decode it later:
>>> fp = open(filename, 'rb')
>>> b = fp.read()
>>> print(b.decode('windows-1252'))
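For anyone who wants to try both approaches without the original file,
here is a self-contained sketch. The temporary file and the sample text
are made up for illustration; only the two ways of reading come from the
snippets above:

```python
import os
import tempfile

# Hypothetical sample data: byte 0xA9 is the copyright sign in Windows-1252.
raw = b"Copyright \xa9 2009 Example Corp\n"

# Write the raw bytes to a temporary file standing in for the real one.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fp:
    fp.write(raw)

try:
    # Approach 1: text mode with an explicit encoding.
    with open(path, "r", encoding="windows-1252") as fp:
        text_mode = fp.read()

    # Approach 2: binary mode, decode the bytes afterwards.
    with open(path, "rb") as fp:
        binary_mode = fp.read().decode("windows-1252")
finally:
    # Clean up the temporary file.
    os.remove(path)
```

Both reads give the same str. The binary route is handy when the
encoding is only known after looking at the bytes.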
Or you may be able to set the default encoding to windows-1252, but I
don't know how to do that on Windows.
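On the default-encoding question, a hedged note: the traceback is a
UnicodeEncodeError, which is raised when text is encoded for output, so
the console's encoding may matter as much as the file's. The sketch
below only inspects the defaults Python picked and shows one way to
print text that the console encoding cannot represent. The sample string
and the helper name printable() are made up for illustration:

```python
import locale
import sys

# Encoding that open() normally falls back to when none is given.
default_file_encoding = locale.getpreferredencoding(False)

# Encoding print() uses for the console; fall back to ASCII if unknown.
console_encoding = getattr(sys.stdout, "encoding", None) or "ascii"

def printable(text, encoding):
    """Return text with characters the encoding lacks replaced by '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)

sample = "Copyright \xa9 2009"
print(default_file_encoding, console_encoding)
print(printable(sample, console_encoding))
```

This trades the exception for a '?' where the console has no matching
character, so it is a workaround for display only and does not change
the data that was read.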
P.S.
Next time it would help to paste a code snippet; otherwise we have to
make assumptions about what you are actually doing.