[Tutor] UnicodeEncodeError: 'cp932' codec can't encode character '\xe9' in position

Robert Sjoblom robert.sjoblom at gmail.com
Sun Mar 11 00:38:04 CET 2012


Okay, so here's a fun one. Since I'm on a Japanese locale, my native
encoding is cp932. I was thinking of writing a parser for a bunch of
text files, but I got stuck at simply printing their contents due to
... something. I don't know what encoding the text files use, which
isn't helping my case either (I have asked, but I've yet to get an
answer).

Okay, so:

address = "C:/Path/to/file/file.ext"
with open(address, encoding="cp1252") as alpha:
    text = alpha.readlines()
    for line in text:
        print(line)

It starts to print until it hits the wonderful character é or '\xe9',
where it gives me this happy traceback:
Traceback (most recent call last):
  File "C:\Users\Azaz\Desktop\CK2 Map Painter\Parser\test parser.py", line 8, in <module>
    print(line)
UnicodeEncodeError: 'cp932' codec can't encode character '\xe9' in position 13: illegal multibyte sequence
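
The traceback points at print(line), not at open(), which suggests the
file is being read without complaint and the failure happens when the
text is encoded for the console. A minimal sketch of that distinction,
assuming the file really is cp1252 (a guess) and using a hypothetical
helper named can_encode:

```python
# Reading side: byte 0xE9 decodes fine as cp1252 (an assumed file encoding).
decoded = b"\xe9".decode("cp1252")


# Writing side: print() has to encode the str for the console, and cp932
# has no mapping for U+00E9, so the encode step is the one that fails.
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(ascii(decoded))                 # ascii() keeps the output printable on any console
print(can_encode(decoded, "cp1252"))  # True
print(can_encode(decoded, "cp932"))   # False
```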

I can open the document and view it in UltraEdit, and it displays the
correct characters there, but UE can't tell me which encoding the file
uses. Any chance of solving this without having to switch from my
Japanese locale? Also, cp1252 is just an educated guess, but it
doesn't really matter, because I always end up with the same cp932
error.
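
One possible workaround that keeps the Japanese locale, sketched under
the assumption that losing the exact glyph on screen is acceptable:
escape whatever the console codec cannot represent before printing.
The helper name console_safe is hypothetical; str.encode with
errors="backslashreplace" is standard Python.

```python
import sys


def console_safe(text, encoding=None):
    """Replace characters the target codec cannot encode with backslash escapes."""
    # sys.stdout.encoding can be None when output is redirected; fall back to ASCII.
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


# 'é' (U+00E9) is not in cp932, so it comes out as the escape \xe9.
print(console_safe("caf\u00e9", "cp932"))
```

In the loop above, print(console_safe(line)) would then print every
line, with characters such as é shown as escapes instead of raising
UnicodeEncodeError.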

-- 
best regards,
Robert S.

