Py3: Read file with Unicode characters

Martin v. Loewis martin at v.loewis.de
Thu Apr 8 11:14:36 EDT 2010


Gnarlodious wrote:
> Attempting to read a file containing Unicode characters such as ±:
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position
> 5007: ordinal not in range(128)
> 
> I did succeed by converting all the characters to HTML entities such
> as "&plusmn;", but I want the characters to be the actual font in the
> source file. What am I doing wrong? My understanding is that ALL
> strings in Py3 are unicode so... confused.

When opening the file, you need to specify the file encoding. If you
don't, Python picks a default that depends on the environment; in your
situation, that default is ASCII.
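
A minimal sketch of what that looks like, assuming the file is UTF-8
(the byte 0xc2 in your traceback is consistent with that, since "±"
encodes to the two bytes 0xc2 0xb1 in UTF-8). The helper name read_text
and the file name sample.txt are made up for illustration; use whatever
encoding your file was actually saved in.

```python
import locale
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file, decoding it with an explicit encoding."""
    # Passing encoding= overrides the environment-dependent default.
    with open(path, encoding=encoding) as f:
        return f.read()


# "±" is two bytes in UTF-8; the first one is the 0xc2 from the traceback.
plus_minus_bytes = "±".encode("utf-8")

# The encoding open() falls back to when none is given (environment-dependent).
default_encoding = locale.getpreferredencoding(False)

# Demo: write a small UTF-8 file containing "±", then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("tolerance: ±5\n")
    text = read_text(path)
```

Checking default_encoding on your machine shows what open() would
have used without the encoding argument.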

Regards,
Martin


