UTF-8 Encoding Error
subhabangalore at gmail.com
Fri Dec 23 01:38:15 EST 2016
I am getting the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15: invalid start byte
when I try to read some files through TaggedCorpusReader, a corpus reader
class in NLTK.
My files are saved in ANSI format, the MS-Windows default.
I am using Python 2.7 on MS-Windows 7.
I have tried the following options so far:
string.encode('utf-8').strip()
unicode(string)
unicode(string, errors='replace')
unicode(string, errors='ignore')
string.decode('cp1252')
But none of these has helped.
If anyone could kindly suggest a solution, I would be grateful.
I am continuing to try in the meantime.
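
For reference, here is a minimal sketch of what the error message suggests is happening. It assumes that "ANSI" here means the Windows-1252 (cp1252) code page, which is the usual default on Western-locale Windows but is not confirmed for these files. The sample bytes and the helper names decode_ansi and convert_file_to_utf8 are made up for illustration. In cp1252 the byte 0x96 is an en dash, which is not a valid UTF-8 start byte, so a reader that assumes UTF-8 fails on it:

```python
# -*- coding: utf-8 -*-
# Sketch only: assumes the files are really cp1252 ("ANSI" on Western Windows).
# Runs on Python 2.7 and Python 3.
import io

# Hypothetical sample: byte 0x96 sits at position 15, as in the error message.
SAMPLE = b"price range 10 \x96 20"


def decode_ansi(raw, encoding="cp1252"):
    """Decode raw bytes from an 'ANSI' file into a unicode string."""
    return raw.decode(encoding)


def convert_file_to_utf8(src_path, dst_path, src_encoding="cp1252"):
    """Re-save a file as UTF-8 so a UTF-8 reader can open it."""
    with io.open(src_path, "r", encoding=src_encoding) as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)


# Decoding the raw bytes as UTF-8 reproduces the failure.
try:
    SAMPLE.decode("utf-8")
    failure = None
except UnicodeDecodeError as err:
    failure = err

# Decoding the same bytes as cp1252 succeeds; 0x96 becomes U+2013 (en dash).
text = decode_ansi(SAMPLE)
```

Note that calls such as string.decode('cp1252') only convert a string already held in memory; they do not change how the corpus reader opens the file on disk. One route is to re-save the files as UTF-8 with something like convert_file_to_utf8 above. Another, as far as I know, is that NLTK corpus readers take an encoding keyword (defaulting to 'utf8'), so passing encoding='cp1252' when constructing TaggedCorpusReader may be enough; this should be checked against the installed NLTK version.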