read a file and remove Mojibake chars
Daiyue Weng
daiyueweng at gmail.com
Thu Apr 7 04:47:34 EDT 2016
Hi, when I read a file, the resulting string contains Mojibake characters at
the beginning. The code is:
file_str = open(file_path, 'r', encoding='utf-8').read()
print(repr(open(file_path, 'r', encoding='utf-8').read()))
Part of the printed string, containing the Mojibake characters, looks like this:
'锘縶\n "name": "__NAME__"'
I tried to remove the non-UTF-8 characters using this code:
def read_config_file(fname):
    with open(fname, "r", encoding='utf-8') as fp:
        for line in fp:
            line = line.strip()
            line = line.decode('utf-8','ignore').encode("utf-8")
        return fp.read()
But it doesn't work. How can I remove the Mojibake characters in this case?
many thanks
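
A possible explanation, offered as a guess rather than a confirmed diagnosis: the leading characters '锘縶' are what the UTF-8 byte order mark (bytes EF BB BF) followed by '{' looks like when those four bytes are decoded as GBK. If so, the file starts with a BOM. The attempt above also cannot work in Python 3, because str has no decode() method, and fp.read() returns an empty string once the for loop has consumed the file. The sketch below assumes Python 3; strip_bom_text is a hypothetical helper name, and its second branch covers the less common case where the misdecoded BOM was already saved back into the file as real text.

```python
import codecs


def strip_bom_text(text):
    """Return text without a leading BOM or its GBK-misdecoded form."""
    # Case 1: a real BOM survived decoding as U+FEFF.
    if text.startswith("\ufeff"):
        return text[1:]
    # Case 2 (assumption): the BOM plus the next byte were once decoded
    # as GBK, producing two CJK characters such as '锘縶'.
    if text.startswith("锘") and len(text) >= 2:
        try:
            head = text[:2].encode("gbk")
        except UnicodeEncodeError:
            return text
        if head.startswith(codecs.BOM_UTF8):
            # Keep whatever followed the BOM (here the '{' byte).
            rest = head[len(codecs.BOM_UTF8):].decode("utf-8", errors="replace")
            return rest + text[2:]
    return text


def read_config_file(fname):
    # 'utf-8-sig' decodes UTF-8 and silently drops a leading BOM if present.
    with open(fname, "r", encoding="utf-8-sig") as fp:
        return strip_bom_text(fp.read())
```

If the file really does begin with a BOM, opening it with encoding='utf-8-sig' alone should be enough, and the helper is only a fallback for text that was already damaged.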