[Tutor] unicode() bug?
Jonathan Soons
jsoons at juilliard.edu
Sun Nov 9 17:07:37 EST 2003
The following program fails:
#!/usr/local/bin/python
f = open("text.txt", "r")
txt = f.read()
f.close()
output = open("unifile", "w")
output.write(unicode(txt, "iso-8859-1", "ignore"))
output.close()
So does the following:
#!/usr/local/bin/python
f = open("text.txt", "r")
txt = f.read()
f.close()
output = open("unifile", "w")
for i in range(len(txt)-1):
    try:
        output.write(unichr(txt[i]))
    except:
        print "bad character, but I'll keep going"
output.close()
The error is:
UnicodeError: ASCII encoding error: ordinal not in range(128)
The "ignore" seems to do nothing!
If I wanted it to stop on errors I would have used "strict".
How can I convert a string to unicode and skip any iffy characters?
Thank you.
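
One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the message reports an ASCII *encoding* error, which suggests the failure happens in output.write() and not in unicode(). Every byte value is a valid ISO-8859-1 character, so the decode step has nothing for "ignore" to skip. The trouble would start when the resulting unicode object is written to a plain file, because Python 2 then implicitly encodes it as ASCII. The example below encodes explicitly before writing. The function name convert_latin1_file and its parameters are made up for illustration, and it assumes a newer Python than the one in the post (2.6 or later, or 3.x), since it relies on the io module and the with statement.

```python
import io


def convert_latin1_file(src_path, dst_path, out_encoding="utf-8", errors="strict"):
    """Hypothetical helper: re-encode an ISO-8859-1 file into out_encoding."""
    # Read the raw bytes without any implicit decoding.
    with io.open(src_path, "rb") as f:
        raw = f.read()

    # Every byte value maps to a character in ISO-8859-1,
    # so this decode cannot raise and needs no error handler.
    text = raw.decode("iso-8859-1")

    # Encode explicitly instead of letting write() fall back to ASCII.
    # The error handler matters here: "ignore" drops characters the
    # target encoding cannot represent.
    data = text.encode(out_encoding, errors)

    # Write the encoded bytes in binary mode.
    with io.open(dst_path, "wb") as out:
        out.write(data)
    return data


# Example calls, using the file names from the post:
# convert_latin1_file("text.txt", "unifile")                     # keep everything, as UTF-8
# convert_latin1_file("text.txt", "unifile", "ascii", "ignore")  # drop non-ASCII characters
```

Passing "ascii" and "ignore" is probably the closest match to "skip any iffy characters", since it silently drops everything outside ASCII. Separately, unichr() expects an integer, so the second program would need unichr(ord(txt[i])) before it could get as far as the write.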