[Tutor] unicode() bug?

Jonathan Soons jsoons at juilliard.edu
Sun Nov 9 17:07:37 EST 2003


The following program fails:

#!/usr/local/bin/python
f = open("text.txt", "r")
txt = f.read()
f.close()
output = open("unifile", "w")
output.write(unicode(txt, "iso-8859-1", "ignore"))
output.close()

So does the following:

#!/usr/local/bin/python
f = open("text.txt", "r")
txt = f.read()
f.close()
output = open("unifile", "w")
for i in range(len(txt)-1):
    try:
        output.write(unichr(txt[i]))
    except:
        print "bad character, but I'll keep going"
output.close()

The error is:

UnicodeError: ASCII encoding error: ordinal not in range(128)

The "ignore" seems to do nothing!
If I wanted it to stop on errors I would have used "strict".
How can I convert a string to unicode and skip any iffy characters?
Thank you.
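
One likely reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the unicode() call itself succeeds, because ISO-8859-1 assigns a character to every byte value, so "ignore" never has anything to skip. The error message says "encoding", which points at output.write(): when a plain file is handed a unicode object, Python 2 converts it back to bytes with the default ASCII codec in strict mode. Encoding explicitly before writing avoids that implicit step. The function name convert_file, its parameters, and the choice of UTF-8 as the output encoding are assumptions made for illustration; they are not from the original programs.

```python
def convert_file(src_path, dst_path, out_encoding="utf-8", errors="strict"):
    """Decode an ISO-8859-1 file and rewrite it in an explicit encoding.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Read raw bytes; binary mode avoids any newline translation.
    f = open(src_path, "rb")
    raw = f.read()
    f.close()

    # ISO-8859-1 maps every byte 0-255 to a character, so decoding
    # cannot fail here and no error handler is needed.
    text = raw.decode("iso-8859-1")

    # Encode explicitly instead of letting write() fall back to ASCII.
    # The error handler matters on this step, not on the decode.
    data = text.encode(out_encoding, errors)

    output = open(dst_path, "wb")
    output.write(data)
    output.close()
    return data
```

Calling convert_file("text.txt", "unifile") would keep every character by writing UTF-8. To drop anything outside ASCII instead, which seems closest to "skip any iffy characters", pass "ascii" and "ignore" as the last two arguments. Separately, in the second program unichr() expects an integer, so it would need ord(txt[i]), and range(len(txt)-1) stops one character short of the end.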


