Suspected Unicode problem when reading from Excel

Alex Martelli aleax at aleax.it
Wed Sep 5 08:23:08 EDT 2001


"maxm" <maxm at normik.dk> wrote in message
news:3b9612db$0$217$edfadb0f at dspool01.news.tele.dk...
    ...
>     print question, options, answerText
>
> But when I run it I get the following error:
>
> >>>  File "C:\div\uninstalled\python\excell\jensMarius.py", line 9, in ?
> >>>    print question, options, answerText
> >>>UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> If I replace line 9 with this block, it prints out every question without
> Danish characters.
>
>     try:
>         print question
>     except:
>         pass
>
> So my guess is that it's caused by a problem with Danish characters.
>
> Has anybody got a possible solution?

Sure,
    print question.encode('latin-1')
should work fine if your terminal is indeed using Latin-1 encoding.
Similarly, question.encode('mbcs') will work fine if your screen
uses MBCS (MultiByte-Character-Set) encoding on Windows, 'cp437'
if it uses Codepage-437, and so on, and so forth.
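
To see why the choice of encoding matters, here is a small sketch of my
own (the sample word is hypothetical, standing in for a value read from
your spreadsheet; it is the Danish word for "blueberry", written with
escapes so the source stays pure ASCII).  The same Unicode string comes
out as different bytes under Latin-1 and Codepage-437, so the bytes
only display correctly on a device that expects that encoding:

```python
# Hypothetical sample text: u"\xe5" is a-with-ring, u"\xe6" is the ae ligature.
question = u"bl\xe5b\xe6r"

# Encode the same Unicode string for two different display devices.
latin1_bytes = question.encode("latin-1")
cp437_bytes = question.encode("cp437")

# The non-ASCII characters map to different byte values in each encoding.
print(repr(latin1_bytes))
print(repr(cp437_bytes))
```

Note that not every character exists in every codepage, so a strict
encode to a narrow codepage can still raise an exception.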

Generally, to display Unicode strings, you must encode them in
the way that is suitable for your intended display device.  By
default, Python uses 'ascii' encoding and 'strict' error handling.
That way you basically never get anything mis-displayed (well, hardly
ever -- the 7-bit ASCII subset does tend to be OK on most kinds
of display devices, but even it can't cover ALL:-).  Instead of
missing characters, question marks, and so on, you get exceptions
you can trap and handle suitably:-).
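
One possible way to "trap and handle suitably", as a minimal sketch:
the helper name encode_with_fallbacks and the list of candidate
encodings are my own inventions for illustration, not anything
built into Python.  It catches UnicodeError specifically, rather
than using a bare except, and tries the next candidate:

```python
def encode_with_fallbacks(text, encodings):
    """Return (name, encoded) for the first encoding able to represent text."""
    for name in encodings:
        try:
            # 'strict' error handling is the default, so failures raise.
            return name, text.encode(name)
        except UnicodeError:
            continue  # this encoding cannot represent the text; try the next
    raise ValueError("none of the candidate encodings can represent the text")

# Hypothetical sample text containing two Danish characters.
question = u"bl\xe5b\xe6r"
name, encoded = encode_with_fallbacks(question, ["ascii", "latin-1"])
print(name)  # prints: latin-1
```

Which encodings to list, and in what order, depends entirely on
the display device you are targeting.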

If you DO want question-marks for "nonencodable" characters,
that's easy too:
    print question.encode('ascii','replace')
(or, use 'ignore' instead of 'replace' to have nonencodable
characters just omitted).
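
A quick side-by-side of the two error-handling modes, again on a
hypothetical sample string of my own choosing:

```python
# Hypothetical sample text containing two non-ASCII Danish characters.
question = u"bl\xe5b\xe6r"

# 'replace' substitutes a question mark for each nonencodable character.
replaced = question.encode("ascii", "replace")

# 'ignore' silently drops each nonencodable character.
ignored = question.encode("ascii", "ignore")

print(repr(replaced))
print(repr(ignored))
```

Both modes lose information, so they suit display purposes rather
than anything you intend to read back later.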


Alex

More information about the Python-list mailing list