Python nuube needs Unicode help

gheissenberger at gmail.com gheissenberger at gmail.com
Thu Jan 11 22:28:14 CET 2007


HELP!
Guy who was here before me wrote a script to parse files in Python.

Includes line:
print u
where u is a line from a file we are parsing.
However, we have started recieving data from Brazil. If I open file to
parse in VI, looks like:

<Utt id="3" transcribe="yes" audioRoot="A1"
audio="313-20070102144528.wav" grammarSet="G3" rawText="não"
recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
transcribedText="não" parsableText="não"/

Clearly those "n&#227" are some non-Ascii characters, but how do I get
print to understand that?

I keep getting:
"UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
position 40:
 ordinal not in range(128)"




More information about the Python-list mailing list