Python nuube needs Unicode help
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Fri Jan 12 02:21:05 EST 2007
In <mailman.2600.1168552888.32031.python-list at python.org>, Chris Mellon
wrote:
> On 11 Jan 2007 13:28:14 -0800, gheissenberger at gmail.com
> <gheissenberger at gmail.com> wrote:
>
>> <Utt id="3" transcribe="yes" audioRoot="A1"
>> audio="313-20070102144528.wav" grammarSet="G3" rawText="não"
>> recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
>> transcribedText="não" parsableText="não"/>
>>
>> Clearly those "nã" are some non-Ascii characters, but how do I get
>> print to understand that?
>>
>> I keep getting:
>> "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
>> position 40:
>> ordinal not in range(128)"
>>
>
> Find out what encoding the files are in and modify the script to use it.
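The problem is not the encoding of the files. As the error message shows,
the XML reading part has already decoded them to unicode strings. A
UnicodeEncodeError is raised in the other direction: `print` has to encode
the unicode string for output, and the 'ascii' codec used here cannot
represent u'\xe3'.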
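
A minimal sketch of the difference, not taken from the original script: the
names `raw_text` and `encode_for_output` are made up for illustration, and
UTF-8 is only an assumption about what the terminal or output file expects.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for an attribute value that the XML parser has
# already decoded to a unicode string ("não").
raw_text = u"n\xe3o"


def encode_for_output(text, encoding="utf-8"):
    """Encode an already decoded string to bytes in the given encoding."""
    return text.encode(encoding)


# Encoding with the 'ascii' codec fails, which is the error from the report.
try:
    raw_text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding with a codec that can represent the character succeeds.
utf8_bytes = encode_for_output(raw_text)               # assumed target: UTF-8
latin1_bytes = encode_for_output(raw_text, "latin-1")  # alternative target
```

Which encoding is the right one depends on where the output goes, so the
choice of UTF-8 above is a guess and should be checked against the terminal
or file that receives the text.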
Ciao,
Marc 'BlackJack' Rintsch