[Tutor] string encoding

Dave Angel davea at ieee.org
Fri Jun 18 13:50:28 CEST 2010


Rick Pasotto wrote:
> <snip>
> I can print the string fine. It's f.write(string_with_unicode) that fails with:
>
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-32: ordinal not in range(128)
>
> Shouldn't I be able to f.write() *any* 8bit byte(s)?
>
> repr() gives: u"Realtors\\xc2\\xae"
>
> BTW, I'm running python 2.5.5 on debian linux.
>
>   
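You can write any 8-bit string.  But what you have is a Unicode string, 
whose characters take 16 or 32 bits each (depending on how Python was 
built), not a string of bytes.  To write it to a file, it must first be 
encoded into bytes, and if you don't say how, Python falls back to the 
default codec, which is ASCII.  ASCII can't represent anything above 127, 
hence the error.  The cure is to encode it yourself, using the encoding 
that your spec calls for.  I'll assume utf8 below: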

 >>> name = u"Realtors\xc2\xae"
 >>> repr(name)
"u'Realtors\\xc2\\xae'"
 >>> outfile = open("junk.txt", "w")
 >>> outfile.write(name)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-9: ordinal not in range(128)
 >>> outfile.write(name.encode("utf8"))
 >>> outfile.close()
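
If it helps to see what actually lands on disk, here is a minimal sketch of
the same idea as a script: encode explicitly, write the bytes in binary
mode, then read them back.  The helper name write_encoded and the temporary
file path are made up for illustration, and utf8 is still only my assumption
about what your spec wants.

```python
import os
import tempfile


def write_encoded(path, text, encoding="utf-8"):
    """Encode a unicode string explicitly and write the resulting bytes."""
    data = text.encode(encoding)   # unicode -> bytes happens here, by choice
    outfile = open(path, "wb")     # binary mode: no implicit codec involved
    try:
        outfile.write(data)
    finally:
        outfile.close()
    return data


# Same text as in the session above (a unicode string, not a byte string).
name = u"Realtors\xc2\xae"

# Hypothetical scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "junk.txt")
written = write_encoded(path, name)

# Read the raw bytes back to confirm the round trip.
infile = open(path, "rb")
try:
    raw = infile.read()
finally:
    infile.close()

assert raw == written
assert raw.decode("utf-8") == name
print(repr(raw))
```

Note that the file ends up 12 bytes long even though the string has only 10
characters: each of the two non-ASCII characters takes two bytes in utf8.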


DaveA


