how to write a unicode string to a file ?

Walter Dörwald walter at livinglogic.de
Mon Oct 19 10:02:21 EDT 2009


On 17.10.09 08:28, Mark Tolonen wrote:
> 
> "Kee Nethery" <kee at kagi.com> wrote in message
> news:AAAB63C6-6E44-4C07-B119-972D4F49E511 at kagi.com...
>>
>> On Oct 16, 2009, at 5:49 PM, Stephen Hansen wrote:
>>
>>> On Fri, Oct 16, 2009 at 5:07 PM, Stef Mientki 
>>> <stef.mientki at gmail.com> wrote:
>>
>> snip
>>
>>> The thing is, I'd be VERY surprised (nay, shocked!) if Excel can't
>>> open a file that is in UTF-8 -- it just might need to be TOLD that it's
>>> UTF-8 when you go and open the file, as UTF-8 looks just like ASCII --
>>> until it contains characters that can't be expressed in ASCII. But I
>>> don't know what type of file it is you're saving.
>>
>> We found that UTF-16 was required for Excel. It would not "do the 
>> right thing" when presented with UTF-8.
> 
> Excel seems to expect a UTF-8-encoded BOM (byte order mark) to correctly
> decide a file is written in UTF-8.  This worked for me:
> 
> import codecs
> f = codecs.open('test.csv', 'wb', 'utf-8')
> f.write(u'\ufeff') # write a BOM
> f.write(u'马克,testing,123\r\n')
> f.close()

That can also be done with the utf-8-sig codec, which writes the BOM at
the start of the output for you (and skips a leading BOM when reading):

import codecs
f = codecs.open('test.csv', 'wb', 'utf-8-sig')
f.write(u'马克,testing,123\r\n')
f.close()

See http://docs.python.org/library/codecs.html#module-encodings.utf_8_sig
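
To check that the codec really does what is described, here is a small
sketch that writes the same line with utf-8-sig and then looks at the raw
bytes. It assumes Python 3 (so the non-ASCII literal needs no coding
declaration); the temporary path and the variable names are made up for
the illustration and are not part of the original example.

```python
import codecs
import os
import tempfile

text = u'马克,testing,123\r\n'

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'test.csv')

# Write with utf-8-sig: the codec emits the BOM itself, once, at the start.
f = codecs.open(path, 'wb', 'utf-8-sig')
f.write(text)
f.close()

# Read the raw bytes back and check for the UTF-8 BOM (EF BB BF).
with open(path, 'rb') as raw:
    data = raw.read()
has_bom = data.startswith(codecs.BOM_UTF8)

# Decoding with utf-8-sig drops the BOM again;
# decoding with plain utf-8 keeps it as a leading U+FEFF.
decoded_sig = data.decode('utf-8-sig')
decoded_plain = data.decode('utf-8')

print(has_bom)
print(decoded_sig == text)
print(decoded_plain == u'\ufeff' + text)
```

The practical upshot is that a file written this way should be read back
with utf-8-sig as well; reading it as plain utf-8 leaves U+FEFF at the
front of the first field.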

Servus,
   Walter


