Writing a Carriage Return in Unicode
MRAB
python at mrabarnett.plus.com
Wed Nov 18 20:06:33 EST 2009
Doug wrote:
> Hi!
>
> I am trying to write a UTF-8 file of UNICODE strings with a carriage
> return at the end of each line (code below).
>
> filOpen = codecs.open("c:\\temp\\unicode.txt",'w','utf-8')
>
> str1 = u'This is a test.'
> str2 = u'This is the second line.'
> str3 = u'This is the third line.'
>
> strCR = u"\u240D"
>
> filOpen.write(str1 + strCR)
> filOpen.write(str2 + strCR)
> filOpen.write(str3 + strCR)
>
> filOpen.close()
>
> The output looks like
> This is a test.âThis is the second line.âThis is the third
> line.â when opened in Wordpad as a UNICODE file.
>
> Thanks for your help!!
u'\u240D' isn't a carriage return (that's u'\r') but a symbol (a visible
"CR" graphic) for carriage return. Windows programs normally expect
lines to end with '\r\n'; just use u'\n' in programs and open the text
files in text mode ('r' or 'w').
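
The difference between the two characters can be checked with the standard unicodedata module. This is only a small sketch; the variable names are illustrative and not from the original post:

```python
import unicodedata

symbol = u"\u240D"        # the character used in the original post
carriage_return = u"\r"   # the real control character, U+000D

symbol_name = unicodedata.name(symbol)
symbol_category = unicodedata.category(symbol)       # "So": Symbol, other
cr_category = unicodedata.category(carriage_return)  # "Cc": control character
symbol_bytes = symbol.encode("utf-8")                # three bytes in UTF-8

print(symbol_name)
print(symbol_category, cr_category)
print(repr(symbol_bytes))
```

The first byte of the encoded symbol is 0xE2, which is "â" in Windows-1252, so that is probably where the "â" in the Wordpad output came from.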
Some Windows programs won't recognise UTF-8 text as UTF-8 in files
unless they start with a BOM; this will be handled automatically in
Python if you specify the encoding as 'utf-8-sig'.
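
Putting both suggestions together might look like the sketch below. It makes a few assumptions that are not in the original post. It uses io.open (the same as the built-in open in Python 3) rather than codecs.open, because codecs.open always opens the underlying file in binary mode and so does not translate u'\n'. It passes newline='\r\n' to force Windows line endings on any platform. The temporary output path is made up for the example.

```python
import io
import os
import tempfile

lines = [
    u"This is a test.",
    u"This is the second line.",
    u"This is the third line.",
]

# Hypothetical output location; the original post wrote to c:\temp\unicode.txt.
path = os.path.join(tempfile.mkdtemp(), "unicode.txt")

# 'utf-8-sig' writes a BOM first; newline='\r\n' turns each u'\n' into CR LF.
with io.open(path, "w", encoding="utf-8-sig", newline="\r\n") as out:
    for line in lines:
        out.write(line + u"\n")

# Read the raw bytes back to see what actually landed on disk.
with io.open(path, "rb") as raw:
    data = raw.read()

print(repr(data[:20]))
```

On Windows the default newline handling of a text-mode file should already produce '\r\n', so the explicit newline argument is mainly there to make the result the same everywhere.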