UTF16, BOM, and Windows Line endings
Neil Hodgson
nyamatongwe+thunder at gmail.com
Mon Feb 6 17:19:30 EST 2006
Fuzzyman:
> How should I handle line-endings for UTF16 ? Is it possible that other
> programs (on windows) will have line endings as u'\r\n' ?
Yes, try Notepad and save as Unicode. For the text
Fuzzy
End of lines
>>> contents = open("C:\\fuzzy.txt", "rb").read()
>>> contents
'\xff\xfeF\x00u\x00z\x00z\x00y\x00\r\x00\n\x00E\x00n\x00d\x00 \x00o\x00f\x00 \x00l\x00i\x00n\x00e\x00s\x00'
>>>
The '\r\x00\n\x00' is a u'\r\n' encoded as little-endian UTF-16, and
the leading '\xff\xfe' is the byte order mark.
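To check that reading of the bytes, here is a minimal sketch that decodes the same data. It is written in Python 3 syntax (bytes literals), not the Python 2 of the session above, and the variable names are only illustrative.

```python
import codecs

# The bytes Notepad wrote, copied from the session above as a Python 3 bytes literal.
contents = (b'\xff\xfeF\x00u\x00z\x00z\x00y\x00\r\x00\n\x00'
            b'E\x00n\x00d\x00 \x00o\x00f\x00 \x00l\x00i\x00n\x00e\x00s\x00')

# The first two bytes are the little-endian byte order mark.
has_le_bom = contents.startswith(codecs.BOM_UTF16_LE)

# The "utf-16" codec reads the BOM, picks the byte order from it, and drops it.
text = contents.decode("utf-16")

# splitlines() treats '\r\n' as a single line break.
lines = text.splitlines()

print(has_le_bom, repr(text), lines)
```

Decoding with "utf-16" rather than "utf-16-le" lets the BOM decide the byte order, which seems the safer choice when the file may come from another program.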
> When saving
> files for that platform should I make the line endings u'\r\n' ? (This
> sequence obviously encodes to four bytes in UTF16). I would only do
> this to ensure compatibility with other programs the user may use to
> create the text files.
Notepad will read u'\r\n'. It doesn't like '\n' or u'\n'. Some
applications are OK with other line ends, but '\r\n' and u'\r\n' are
safest on Windows.
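One way to produce such a file is sketched below, again in Python 3 syntax. The helper name encode_for_notepad is hypothetical, and the choice of little-endian with an explicit BOM is an assumption made to match what Notepad wrote in the example above.

```python
import codecs

def encode_for_notepad(text):
    """Return UTF-16 little-endian bytes with a BOM and CRLF line ends (hypothetical helper)."""
    # Normalise any existing line ends to LF first, so CRLF is not doubled up.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    windows_text = normalised.replace("\n", "\r\n")
    # Use the explicit little-endian codec and prepend the BOM by hand,
    # so the byte order does not depend on the machine running the code.
    return codecs.BOM_UTF16_LE + windows_text.encode("utf-16-le")

data = encode_for_notepad("Fuzzy\nEnd of lines")
print(repr(data))
```

The result should be written to a file opened in binary mode ("wb"), so that no further line-end translation is applied on Windows.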
Neil