[Tutor] unicode utf-16 and readlines
Poor Yorick
gp@pooryorick.com
Sat Jan 4 09:22:01 2003
On Windows 2000 with Python 2.2.1, readlines() on a file object seems
to split lines incorrectly when the file is encoded as UTF-16. For example:
>>> fh = open('0022data2.txt')
>>> a = fh.readlines()
>>> print a
['\xff\xfe\xfaQ\r\x00\n', '\x00']
In this example, Python seems to have incorrectly parsed the \r\n
characters at the end of the line. It's an error that one can work
around by slicing off the last three characters of every other list
element, but it makes working with UTF-16 files unintuitive,
especially for beginners. Or am I missing something?
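For reference, here is a minimal sketch that reproduces the split byte by byte. It assumes a current Python 3 interpreter rather than the 2.2.1 used above, and it assumes the file holds exactly the bytes printed in the session; the names raw, byte_lines and text_lines are only illustrative. It suggests that the line is being cut at the single byte 0x0A, which in little-endian UTF-16 is only the first half of the two-byte newline.

```python
import io

# Assumed file contents, taken from the session above: a BOM (ff fe),
# one character (fa 51), then "\r\n" encoded as UTF-16-LE.
raw = b'\xff\xfe\xfaQ\r\x00\n\x00'

# A byte-oriented readlines() cuts after every 0x0A byte, so the high
# byte of the UTF-16 newline (0x00) is left over as its own element.
byte_lines = io.BytesIO(raw).readlines()
print(byte_lines)   # [b'\xff\xfe\xfaQ\r\x00\n', b'\x00']

# Decoding the bytes as UTF-16 before splitting keeps the line whole.
text_lines = raw.decode('utf-16').splitlines(True)
print(text_lines)   # ['出\r\n']
```

If this reading is right, the leftover '\x00' is not a stray character but the second byte of the encoded \n, so decoding before splitting may be a cleaner route than slicing.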
Poor Yorick
gp@pooryorick.com