universal newlines and utf-16
Baz Walter
bazwal at ftml.net
Sun Apr 11 10:12:14 EDT 2010
i am using python 2.6 on a linux box and i have some utf-16 encoded
files with crlf line-endings which i would like to open with universal
newlines.
so far, i have been unable to get this to work correctly.
for example:
>>> open('test.txt', 'w').write(u'a\r\nb\r\n'.encode('utf-16'))
>>> repr(open('test.txt', 'rbU').read().decode('utf-16'))
"u'a\\n\\nb\\n\\n'"
>>> import codecs
>>> repr(codecs.open('test.txt', 'rbU', 'utf-16').read())
"u'a\\n\\nb\\n\\n'"
of course, the output i want is:
"u'a\\nb\\n'"
i suppose it's not too surprising that the built-in open converts the
line endings before decoding, but it surprised me that codecs.open does
this as well.
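
to illustrate why the newlines come out doubled, here is a rough simulation of newline translation applied to the raw bytes before decoding. the helper name translate_newlines_bytewise is made up for this sketch, and the replace calls only approximate what the file object does in 'U' mode. utf-16-le is used so that the result does not depend on a byte-order mark:

```python
def translate_newlines_bytewise(raw):
    """Hypothetical helper: approximate 'U' mode translation on raw bytes."""
    # CR LF becomes LF, then any remaining lone CR becomes LF
    return raw.replace(b'\r\n', b'\n').replace(b'\r', b'\n')

# in utf-16-le every character is followed by a zero byte, so the CR and
# LF bytes are never adjacent and each one is treated as a lone newline
raw = u'a\r\nb\r\n'.encode('utf-16-le')
mangled = translate_newlines_bytewise(raw).decode('utf-16-le')
print(repr(mangled))
```

in this sketch each cr byte is rewritten on its own, so every crlf pair decodes as two lf characters, which matches the doubled newlines in the output shown above.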
is there a way to get universal newlines to work properly with utf-16 files?
(nb: i'm not interested in other methods of converting line endings -
just whether universal newlines can be made to work correctly).
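
one possible approach, offered as a sketch rather than a definitive answer, is the io module that was added in python 2.6. io.open decodes first and applies newline translation to the decoded text, and newline=None (the default) selects universal newlines. the function name read_utf16_universal and the temporary file are only for illustration:

```python
import io
import os
import tempfile

def read_utf16_universal(path):
    """Hypothetical helper: read a utf-16 file with universal newlines."""
    # newline=None enables universal newlines in the text layer,
    # which runs after the bytes have been decoded
    with io.open(path, 'r', encoding='utf-16', newline=None) as f:
        return f.read()

# write a small utf-16 file with crlf line endings, then read it back
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    with io.open(path, 'wb') as f:
        f.write(u'a\r\nb\r\n'.encode('utf-16'))
    text = read_utf16_universal(path)
finally:
    os.remove(path)
print(repr(text))
```

because translation happens on decoded text here, each crlf pair should collapse to a single lf. the io module in 2.6 is known to be slow, so this may not suit large files.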