UTF-16 encoding line breaks?

Chris Reedy creedy at mitretek.org
Wed Jun 11 21:47:03 CEST 2003


Martin v. Löwis wrote:
> "Richard" <richardd at hmgcc.gov.uk> writes:
> 
> 
>>I have a script which uses the .encode('UTF-16') function to encode a string
>>into UTF-16. However I am having difficulties in putting line breaks into
>>that string. \n is what I normally use but does not appear to become valid
>>UTF-16 once encoded.
> 
> 
> Can you demonstrate that? It works fine for me.
> 
> Regards,
> Martin

Here's an example that, if it's not an outright error, at least confuses me:

Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32
IDLE 0.8 -- press F1 for help
 >>> import codecs
 >>> testfile = codecs.open('testfile.txt', mode='wb', encoding='utf16')
 >>> testfile.write('abc\r\ndef\r\n')
 >>> testfile.close()
 >>> testfile = codecs.open('testfile.txt', encoding='utf16')
 >>> testfile.read(100)
u'abc\r\ndef\r\n'
 >>> testfile.close()

Everything's ok so far.

 >>> testfile = codecs.open('testfile.txt', encoding='utf16')
 >>> testfile.readlines()
[u'abc\r\n', u'def\r\n']
 >>> testfile.close()

That looks fine.

 >>> testfile = codecs.open('testfile.txt', encoding='utf16')
 >>> testfile.readline()
Traceback (most recent call last):
   File "<pyshell#27>", line 1, in ?
     testfile.readline()
   File "C:\DD\Python\lib\codecs.py", line 330, in readline
     return self.reader.readline(size)
   File "C:\DD\Python\lib\codecs.py", line 252, in readline
     return self.decode(line, self.errors)[0]
UnicodeError: UTF-16 decoding error: truncated data
 >>>

Huh?
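
One possible reading of the traceback, not confirmed against the codecs source: readline() seems to pull a raw line from the underlying byte stream and only then decode it (the `self.decode(line, self.errors)` frame above suggests this). In little-endian UTF-16, u'\n' is the two bytes 0A 00, so a byte-level line split ends right after the 0A and leaves half a code unit behind. The sketch below reproduces that effect by hand. It is an illustration under assumptions, not the library's actual code: it uses the explicit 'utf-16-le' codec (no BOM) so the byte layout is predictable, and the names `data`, `cut` and `first_line` are made up for the example.

```python
# Illustration only: split UTF-16 bytes at the first 0x0A byte, the way a
# byte-oriented readline() would, and try to decode the resulting piece.
text = u'abc\r\ndef\r\n'
data = text.encode('utf-16-le')   # assumption: little-endian, no BOM

# A byte-level readline() returns everything up to and including b'\n'.
cut = data.index(b'\n') + 1
first_line = data[:cut]

print(repr(first_line))           # ends in the lone 0x0A byte
print(len(first_line) % 2)        # odd length: the last code unit is cut in half

try:
    first_line.decode('utf-16-le')
    truncated = False
except UnicodeError as exc:
    truncated = True
    print(exc)                    # reports truncated data

# Taking one more byte (the 0x00 that completes the '\n') decodes cleanly.
whole_line = data[:cut + 1].decode('utf-16-le')
```

If this reading is right, it would also fit with read() and readlines() working above, provided they decode the data in bulk and only split into lines afterwards.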

   All explanations gratefully appreciated, Chris






