[issue1753] TextIOWrapper.write writes utf BOM for every string

Erick Tryzelaar report at bugs.python.org
Mon Jan 7 09:50:34 CET 2008


New submission from Erick Tryzelaar:

I was playing around with Python 3's io functions and found that, when writing to a
file opened with the utf-16 encoding, TextIOWrapper.write re-writes the UTF-16 BOM for every string:

>>> f=open('foo', 'w', encoding='utf-16')
>>> print('1234', file=f)
>>> print('5678', file=f)
>>> open('foo', 'rb').read()
b'\xff\xfe1\x002\x003\x004\x00\xff\xfe\n\x00\xff\xfe5\x006\x007\x008\x00\xff\xfe\n\x00'
>>> open('foo', 'r', encoding='utf-16').read()
'1234\ufeff\n\ufeff5678\ufeff\n'
>>> 

With the attached patch, it appears to generate the correct file:

>>> f=open('foo', 'w', encoding='utf-16')
>>> print('1234', file=f)
>>> print('5678', file=f)
>>> open('foo', 'rb').read()
b'\xff\xfe1\x002\x003\x004\x00\n\x005\x006\x007\x008\x00\n\x00'
>>> open('foo', 'r', encoding='utf-16').read()
'1234\n5678\n'
>>>
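
For illustration, the difference between the two outputs above can be reproduced outside of io with the codecs module. This is only a sketch, assuming the repeated BOMs come from encoding each written string on its own with the stateless codec. It is not the attached patch and may not match how the patch fixes the problem. The helper names encode_each_write and encode_incrementally are made up for this example.

```python
import codecs


def encode_each_write(chunks, encoding="utf-16"):
    # Hypothetical helper: every chunk is encoded independently, so the
    # stateless codec prepends a BOM to each one.
    return b"".join(chunk.encode(encoding) for chunk in chunks)


def encode_incrementally(chunks, encoding="utf-16"):
    # Hypothetical helper: a single incremental encoder is kept for the
    # whole stream, so the BOM is emitted only before the first chunk.
    encoder = codecs.getincrementalencoder(encoding)()
    data = b"".join(encoder.encode(chunk) for chunk in chunks)
    # Flush any state the encoder may still hold.
    return data + encoder.encode("", final=True)


# The two print() calls above write the text and the newline separately.
chunks = ["1234", "\n", "5678", "\n"]

bad = encode_each_write(chunks)
good = encode_incrementally(chunks)

# codecs.BOM_UTF16 is the BOM in the platform's native byte order,
# which is what the plain "utf-16" codec writes.
print(bad.count(codecs.BOM_UTF16), good.count(codecs.BOM_UTF16))
print(repr(bad.decode("utf-16")))
print(repr(good.decode("utf-16")))
```

The per-chunk version should contain one BOM per written string, and decoding it should give back the stray '\ufeff' characters seen in the first session. The incremental version should contain a single BOM at the start and decode cleanly.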

----------
components: Library (Lib)
files: io.py.patch
messages: 59438
nosy: erickt
severity: normal
status: open
title: TextIOWrapper.write writes utf BOM for every string
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file9091/io.py.patch

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1753>
__________________________________

