[issue1753] TextIOWrapper.write writes utf BOM for every string
Erick Tryzelaar
report at bugs.python.org
Mon Jan 7 09:50:34 CET 2008
New submission from Erick Tryzelaar:
I was playing around with Python 3's io functions and found that, when writing to a file
opened with encoding='utf-16', TextIOWrapper.write re-writes the UTF-16 BOM for every string:
>>> f=open('foo', 'w', encoding='utf-16')
>>> print('1234', file=f)
>>> print('5678', file=f)
>>> open('foo', 'rb').read()
b'\xff\xfe1\x002\x003\x004\x00\xff\xfe\n\x00\xff\xfe5\x006\x007\x008\x00\xff\xfe\n\x00'
>>> open('foo', 'r', encoding='utf-16').read()
'1234\ufeff\n\ufeff5678\ufeff\n'
>>>
With the attached patch, it appears to generate the correct file:
>>> f=open('foo', 'w', encoding='utf-16')
>>> print('1234', file=f)
>>> print('5678', file=f)
>>> open('foo', 'rb').read()
b'\xff\xfe1\x002\x003\x004\x00\n\x005\x006\x007\x008\x00\n\x00'
>>> open('foo', 'r', encoding='utf-16').read()
'1234\n5678\n'
>>>
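The output above is consistent with each write() call encoding its string independently, since the utf-16 codec prepends a BOM to every stateless encode. The sketch below illustrates that mechanism with the standard codecs module. It is an assumption about the cause, not a copy of what io.py.patch does, and the helper names encode_stateless and encode_incremental are made up for illustration:

```python
import codecs

def encode_stateless(chunks, encoding="utf-16"):
    # Hypothetical helper: encode each chunk on its own, the way the
    # buggy behaviour appears to. Every call emits a fresh BOM.
    return b"".join(chunk.encode(encoding) for chunk in chunks)

def encode_incremental(chunks, encoding="utf-16"):
    # Hypothetical helper: keep one incremental encoder for the whole
    # stream, so the BOM is emitted only by the first encode() call.
    encoder = codecs.getincrementalencoder(encoding)()
    return b"".join(encoder.encode(chunk) for chunk in chunks)

# print('1234', file=f) results in two writes: the text, then the newline.
chunks = ["1234", "\n", "5678", "\n"]

stateless = encode_stateless(chunks)
incremental = encode_incremental(chunks)

# codecs.BOM_UTF16 is the BOM in the platform's native byte order,
# which is also the order the plain 'utf-16' codec encodes with.
print("BOMs, stateless encode:  ", stateless.count(codecs.BOM_UTF16))
print("BOMs, incremental encode:", incremental.count(codecs.BOM_UTF16))
print(repr(stateless.decode("utf-16")))
print(repr(incremental.decode("utf-16")))
```

Keeping a single encoder per stream is one way to get a single BOM at the start of the file; whether the attached patch takes that route is not stated here.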
----------
components: Library (Lib)
files: io.py.patch
messages: 59438
nosy: erickt
severity: normal
status: open
title: TextIOWrapper.write writes utf BOM for every string
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file9091/io.py.patch
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1753>
__________________________________