[New-bugs-announce] [issue6788] codecs.open on Win32 does not force binary mode

Ryan McGuire report at bugs.python.org
Thu Aug 27 06:17:09 CEST 2009


New submission from Ryan McGuire <python.org at enigmacurry.com>:

Opening a UTF-8 encoded file with Unix newlines ("\n") on Win32:

codecs.open("whatever.txt","r","utf-8").read()

replaces the newlines ("\n") with CR+LF ("\r\n").

The docs specifically say:

"Files are always opened in binary mode, even if no binary mode was
specified. This is done to avoid data loss due to encodings using 8-bit
values. This means that no automatic conversion of '\n' is done on
reading and writing."

And yet, opening the file with an explicit binary mode avoids the
conversion:

codecs.open("whatever.txt","rb","utf-8").read()

This reads the file with the original newlines unmodified.
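
The two calls can be compared side by side with a small script. This is
only a sketch for reproducing the report: the helper name read_both_ways
and the temporary file are assumptions made for illustration, not part of
the codecs module.

```python
import codecs
import os
import tempfile


def read_both_ways(raw_bytes):
    """Write raw_bytes to a scratch file, then read it back through
    codecs.open using mode "r" and mode "rb" (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Write the bytes exactly as given, with no newline translation.
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # Mode without "b": the documentation says binary mode is forced.
        with codecs.open(path, "r", "utf-8") as f:
            text_r = f.read()
        # Mode with an explicit "b".
        with codecs.open(path, "rb", "utf-8") as f:
            text_rb = f.read()
    finally:
        os.remove(path)
    return text_r, text_rb


text_r, text_rb = read_both_ways(b"line one\nline two\n")
print(repr(text_r))
print(repr(text_rb))
print("modes agree: %s" % (text_r == text_rb))
```

If the documented behavior holds, both reads should return the same text
with "\n" intact; a difference between the two values on Win32 would
match what is described here.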

The implementation of codecs.open and the documentation are out of sync.
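
The quoted documentation also covers writing. A minimal sketch for
checking that side, assuming a scratch file in a temporary directory (the
file name newline_check.txt is made up for this example):

```python
import codecs
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "newline_check.txt")

# Write text containing bare "\n" through codecs.open, with no "b" in the mode.
with codecs.open(path, "w", "utf-8") as f:
    f.write(u"first\nsecond\n")

# Inspect the bytes on disk using the built-in open in explicit binary mode.
with open(path, "rb") as f:
    raw = f.read()

print(repr(raw))

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```

According to the documentation quoted above, the bytes on disk should
contain "\n" only, with no "\r" inserted, regardless of platform.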

----------
assignee: georg.brandl
components: Documentation, Library (Lib)
messages: 91995
nosy: EnigmaCurry, georg.brandl
severity: normal
status: open
title: codecs.open on Win32 does not force binary mode
type: behavior
versions: Python 2.6, Python 3.1

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6788>
_______________________________________

