
Victor Stinner wrote:
In Python 2, open() opens the file in binary mode (e.g. file.readline() returns a byte string). codecs.open() also opens the file in binary mode by default; you have to specify an encoding name to open it in text mode.
In Python 3, open() opens the file in text mode by default. (It opens the file in binary mode only if the file mode contains "b".) The problem is that open() uses the locale encoding if the encoding is not specified, which is the case *by default*. The locale encoding can be:
- UTF-8 on Mac OS X and most Linux distributions
- ISO-8859-1 on some FreeBSD systems
- the ANSI code page on Windows, e.g. cp1252 (close to ISO-8859-1) in Western Europe, cp932 in Japan, ...
- ASCII if the locale is manually set to an empty string or to "C", or if the environment is empty, or by default on some systems
- something different depending on the system and user configuration...
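To see which of these cases applies on a given machine, something like the following minimal sketch may help. It assumes Python 3, where locale.getpreferredencoding(False) reports the encoding that open() falls back to when none is given; the scratch file name is hypothetical. Passing encoding explicitly takes the locale out of the picture.

```python
import locale
import os
import tempfile

# The encoding that open() falls back to when no encoding is passed.
default_encoding = locale.getpreferredencoding(False)
print("locale encoding:", default_encoding)

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# An explicit encoding makes the result independent of the locale.
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")

with open(path, encoding="utf-8") as f:
    text = f.read()

# Binary mode shows the bytes that were actually written.
with open(path, "rb") as f:
    raw = f.read()
```

The value printed for the locale encoding varies from system to system, which is exactly the portability concern described above.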
If you develop under Mac OS X or Linux, you may be surprised when you run your program on Windows and it hits the first non-ASCII character. You may not detect the problem if you only write text in English... until someone writes the first letter with a diacritic.
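The effect can be reproduced without touching the locale at all. This is only an illustrative sketch: it decodes the same bytes with the encodings named above and assumes the text was written as UTF-8 and read back with cp1252 or ASCII.

```python
# Bytes as a UTF-8 system (e.g. Linux or Mac OS X) would write them.
data = "\u00e9".encode("utf-8")  # U+00E9, "e" with an acute accent

# Read back with the Western European Windows code page: no error is
# raised, but each of the two bytes becomes its own character.
as_cp1252 = data.decode("cp1252")

# Read back under a "C" locale (ASCII): decoding fails outright.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Pure-ASCII text decodes identically everywhere, which is why the
# problem stays hidden until the first non-ASCII character appears.
plain = "hello".encode("utf-8")
same_everywhere = (
    plain.decode("utf-8") == plain.decode("cp1252") == plain.decode("ascii")
)
```

Note that the cp1252 case fails silently, producing wrong text rather than an exception, so it can go unnoticed for longer than the ASCII case.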
How about a more radical change: have open() in Py3 default to opening the file in binary mode if no encoding is given (even if the mode doesn't include 'b')? That would make it compatible with the Py2 world again and avoid all the encoding guessing.
Making such default encodings depend on the locale already failed to work when we first introduced a default encoding in Py2, so I don't understand why we are repeating the same mistake in Py3 (only in a different area).
Note that in Py2, Unix applications often leave out the 'b' mode, since there's no difference between using it or not. You'll only see a difference on Windows.
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Jun 28 2011)