[issue19846] Python 3 raises Unicode errors with the C locale

Larry Hastings report at bugs.python.org
Fri Dec 13 12:58:51 CET 2013


Larry Hastings added the comment:

> "The fact that write() -> open() relies on sys.getfilesystemencoding()
> (respectively locale.getpreferredencoding()) at default as encoding is
> either a defect or a bad design (I leave the decision to you)."
>
> Or am I overlooking something?

First, you should probably just drop mentioning write() or print() or any of the functions that actually perform I/O.  The crucial decisions about decoding are made inside open().

Second, open() is implemented in C.  It cannot "rely on sys.getfilesystemencoding()" as it never calls it.  Internally, sys.getfilesystemencoding() simply returns a C global called Py_FileSystemDefaultEncoding.  But open() doesn't examine that, either.

Instead, open() determines the default encoding by calling the same function that's used to initialize Py_FileSystemDefaultEncoding: get_locale_encoding() in Python/pythonrun.c.  On POSIX systems, that function in turn calls the POSIX function nl_langinfo().
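
To see these values from Python rather than from the C source, something like the following rough sketch should work.  The helper name describe_locale_encodings() is made up for illustration.  locale.nl_langinfo() and locale.CODESET are only available on POSIX builds, so the sketch guards for them, and newer interpreters may override the locale-derived value, so the three results need not always agree.

```python
import locale
import sys


def describe_locale_encodings():
    """Collect the encodings Python reports for the current locale.

    Hypothetical helper, for illustration only.
    """
    info = {
        # Encoding used for file names; exposed by sys.getfilesystemencoding().
        "filesystem": sys.getfilesystemencoding(),
        # Default encoding for text data; False means "do not call setlocale()".
        "preferred": locale.getpreferredencoding(False),
    }
    # nl_langinfo() and CODESET exist only on POSIX builds of Python.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        info["nl_langinfo"] = locale.nl_langinfo(locale.CODESET)
    return info


if __name__ == "__main__":
    for name, value in describe_locale_encodings().items():
        print(f"{name}: {value}")
```

Running this once normally and once with LC_ALL=C in the environment is a quick way to compare what the interpreter reports under each locale.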

If you want to see the actual mechanisms involved, you should read the C source code in Modules/_io in the Python trunk.  open() is implemented as the C function io_open() in _iomodule.c.  When it opens a file in text mode without an explicit encoding, it wraps it in a TextIOWrapper object; the __init__ function for this class is the C function textiowrapper_init() in textio.c.
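
The effect is also observable without reading the C code.  The sketch below (default_text_encoding() is a made-up helper name) opens a temporary file in text mode without passing encoding= and reports which encoding the resulting TextIOWrapper picked.  I'd expect it to match locale.getpreferredencoding(False) once both names are normalized through codecs.lookup(), but treat that as an assumption to check on your own interpreter.

```python
import codecs
import io
import locale
import os
import tempfile


def default_text_encoding():
    """Open a temp file in text mode with no encoding= and report what open() chose.

    Hypothetical helper, for illustration only.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w") as f:  # deliberately no encoding= argument
            # Text-mode open() hands back a TextIOWrapper around the raw file.
            assert isinstance(f, io.TextIOWrapper)
            return f.encoding
    finally:
        os.remove(path)


if __name__ == "__main__":
    chosen = default_text_encoding()
    preferred = locale.getpreferredencoding(False)
    print("open() chose:       ", chosen)
    print("preferred encoding: ", preferred)
    # Normalize aliases (e.g. "UTF-8" vs "utf-8") before comparing.
    print("same codec:", codecs.lookup(chosen).name == codecs.lookup(preferred).name)
```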

As for your assertion that this is "either a defect or a bad design": I leave the critique of that to others.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________

