Issue #11022: locale.getpreferredencoding() must not set temporary LC_CTYPE

Hi,

I would like to know whether it is too late to change the behaviour of open() for text files (TextIOWrapper). Currently, it calls locale.getpreferredencoding() to get the locale encoding by default. It is convenient and I like this behaviour... except that it temporarily changes the LC_CTYPE locale to get the user preferred encoding instead of using the current encoding.

Python 3 already uses the user preferred encoding as the current encoding at startup. Temporarily changing the current encoding to the user preferred encoding is useless and dangerous (it may cause issues in multithreaded applications). It is also surprising that setting the current locale with locale.setlocale() does not affect TextIOWrapper.

The change is just to replace locale.getpreferredencoding() by locale.getpreferredencoding(False) in the io module.

Can I change this behaviour (before the first beta) in Python 3.3? See the issue #11022 (and maybe also #6203) for the discussion on this change.

--

Leaving LC_CTYPE unchanged (use the "C" locale, which is ASCII in most cases) at Python startup would be a major change in Python 3. I don't want to do that. You would see a lot of mojibake in your GUIs and get a lot of ugly surrogate characters in filenames because of PEP 383. Setting LC_CTYPE to the user preferred encoding is just very convenient and helps Python to speak to the user through the console, to the filesystem, to pass arguments on the command line of a subprocess, etc. For example, if your current LC_CTYPE locale is C (ASCII), you cannot pass non-ASCII characters (such as characters typed by the user in your GUI) to a subprocess: you get a Unicode encode error.

Victor
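A minimal sketch of the difference between the two calls discussed above, assuming a POSIX platform (on Windows the do_setlocale argument has no effect). The helper name describe_encodings is hypothetical and used only for illustration. It records the LC_CTYPE setting before and after both calls, so one can check that the default form only changes the locale temporarily.

```python
import locale


def describe_encodings():
    """Compare getpreferredencoding() with and without do_setlocale.

    Hypothetical helper, for illustration only.
    """
    # Passing no second argument to setlocale() only queries the setting.
    before = locale.setlocale(locale.LC_CTYPE)

    # Uses the current LC_CTYPE locale as-is; no setlocale() call is made.
    current = locale.getpreferredencoding(False)

    # Temporarily switches LC_CTYPE to the user's environment settings
    # (setlocale(LC_CTYPE, "")), reads the encoding, then restores it.
    preferred = locale.getpreferredencoding(True)

    after = locale.setlocale(locale.LC_CTYPE)
    return before, current, preferred, after


if __name__ == "__main__":
    before, current, preferred, after = describe_encodings()
    print("LC_CTYPE before:", before)
    print("getpreferredencoding(False):", current)
    print("getpreferredencoding(True): ", preferred)
    print("LC_CTYPE after: ", after)
```

The two encodings usually agree when the process has not touched its locale or environment since startup. The window between the temporary setlocale() call and the restore is what makes the default form risky in multithreaded programs.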

> Can I change this behaviour (before the first beta) in Python 3.3?
Fine with me. That code predates 43e32b2b4004. I don't recall a discussion about setting the LC_CTYPE locale and not restoring it, but apparently this is what Python currently does, which means that another setlocale call is not necessary. So in theory, your change should have no effect, unless somebody has modified some environment variables.

Regards,
Martin

> Fine with me.
Ok, done with changeset 2587328c7c9c.
> So in theory, your change should have no effect, unless somebody has modified some environment variables.
Changing TextIOWrapper to call locale.getpreferredencoding(False) instead of locale.getpreferredencoding() has these two effects:

1) without the patch, setting the LC_ALL, LC_CTYPE or LANG environment variable changes the encoding used by TextIOWrapper.
2) with the patch, setting LC_CTYPE (with locale.setlocale) changes the encoding used by TextIOWrapper.

IMO (2) is less surprising than (1). For example, it is the behaviour expected by the reporter of the issue #11022.

In practice, it should not change anything for most people.

Victor
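The behaviour with the patch can be checked with a rough sketch like the following. It assumes a CPython version where TextIOWrapper, given no explicit encoding, derives its default from the current LC_CTYPE locale. The helper names default_text_encoding and same_codec are hypothetical. Codec names are normalized with codecs.lookup() because the same encoding may be spelled differently (for example "UTF-8" and "utf-8").

```python
import codecs
import io
import locale


def default_text_encoding():
    """Return the encoding TextIOWrapper picks when none is given.

    Hypothetical helper, for illustration only.
    """
    wrapper = io.TextIOWrapper(io.BytesIO())
    try:
        return wrapper.encoding
    finally:
        wrapper.close()


def same_codec(name_a, name_b):
    """Return True if both names resolve to the same registered codec."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


if __name__ == "__main__":
    chosen = default_text_encoding()
    current = locale.getpreferredencoding(False)
    print("TextIOWrapper default:      ", chosen)
    print("getpreferredencoding(False):", current)
    print("same codec:", same_codec(chosen, current))
```

Calling locale.setlocale(locale.LC_CTYPE, ...) before running the check is a way to observe effect (2), but which locale names are installed depends on the system.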
Participants (2): "Martin v. Löwis", Victor Stinner