[Python-ideas] Fix default encodings on Windows
Steve Dower
steve.dower at python.org
Wed Aug 10 15:08:48 EDT 2016
On 10Aug2016 1144, Paul Moore wrote:
> I presume you'd be targeting 3.7 for this change.
Does 3.6 seem too aggressive? I think I have time to implement the
changes before beta 1, as it's mostly changing default values and
mopping up resulting breaks. (Doing something like reimplementing files
using the Win32 API rather than the CRT would be too big a task for 3.6.)
> Most text editors still (AFAIK) use
> the ANSI codepage by default, and it's the one place where an
> identifying BOM isn't possible. So your alternative may be a safer
> choice. On the other hand, files from Unix (via say github) would
> typically be UTF-8 without BOM, so it becomes a question of choosing
> the best compromise. I'm inclined to go for cross-platform and UTF-8
> and clearly document the change.
That last point was my thinking. Notepad's default is just as bad as
Python's default right now, but basically everyone acknowledges that
it's bad. I don't think we should prevent Python from behaving better
because one Windows tool doesn't.
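
To make the BOM point concrete, here is a minimal sketch (not part of the proposal itself) showing that Python's existing "utf-8-sig" codec reads UTF-8 files both with and without a BOM. The helper names read_utf8_any and demo are hypothetical and exist only for illustration.

```python
import codecs
import os
import tempfile


def read_utf8_any(path):
    """Read a UTF-8 text file, silently skipping a leading BOM if present."""
    # "utf-8-sig" strips the BOM when decoding and accepts files without one.
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()


def demo():
    """Write the same text with and without a BOM and read both back."""
    text = "na\u00efve caf\u00e9"
    with tempfile.TemporaryDirectory() as tmp:
        with_bom = os.path.join(tmp, "with_bom.txt")
        without_bom = os.path.join(tmp, "without_bom.txt")
        # Typical of a Windows editor that saves "UTF-8 with signature".
        with open(with_bom, "wb") as f:
            f.write(codecs.BOM_UTF8 + text.encode("utf-8"))
        # Typical of a file checked out from a Unix-hosted repository.
        with open(without_bom, "wb") as f:
            f.write(text.encode("utf-8"))
        return read_utf8_any(with_bom), read_utf8_any(without_bom)
```

This only covers the UTF-8 cases; a file saved in the ANSI codepage carries no marker and still needs an explicit encoding argument.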
> We might want a more convenient short
> form for open(filename, "r", encoding=locale.getpreferredencoding()),
> though, to ease the transition... We'd also need to consider how the
> new default encoding would interact with PYTHONIOENCODING.
PYTHONIOENCODING doesn't affect locale.getpreferredencoding() (but it
does affect sys.std*.encoding).
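
A rough way to check this distinction, assuming nothing beyond the standard library: start a child interpreter with PYTHONIOENCODING set and compare what it reports for sys.stdout.encoding and locale.getpreferredencoding(). The function name report_encodings is hypothetical, and the stdout encoding name is normalized through codecs.lookup() so that aliases compare equal.

```python
import codecs
import os
import subprocess
import sys


def report_encodings(io_encoding):
    """Return (stdout encoding, preferred encoding) seen by a child interpreter.

    The child is started with PYTHONIOENCODING set to io_encoding. Pick an
    ASCII-compatible encoding so the parent can decode the child's output.
    """
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    code = (
        "import locale, sys\n"
        "print(sys.stdout.encoding)\n"
        "print(locale.getpreferredencoding(False))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    stdout_enc, preferred = result.stdout.decode("ascii").split()
    # Normalize the alias (e.g. "latin-1") to the codec's canonical name.
    return codecs.lookup(stdout_enc).name, preferred


if __name__ == "__main__":
    for name in ("latin-1", "ascii"):
        print(name, "->", report_encodings(name))
```

If the statement above holds, the first element follows the environment variable while the second stays the same across runs.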
> For the console, does this mean that the win_unicode_console module
> will no longer be needed when these changes go in?
That's the hope, though that module approaches the solution differently
and may still have uses. An alternative way for us to fix this whole thing
would be to bring win_unicode_console into the standard library and use
it by default (or probably whenever PYTHONIOENCODING is not specified).
> Sorry, not much in the way of direct experience or information I can
> add, but a strong +1 on the change (and I'd be happy to help where
> needed).
Testing with obscure filenames and strings is where help will be needed
most :)
Cheers,
Steve