[Python-3000] locale-aware strings ?

Guido van Rossum guido at python.org
Mon Sep 4 23:32:12 CEST 2006


On 9/4/06, David Hopwood <david.nospam.hopwood at blueyonder.co.uk> wrote:
> Guido van Rossum wrote:
> > I've always said (can someone find a quote perhaps?) that there ought
> > to be a sensible default encoding for files (including but not limited
> > to stdin/out/err), perhaps influenced by personalized settings,
> > environment variables, the OS, etc.
>
> While it should be possible to find out what the OS believes to be
> the current "system" charset (GetCPInfoEx(CP_ACP, ...) on Windows;
> LC_CHARSET environment variable on Unix), that does not mean that it
> is this charset that Python programs should normally use. When defining
> a new text-based file type, it is simpler to define it to be always UTF-8.

In this particular case I don't care what's simpler to implement, but
what's most likely to do what the user expects. If on a particular box
most files are encoded in encoding X, and the user did whatever is
necessary to tell the tools that that's their preferred encoding, I
want Python to honor that encoding when opening text files, unless the
program makes other arrangements explicitly (such as passing an
encoding parameter to open()).
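
To make that rule concrete, here is a rough sketch. It is not a proposal
for the actual open() implementation; open_text is a hypothetical helper
name, and using locale.getpreferredencoding() as the source of the
user's preference is an assumption made for illustration.

```python
import locale


def open_text(path, mode="r", encoding=None):
    """Hypothetical helper: open a text file, honoring the user's preference.

    If the caller passes no encoding, fall back to whatever the locale
    settings report as preferred. An explicit encoding always wins.
    """
    if encoding is None:
        # Assumption: the locale module reflects what the user configured.
        # Passing False avoids changing the process-wide locale as a side effect.
        encoding = locale.getpreferredencoding(False)
    return open(path, mode, encoding=encoding)
```

A program that defines its own file format can still say
open_text(path, encoding="utf-8"); everything else gets the box's
preferred encoding without having to ask for it.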

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

