[Python-Dev] PEP 263 considered faulty (for some Japanese)
Guido van Rossum
guido@python.org
Tue, 12 Mar 2002 08:27:16 -0500
> Guido> I think I can propose a compromise though: there may be two
> Guido> default encodings, one used for Python source code, and one
> Guido> for data.
[Stephen J. Turnbull]
> Why go in this direction? It's better to allow each individual stream
> to specify a codec to be implicitly applied, I think. Consider Emacs,
> for example, which allows specification of default codecs for (1) file
> contents (2) names of file system objects (3) process I/O (but not I
> and O and E separately, which has caused problems!) (4) console input
> and (5) console output. All of those are plausible candidates for
> having separate defaults in Python as well.
>
> For example, in Japan it's easy to imagine a program with local file
> contents defaulting to UTF-8 (for cross-system portability) needing to
> access the Windows 9x console and file system in Shift JIS, while
> process (eg, network) I/O might be EUC-JP if the server were Unix.
> (Yes, I'm straining, but not much.)
>
> But if you allow codecs for each stream, people who want to have
> different defaults for certain classes of stream would just derive
> classes which initialized the default codec appropriately.
Attaching codecs to streams is currently pretty painful AFAICT (I've
never tried it :-), but I think your idea has merit: there are
sufficiently many different contexts where an encoding must be
specified that it makes sense to allow setting different defaults for
the different contexts. The issue of filename encoding is one with
which we (well, some of us) have struggled recently. We'd have to
think more about which contexts exactly to consider; for now I can
come up with:
- file I/O;
- OS filenames;
- implicit mixing of 8-bit and Unicode strings;
- invocation of unicode(s) or u.decode() without an encoding.
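
To make the idea of per-context defaults concrete, here is a rough
sketch of how separate defaults can be inspected in a present-day
Python 3 interpreter. It is only an illustration, not part of the
proposal: the helper name context_defaults and the dictionary labels
are made up here. Python 3 no longer mixes byte strings and text
implicitly, so that context has no default to report.

```python
import locale
import sys


def context_defaults():
    """Return the encoding defaults a Python 3 interpreter reports per context.

    The function name and the dictionary keys are hypothetical labels
    chosen for this sketch; only the stdlib calls are real.
    """
    return {
        # Default used by open() in text mode when no encoding is given
        # (outside UTF-8 mode).
        "file I/O": locale.getpreferredencoding(False),
        # Encoding used to convert between str filenames and OS bytes.
        "OS filenames": sys.getfilesystemencoding(),
        # Default for str.encode() / bytes.decode() without an argument.
        "encode/decode without an encoding": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for context, encoding in context_defaults().items():
        print(f"{context}: {encoding}")
```

The values differ from platform to platform, which is rather the
point: each context already needs its own answer.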
I see your proposal as a possible future generalization of my
two-encodings proposal, not as an incompatible alternative.
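
As a minimal sketch of the per-stream idea (each stream carries its
own codec, and subclasses pick a different default), something along
these lines works with the standard codecs module in Python 3. The
class names DefaultCodecStream, ConsoleStream and NetworkStream and
the helper transcode are hypothetical; the choice of Shift JIS for
the console and EUC-JP for the network just follows the example
quoted above.

```python
import codecs
import io


class DefaultCodecStream:
    """Hypothetical wrapper that applies a codec implicitly to a byte stream."""

    default_codec = "utf-8"  # subclasses override this class-level default

    def __init__(self, raw, codec=None):
        # An explicit codec wins; otherwise fall back to the class default.
        self.codec = codec or self.default_codec
        self._reader = codecs.getreader(self.codec)(raw)
        self._writer = codecs.getwriter(self.codec)(raw)

    def read(self):
        """Read and decode everything remaining in the byte stream."""
        return self._reader.read()

    def write(self, text):
        """Encode text with this stream's codec and write the bytes."""
        self._writer.write(text)


class ConsoleStream(DefaultCodecStream):
    default_codec = "shift_jis"  # e.g. a Windows 9x console in Japan


class NetworkStream(DefaultCodecStream):
    default_codec = "euc_jp"  # e.g. talking to a Unix server


def transcode(text, stream_cls):
    """Write text through the given stream class and return the raw bytes."""
    raw = io.BytesIO()
    stream_cls(raw).write(text)
    return raw.getvalue()
```

With this, transcode(text, ConsoleStream) and
transcode(text, NetworkStream) produce different byte sequences for
the same Japanese text, and callers never name a codec themselves
unless they want to override the default.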
In the light of the post by Atsuo Ishimoto and the responses from both
Marc-Andre Lemburg and Martin von Loewis, however, I'm not sure
whether Suzuki Hisao's response represents the Japanese community, and
it's possible that nothing needs to be done.
--Guido van Rossum (home page: http://www.python.org/~guido/)