[Python-Dev] Python-3.0, unicode, and os.environ

Stephen J. Turnbull stephen at xemacs.org
Fri Dec 12 09:57:20 CET 2008

Toshio Kuratomi writes:
 > Adam Olsen wrote:
 > > On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
 > >> Unfortunately, even programmers experienced in I18N like Martin, and
 > >> those with intuition-that-has-the-force-of-law<wink> like Guido,
 > >> express deliberate disbelief on this point.  They say that filesystem
 > >> names and environment variable values are text, which is true from the
 > >> semantic viewpoint but can't be fully supported by any implementation.
 > > 
 > > With all the focus on backup tools and file managers I think we've
 > > lost perspective.  They're an important use case, but hardly the
 > > dominant one.
 > > 
 > > Please, as a user, if your app is creating new files, do NOT use
 > > bytes!  You have no excuse for creating garbage, and garbage doesn't
 > > help the user any.  Get the encoding right, use the unicode APIs,
 > > and don't pass the buck on to everything else.
 > > 
 > Uhmmm.... That's good advice but doesn't solve any problems :-(.

Exactly.  Furthermore, the problems *already exist*.  My current
locale is UTF-8 and all files dated since about 2002 have UTF-8 names,
*except* in my MIME-bodies garbage can, where I have only recently got
around to coercing my MUA into doing the right thing.  And of course
there are still legacy file names in EUC-JP.  I suppose I could search
for them, but since I only access a directory containing one once in
a pale blue moon, I'm not gonna bother.
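
As a rough illustration of why such names are a problem in a UTF-8
locale, here is a minimal Python sketch.  The file name and the helper
decode_name are hypothetical, made up for this example; nothing here is
taken from an actual directory listing.

```python
# Hypothetical legacy file name: Japanese text stored as EUC-JP bytes,
# the way an old file system entry might hold it.
legacy_name = "日本語.txt".encode("euc_jp")


def decode_name(raw, encoding="utf-8"):
    """Return the name as text, or None if the bytes are invalid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Read under the current (UTF-8) locale, the legacy name is not text at all.
utf8_view = decode_name(legacy_name)

# Read under the encoding it was written in, it decodes cleanly.
eucjp_view = decode_name(legacy_name, "euc_jp")
```

Under these assumptions utf8_view is None while eucjp_view is the
original name, so a program that insists on decoding every name with
the locale's encoding cannot represent this file without extra help.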

It's just not reasonable to expect users or even sysadmins to go
around cleaning up legacy data.
