[Python-Dev] Add a new "locale" codec?

Stephen J. Turnbull stephen at xemacs.org
Thu Feb 9 15:59:48 CET 2012


Victor Stinner writes:

 > There is the same problem [that encode-decode with the 'locale'
 > codec doesn't roundtrip reliably] with the filesystem encoding
 > (sys.getfilesystemencoding()),

-1 on a query to the OS that pretends to be a constant.

You see, it's not the same problem.  The difference is that 'locale'
is a constant and should correspond to a constant encoding, while
'sys.getfilesystemencoding()' is a library function that queries the
system, and it's obvious from the syntax that this is expected to
change in various circumstances, so if you want roundtripping you need
to save the result.
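
A minimal sketch of "save the result", assuming nothing beyond the
standard sys module. The filename is a hypothetical example, kept
ASCII so that any plausible filesystem encoding can represent it:

```python
import sys

# Query the OS-dependent setting exactly once and keep the answer.
fs_encoding = sys.getfilesystemencoding()

name = "example.txt"  # hypothetical filename, for illustration only

# Encode and decode with the *saved* value, never a fresh query, so
# both directions are guaranteed to use the same encoding.
raw = name.encode(fs_encoding)
restored = raw.decode(fs_encoding)
```

The point is only that both calls share one saved name; what that
name happens to be on a given system does not matter.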

Having a nondeterministic "locale" codec is just begging application
(and maybe a few middleware) programmers to use it everywhere they
don't feel like thinking about I18N.  Experience shows that that is
everywhere!

If this is needed, it should be spelled "os.getlocaleencoding()" (or
"sys.getlocaleencoding()"?).  Possibly there should be corresponding
getlocalelanguage(), getlocaleregion(), and getlocalemodifier()
functions, and they should take an optional string argument whose
appropriate component is returned.

Or maybe there should be a "parselocalestring()" function that returns
a named tuple.
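
Roughly what such a function might look like, as a sketch only.  It
assumes the common POSIX layout language[_region][.encoding][@modifier];
the name LocaleParts and its field names are made up for illustration
and are not an existing API:

```python
from collections import namedtuple

# Hypothetical result type; the field names are illustrative.
LocaleParts = namedtuple("LocaleParts", "language region encoding modifier")

def parselocalestring(s):
    """Split a POSIX-style locale string into its components.

    Components that are absent from the string are returned as None.
    """
    # Peel the optional parts off from the right: @modifier, .encoding,
    # and finally _region; whatever remains is the language.
    rest, _, modifier = s.partition("@")
    rest, _, encoding = rest.partition(".")
    language, _, region = rest.partition("_")
    return LocaleParts(language, region or None,
                       encoding or None, modifier or None)
```

With this sketch, parselocalestring("ja_JP.UTF-8").encoding gives
"UTF-8", and a bare "C" has every field but the language set to None.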

Or maybe this three-line function doesn't need to be a builtin?
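
For the record, one possible spelling of those three lines.  This is
an assumption about what is wanted, built on the existing
locale.getpreferredencoding(); the name getlocaleencoding is the
hypothetical one proposed above:

```python
import locale

def getlocaleencoding():
    # Passing False avoids calling setlocale() as a side effect; the
    # result reflects whatever locale the process currently has.
    return locale.getpreferredencoding(False)
```

Because this is visibly a function call, a caller who wants
roundtripping knows to store the returned name rather than ask twice.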

