[Python-Dev] Small issues in gettext support

Mon Apr 26 11:16:07 EDT 2004

> I'd be worried most about backwards compatibility, since the module has
> worked this way since its early days.  Also, wouldn't this be an

IMO, the current behavior is wrong, so breaking backwards
compatibility in this case would mean fixing something
important.

> opportunity for getting lots of UnicodeErrors?  E.g. my system encoding
> is 'ascii' so gettext() would fail for catalogs containing non-ascii
> characters.  I shouldn't have to change my system encoding just to avoid
> errors, but with your suggestion, wouldn't that make many catalogs
> basically unusable for me?

There are a few extra points to notice here:

- Different .mo files may have different encodings.

- The translation system is designed so that the programmer does
  not have to worry about the encoding used by the translators.

- The current scheme may encourage a bad practice: forcing
  translators to use some specific encoding to avoid breaking the
  program.

- We already have support for getting the "unicode" version of the
  string. This is currently the right way to get the translation in
  some specific encoding, since it decouples the encoding used in
  the catalog from the encoding the program expects.

- In the cases where the proposed behavior would raise a
  UnicodeError, the current behavior shows you a mangled,
  unreadable string instead. To avoid the UnicodeError, we could
  also return the original string whenever the conversion fails.
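
To make the last two points concrete, here is a rough sketch of the
fallback idea. It is not a patch for the gettext module, and it is
written for a Python where gettext() already returns a Unicode string.
DictTranslations and translate_encoded() are hypothetical names made up
for illustration; the dict stands in for a real .mo catalog, and the
sketch assumes the original message ids can be encoded in the target
encoding.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a real .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Return the Unicode translation, or the original if there is none.
        return self._messages.get(message, message)


def translate_encoded(catalog, message, encoding):
    """Return the translation encoded in `encoding`.

    If the translation cannot be represented in that encoding, fall
    back to the original message instead of raising UnicodeError.
    """
    text = catalog.gettext(message)
    try:
        return text.encode(encoding)
    except UnicodeError:
        # Assumption: the original message id fits the target encoding.
        return message.encode(encoding)


catalog = DictTranslations({"Hello": "Ol\u00e1"})
utf8_result = translate_encoded(catalog, "Hello", "utf-8")
latin1_result = translate_encoded(catalog, "Hello", "latin-1")
ascii_result = translate_encoded(catalog, "Hello", "ascii")
```

With an 'ascii' target the caller gets the untranslated "Hello" back
rather than an exception or a mangled string, while the catalog itself
stays free to use whatever encoding the translator chose.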

> Would adding bind_textdomain_codeset() provide a way for the
> application to change the default encoding?

Yes, it changes the default encoding.

> If so, I'd be in favor of adding bind_textdomain_codeset() but not
> changing the default encoding for returned strings.  Then update the
> documentation to describe current behavior and how to change it via
> that function call.

Thanks for your suggestion!

-- 
Gustavo Niemeyer
http://niemeyer.net