[I18n-sig] Re: gettext in the standard library

Martin von Loewis loewis@informatik.hu-berlin.de
Mon, 4 Sep 2000 15:56:56 +0200 (MET DST)


> [Martin von Loewis]
>
> > Also, after discussion, I think we concluded that supporting alternative
> > locale categories is useless; the code should always assume LC_MESSAGES.

[François Pinard]
> The charset selection could be also part of the LANG specification (after
> a period), or implied by the LC_CTYPE value (which itself might be derived
> from LC_ALL).  To make things a bit worse, many packages allow LANGUAGE
> to override LANG.

That was not the issue here. The question was whether dcgettext should
be supported, which allows specifying a category other than
LC_MESSAGES when looking for catalogs.
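
As a rough illustration of what the category argument changes: in the
usual GNU gettext directory layout, the category is one component of the
catalog path. The sketch below only builds that path; catalog_path and
its parameters are hypothetical names, not part of gettext.py.

```python
import os

def catalog_path(localedir, lang, domain, category="LC_MESSAGES"):
    """Build the conventional path <localedir>/<lang>/<category>/<domain>.mo.

    Hypothetical helper for illustration. A dcgettext-style lookup would
    pass a different category; always assuming LC_MESSAGES amounts to
    never overriding the default.
    """
    return os.path.join(localedir, lang, category, domain + ".mo")
```

With the default, catalog_path("/usr/share/locale", "de", "hello") names
the file under de/LC_MESSAGES/, which is the only place translated
catalogs are normally installed.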

> LANGUAGE is an extension of LANG allowing fallback languages,
> something that has been asked by people when `gettext' was designed
> and which looked reasonable to us (yet Richard objected that we
> lose time over this).

Yes, gettext.py supports this convention.
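
A minimal sketch of that convention, assuming the variables are
consulted in the order LANGUAGE, LC_ALL, LC_MESSAGES, LANG and that the
value is a colon-separated list. languages_from_env is a hypothetical
name, and the environment is passed in as a plain mapping so the
function is easy to try out.

```python
def languages_from_env(environ):
    """Return the preferred languages, most preferred first.

    The first variable that is set and non-empty wins; its value is
    split on ':' so that "de:sv" yields ["de", "sv"].
    """
    for name in ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG"):
        value = environ.get(name)
        if value:
            return [lang for lang in value.split(":") if lang]
    return []
```

In a real program one would call languages_from_env(os.environ).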

> I also wanted to stress another point.  Regionalised translation files
> automatically fall back on non-regionalised files when available, on a
> message per message basis.  For example, a typical `de_AT' (Austrian
> German) translation file contains only a few re-translations, the bulk
> of them is still kept within `de'.

The current gettext supports trying these in order.
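
"Trying these in order" can be pictured as expanding one locale name
into progressively less specific candidates. This is a simplified
sketch: expand_locale is a hypothetical name, and @modifier suffixes are
deliberately not handled.

```python
def expand_locale(name):
    """Expand a locale name into candidates, most specific first.

    Simplified: drops the codeset (after '.') and then the territory
    (after '_'); '@modifier' parts are not treated specially.
    """
    candidates = [name]
    base = name.split(".", 1)[0]      # strip codeset, e.g. ".ISO-8859-1"
    if base != name:
        candidates.append(base)
    lang = base.split("_", 1)[0]      # strip territory, e.g. "_AT"
    if lang != base:
        candidates.append(lang)
    return candidates
```

For example, expand_locale("de_AT") gives ["de_AT", "de"], so a de
catalog is tried when no de_AT catalog can be opened.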

However, looking at the implementation, it seems both conventions are
implemented incorrectly: The fall-backs are used when opening the
catalog. When the catalog is there, but lookup finds that a message is
not translated, it won't try the fall-backs. Instead, it will just
return the English message.
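
The difference between the two behaviours can be sketched with plain
dictionaries standing in for catalogs. Everything here is hypothetical
illustration (CATALOGS, both function names, and the sample
translations); it is not the gettext.py code.

```python
# Hypothetical in-memory catalogs; real ones would be read from .mo files.
CATALOGS = {
    "sv": {"File": "Arkiv"},                      # partial translation
    "de": {"File": "Datei", "Quit": "Beenden"},   # full translation
}

def lookup_first_catalog(languages, msgid):
    """Behaviour described above: the fall-back list is only used to
    pick a catalog. Once one is found, a missing message yields the
    untranslated msgid."""
    for lang in languages:
        catalog = CATALOGS.get(lang)
        if catalog is not None:
            return catalog.get(msgid, msgid)
    return msgid

def lookup_per_message(languages, msgid):
    """Per-message fall-back: keep trying later catalogs until one of
    them actually contains the message."""
    for lang in languages:
        catalog = CATALOGS.get(lang)
        if catalog is not None and msgid in catalog:
            return catalog[msgid]
    return msgid
```

With languages ["sv", "de"], the first function leaves "Quit"
untranslated because the sv catalog exists but lacks it, while the
second finds the German entry.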

In the case of LANGUAGE, I think this is acceptable: If you set it to
de:sv, you may get German, Swedish, or English translations. However,
in real life, you either get German or Swedish, since catalogs are
likely either full translations or not present at all.

As for de_AT falling back to de on a per-message basis - gettext.py
doesn't do that. As for 'a typical' de_AT file: I have a total of 2
de_AT files on my installation, whereas I have 211 de translations.
So it seems that the typical de_AT translation is empty, in which case
it would indeed fall back to de.

Regards,
Martin