[I18n-sig] Re: pygettext.py extraction of docstrings

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Tue, 14 Aug 2001 21:17:22 +0200


> As gettext maintainer, I'm used to think in the categories of
> programmer - translator - user. Translated docstrings are not for
> the users, because users are not programmers in general. And the
> programmers (of .py programs), who must have looked at the various
> Python manuals, certainly reads English.

This is a wrong assumption; people writing programs in Python do not
necessarily read English fluently (let alone speak it). I assume
the same is true for any other "scripting" language. E.g. for Ruby,
much of the language documentation is in Japanese, since most of the
Ruby users prefer to read Japanese documentation. Likewise, the French
translation of the Python documentation was started precisely because
users don't read English that well.

Even among my colleagues, I find that they often misinterpret English
documentation, and get the fine points only when these are pointed out
to them, and after looking up certain keywords in a dictionary. They
would not have the same problems if the documentation were available
in German.

So in your categories, these people are certainly users - of Python,
in the specific case.

>   - translated docstrings have a much smaller audience than
>     usual translated messages,

In addition to the above, I think you are missing an important detail
of Python's introspectiveness: Many Python applications present
docstrings to the user, instead of using them for documentation, by
means of accessing some object's __doc__ attribute at
runtime. E.g. you might have a drop-down menu, each item invoking a
different function. Then somebody might choose to key the online help
into the docstring. It is somewhat hackish, but common.
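
A minimal sketch of that pattern, to make it concrete. The menu and the
functions `open_file` and `save_file` are hypothetical, and
`gettext.NullTranslations` stands in for a real catalog so the example
runs without any .mo file installed (it returns each message unchanged):

```python
import gettext

# Stand-in for a real catalog: NullTranslations returns msgids unchanged.
translation = gettext.NullTranslations()
_ = translation.gettext


def open_file():
    """Open an existing file."""


def save_file():
    """Save the current file."""


# Hypothetical drop-down menu: label -> function invoked by that item.
MENU = {"Open": open_file, "Save": save_file}


def help_text(label):
    """Return the (possibly translated) online help for a menu item."""
    # __doc__ is None when docstrings are stripped (python -OO).
    doc = MENU[label].__doc__ or ""
    return _(doc.strip())


print(help_text("Open"))
```

With a real catalog in place of `NullTranslations`, the lookup only
succeeds if the docstrings were extracted into the .po file in the
first place, which is the point of having the extractor handle them.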

>   - docstrings are harder to translate, because the translator
>     needs to have programmer's know-how.

For the original purpose of docstrings, yes, certainly.

> As a consequence for gettext, I could live with an xgettext option
> --docstrings which extracts *only* the docstrings of a set of source
> files.

Again, for the application I have in mind (providing online help in
the programming process), that is acceptable. I think for Barry's
application, it is not.
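
For illustration, here is roughly what a docstrings-only extraction
would collect. This is a sketch using the standard `ast` module, not
how pygettext or xgettext actually work; `extract_docstrings` and
`SAMPLE` are made-up names:

```python
import ast


def extract_docstrings(source):
    """Return (name, docstring) pairs for the module, classes and functions."""
    tree = ast.parse(source)
    wanted = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, wanted):
            doc = ast.get_docstring(node)
            if doc is not None:
                # Module nodes have no name attribute.
                found.append((getattr(node, "name", "<module>"), doc))
    return found


SAMPLE = '''"""Module help."""

def open_file():
    """Open an existing file."""

def helper():
    pass
'''

for name, doc in extract_docstrings(SAMPLE):
    print(name, "->", doc)
```

Ordinary `_()`-marked strings are not touched here at all, which is
why a docstrings-only mode fits the online-help case but not an
application that needs both kinds of message in one catalog.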

> The GNU gettext tools are currently being modified to handle various
> programming languages. A new flag 'python-format' is being
> introduced, with appropriate format string checking in 'msgfmt'.
> xgettext will also have a Python backend, making pygettext obsolete
> (except for docstring extraction, for the time being).

It turns out that there is a "batteries included" issue here. I know a
few cases where people have been using pygettext just because it was
already on their (Windows) system, whereas GNU gettext was not that
readily available (you'd need a C compiler to build it). So while most
Unix people will switch to GNU gettext for performance reasons
(pygettext is slow), I doubt that pygettext will go away anytime soon.

Regards,
Martin