[Doc-SIG] Translation of Python documentation
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Mon, 9 Jul 2001 17:32:09 +0200
At the European Python Meeting, we had a few sessions on translating
Python documentation. These were initiated by Benoit Lacherez, who
currently manages the French Python translations (frpython.sf.net).
The French translation group currently uses a script to mark up the
original documentation, which copies the English text into commented
regions. The translator inserts the French translations between
these regions.
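As a rough illustration of that workflow (not the French group's actual
script, which is not shown here), a minimal sketch might look like the
following. It assumes paragraphs are separated by blank lines and that a
"% " prefix marks a comment, as in LaTeX; the function name
mark_up_for_translation is hypothetical.

```python
import re


def mark_up_for_translation(source: str, comment_prefix: str = "% ") -> str:
    """Copy each English paragraph into a commented region.

    Every commented region is followed by an empty line, which is the
    slot where the translator writes the translated paragraph.
    Assumption: paragraphs are separated by one or more blank lines.
    """
    out = []
    for paragraph in re.split(r"\n\s*\n", source.strip()):
        commented = [comment_prefix + line for line in paragraph.splitlines()]
        out.append("\n".join(commented))
        out.append("")  # empty slot for the translation
    return "\n".join(out)
```

Calling mark_up_for_translation on a two-paragraph text yields two
commented regions, each followed by an empty line for the translator.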
We have discussed versioning of the documentation to some extent, and
found three problems:
1. it is still unclear if and when the documentation will be converted
to XML. Having XML might simplify the translation process to some
degree, but it will also mean that the existing translations need
to be converted, as well.
2. version tracking is quite a challenge. So far, the French
translators have had problems when documentation moved from one file
to another after 1.5.2. We also anticipate further problems with
version changes, like:
- the order of paragraphs or sections may change
- changes might merely affect formatting (e.g. line breaking), but
a plain diff will display the entire paragraph as changed
3. it might be desirable to offer "incomplete" translations, which
provide translated text where it is available, and the English
documentation for the rest.
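To make problems 2 and 3 concrete, here is a small sketch under stated
assumptions: paragraphs are compared after collapsing whitespace, so
reordering and re-wrapping are not reported as changes, and a
translation falls back to English where no translated text exists. All
function names and the identifier-keyed dictionaries are hypothetical,
not part of any existing tool.

```python
from typing import Dict, List


def normalize(paragraph: str) -> str:
    """Collapse all runs of whitespace, so line breaking is ignored."""
    return " ".join(paragraph.split())


def is_formatting_only_change(old: str, new: str) -> bool:
    """True if the text differs only in whitespace or line breaks."""
    return old != new and normalize(old) == normalize(new)


def changed_paragraphs(old: List[str], new: List[str]) -> List[str]:
    """Return paragraphs of `new` whose wording is not found in `old`.

    Paragraph order is ignored, so moved paragraphs are not reported.
    """
    known = {normalize(p) for p in old}
    return [p for p in new if normalize(p) not in known]


def merge_with_fallback(english: Dict[str, str],
                        translated: Dict[str, str]) -> Dict[str, str]:
    """Build an "incomplete" translation keyed by section identifier.

    Translated text is used where available, English text otherwise.
    """
    return {key: translated.get(key, text) for key, text in english.items()}
```

With this approach only paragraphs whose wording actually changed are
reported to the translators, whereas a plain diff would also flag
re-wrapped and moved paragraphs.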
To solve these issues, we propose that
a) the conversion to XML is done sooner rather than later,
b) in the original documents, unique identifiers for sections
and *desc elements are introduced. These identifiers can
then be used in the translations to correlate the
translations with the original text. This might look like:
<funcdesc id='capitalize'>
<signature>
<name>capitalize</name>
<args>word</args>
</signature>
<description>
<para>Capitalize the first character of the argument.</para>
</description>
</funcdesc>
A script would need to check whether these identifiers are truly
unique, and whether they are present in all places (and assign them if
they aren't). I assume they could also be used for cross-referencing.
c) some sort of versioning is used in the translations. It is not
clear to me what the best approach would be; options include:
- annotate each element that has an ID with the CVS revision
number in which that element was last changed.
- annotate each such element with a hash value of its contents.
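Proposals b) and c) could be sketched roughly as follows, using Python's
standard xml.etree.ElementTree and hashlib modules. This is only an
illustration under assumptions: the element name "section", the "auto-N"
naming scheme for generated identifiers, the choice of SHA-256, and all
function names are hypothetical, since no XML format has been fixed yet.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import Counter


def needs_id(element) -> bool:
    """Assumption: sections and all *desc elements carry an id."""
    return element.tag == "section" or element.tag.endswith("desc")


def duplicate_ids(root) -> list:
    """Return the ids that occur on more than one element."""
    counts = Counter(
        el.get("id") for el in root.iter() if needs_id(el) and el.get("id")
    )
    return sorted(i for i, n in counts.items() if n > 1)


def assign_missing_ids(root) -> list:
    """Give every id-less section or *desc element a fresh "auto-N" id."""
    used = {el.get("id") for el in root.iter() if el.get("id")}
    assigned = []
    counter = 0
    for el in root.iter():
        if needs_id(el) and not el.get("id"):
            counter += 1
            while f"auto-{counter}" in used:
                counter += 1
            new_id = f"auto-{counter}"
            el.set("id", new_id)
            used.add(new_id)
            assigned.append(new_id)
    return assigned


def content_hash(element) -> str:
    """Hash the element's text with whitespace collapsed.

    Collapsing whitespace means re-wrapped lines keep the same hash.
    """
    text = " ".join("".join(element.itertext()).split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def translation_is_stale(original, recorded_hash: str) -> bool:
    """True if the original changed since the translation recorded its hash."""
    return content_hash(original) != recorded_hash
```

A translation would store, next to each identifier, the hash of the
English element it was made from; translation_is_stale then flags
exactly those elements whose English wording has changed since.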
Regards,
Martin