[Python-ideas] Support localization of unicode descriptions

Tue Feb 19 11:00:46 EST 2019

On 7/11/18 1:14 AM, Terry Reedy wrote:
> On 7/10/2018 5:20 PM, David Mertz wrote:
>> The problem with non-canonical translations of the Unicode character
>> names is that there is not one unique possible rendering into
>> language X. Equally, I could find synonyms in general English for the
>> names, but one would be official, the others at best informally
>> clarifying.
>
> If the Unicode consortium does not provide official translations, then
> we *might* 'bless' some other source.  The first place I would look
> would be the translators we already trust enough to display their work
> on the official doc page.
>
>> For informational purposes I think it's great to have a third party
>> project to find out "Unicode character named 'Something In English'
>> is roughly translated as <whatever> in your native language." But
>> it's hard to see how an unofficial loose cross-language dictionary
>> should be part of the standard library.
>
> The doc translations are intentionally not in the cpython repository
> and not in the cpython distribution and not considered part of the
> stdlib. In general, core devs have no particular expertise, interest,
> or time to vet translators and review translations.
>
> A proposal to make turtle a package and put translations of turtle
> commands in a submodule got no traction.  "Put it on Pypi"
>
> I have rejected proposals to put translations of IDLE's menus in an
> idlelib subdirectory, to be distributed with cpython as 'part' of the
> stdlib.  I am thinking about various ideas to allow users to customize
> the menu, either by editing a file or processing a download.  (For
> instance, the Japanese doc translation includes the IDLE chapter,
> which has a list of menu items and descriptions.)  But this is a
> different issue.
>
> The repository for unicode description translations should also
> live outside the cpython repository.

I have made the following collection of scripts
https://github.com/OpenTaal/python-unicodedata_l10n as a start for
offering l10n support. At the moment, there are only a few languages
with a wide coverage of translations, but that is progressing slowly at
https://unicode-table.com/

The scripts generate PO and MO files. To package and publish this on
e.g. PyPI, I'm looking for someone who has more experience in that
area.
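
As a rough illustration of how the generated MO files could be consumed,
here is a minimal sketch using the standard gettext module. The domain
name "unicodedata" and the directory "locale" are assumptions made for
this example only; they are not confirmed to match the layout the
scripts produce. The helper name localized_name is hypothetical as well.

```python
import gettext
import unicodedata


def localized_name(char, lang, localedir="locale", domain="unicodedata"):
    """Return a translated Unicode character name, or the English one.

    The localedir and domain defaults are hypothetical; adjust them to
    wherever the MO files are actually installed.
    """
    name = unicodedata.name(char, None)
    if name is None:
        # Unnamed code points (e.g. control characters) have no description.
        return None
    # fallback=True yields a NullTranslations object when no MO file is
    # found, so the lookup degrades to the official English name.
    catalog = gettext.translation(
        domain, localedir=localedir, languages=[lang], fallback=True
    )
    return catalog.gettext(name)


print(localized_name("a", "nl"))
```

If no catalog for the requested language is present, the function simply
returns the official English name, so callers never have to handle a
missing translation themselves.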

>
>> On Tue, Jul 10, 2018, 5:11 PM Terry Reedy <tjreedy at udel.edu
>> <mailto:tjreedy at udel.edu>> wrote:
>>
>>     On 7/10/2018 4:45 AM, Pander wrote:
>>
>>      > This is a third party initiative. The translations are
>>      > contributed by volunteers. I have talked with Python core
>>      > developers and they suggested to post this here, as it is for
>>      > them out of scope for Python std lib.
>>
>>     Python-ideas list is for discussion of python and the stdlib.
>>     This is not a place for prolonged discussion of pypi projects.
>>     It *is* a place to discuss adding a hook that can be used to access
>>     translations.
>>
>>     There are both official doc translations, accessible from the
>>     official doc pages, and others that are independent.  The official
>>     ones, at least, are discussed on the doc-sig list
>>     https://mail.python.org/mailman/listinfo/doc-sig
>>     There are currently 7 languages and coordinators listed at
>>     https://devguide.python.org/experts/#documentation-translations
>>     4 have progressed far enough to be listed in the drop-down box on
>>     https://docs.python.org/3/
>>
>>     I should think that these people should be asked if they want to be
>>     involved with unicode description translations.  They should
>>     certainly have some helpful advice.
>>
>>     The description vocabulary is rather restricted, so a word
>>     translation dictionary should be pretty easy.  For at least some
>>     languages, it should be possible to generate the 200000 description
>>     translations from this. The main issues are word order and
>>     language-dependent 'word' units.  Hence, the three English words
>>     "LATIN SMALL LETTER" become two words in German, 'LATEINISCHER
>>     KLEINBUCHSTABE', versus three words in Spanish, but in reverse
>>     order, 'LETRA PEQUEÑA LATINA'.  It is possible that the doc
>>     translators already use translation software that deals with
>>     these issues.
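
The dictionary idea quoted above could be sketched roughly as follows.
This is only an illustration, not what the scripts in the repository do:
it maps whole phrases rather than single words, which sidesteps the word
order and word unit issues for the phrases it knows. The tables contain
just the two examples from this thread, and the name translate_name is
hypothetical.

```python
# Phrase tables holding only the examples mentioned in this thread.
GERMAN = {"LATIN SMALL LETTER": "LATEINISCHER KLEINBUCHSTABE"}
SPANISH = {"LATIN SMALL LETTER": "LETRA PEQUEÑA LATINA"}


def translate_name(name, phrases):
    """Translate a character name by greedy longest-phrase matching.

    Words that are not covered by any phrase are passed through as-is.
    """
    words = name.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest run of words starting at position i first.
        for j in range(len(words), i, -1):
            chunk = " ".join(words[i:j])
            if chunk in phrases:
                out.append(phrases[chunk])
                i = j
                break
        else:
            # No phrase starts here; keep the original word.
            out.append(words[i])
            i += 1
    return " ".join(out)


print(translate_name("LATIN SMALL LETTER A", GERMAN))
print(translate_name("LATIN SMALL LETTER A", SPANISH))
```

A real table would need many more phrases per language, and some
languages may need rules beyond fixed phrase substitution.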