From for-mm-at-python at rplab.ru Sun Jul 2 10:25:20 2017
From: for-mm-at-python at rplab.ru (Sergey Maslennikov)
Date: Sun, 02 Jul 2017 23:25:20 +0900
Subject: [Mailman-i18n] How can we switch mailman to KOI8-R charset for Russian?
In-Reply-To: <09d9db2c-d938-a44a-45e5-0d5a592295ff@msapiro.net>
References: <1498382149.2779.30.camel@sn-y510p>
 <09d9db2c-d938-a44a-45e5-0d5a592295ff@msapiro.net>
Message-ID: <1499005520.2821.14.camel@sn-y510p>

In general it works. Thank you very much. However, we did not switch
Mailman to KOI8-R; instead we added KOI8-R as an additional charset for
Russian. A description follows.

On Sun, 2017-06-25 at 09:18 -0700, Mark Sapiro wrote:

1) add this line to mm_cfg.py
...
> LC_DESCRIPTIONS['ru'] = ('Russian', 'koi8-r', 'ltr')

In principle, we may want to run multilingual (UTF-8 encoded) lists on
our machines. Therefore we did not replace the Russian language.
Instead we added a new one:

    LC_DESCRIPTIONS['ru_koi8'] = ('Russian-KOI8', 'koi8-r', 'ltr')

KOI8-R is an English/Russian charset, and we usually mark both
languages as available options for a mailing list. The list description
may be KOI8-R-encoded while the default charset for English is UTF-8.
In such a case, if somebody chooses "English" as the preferred language
at http(s)://lists./listinfo/, the list description becomes unreadable.
Therefore we

a) added one more language:

    LC_DESCRIPTIONS['en_koi8'] = ('English-KOI8', 'koi8-r', 'ltr')

b) created a link in templates:

    ln -s en en_koi8

> 2) recode messages/ru/LC_MESSAGES/mailman.po with 'iconv f=utf-8
> t=koi8-r' and run 'msgfmt -o messages/ru/LC_MESSAGES/mailman.mo
> messages/ru/LC_MESSAGES/mailman.po' to recompile the message catalog.

We copied messages/ru to messages/ru_koi8 and then, as you wrote,
recoded messages/ru_koi8/LC_MESSAGES/mailman.po with iconv and
recompiled it with msgfmt.

> 3) recode all the templates in templates/ru

We recoded them into templates/ru_koi8.

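
For anyone who prefers a script to running iconv file by file, the
recoding in steps 2) and 3) can be sketched roughly as below. This is
only an illustration, not what we actually ran: the function name
recode_tree is made up, the paths are passed in by the caller, and it
assumes a standalone Python 3 interpreter rather than Mailman's own.
The message catalog still has to be recompiled with msgfmt afterwards.

```python
import os


def recode_tree(src_dir, dst_dir, src_charset="utf-8", dst_charset="koi8-r"):
    """Copy every file under src_dir into dst_dir, converting its charset.

    Hypothetical helper standing in for a per-file iconv run.
    Returns the list of files written.
    """
    written = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        out_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(out_root, exist_ok=True)
        for name in sorted(files):
            with open(os.path.join(root, name), "rb") as f:
                text = f.read().decode(src_charset)
            # Strict encoding: raises UnicodeEncodeError if a character
            # has no KOI8-R equivalent, instead of silently mangling it.
            data = text.encode(dst_charset)
            out_path = os.path.join(out_root, name)
            with open(out_path, "wb") as f:
                f.write(data)
            written.append(out_path)
    return written
```

Usage would be along the lines of recode_tree('templates/ru',
'templates/ru_koi8'). Failing on unrepresentable characters is a
deliberate choice here, so that such characters get fixed by hand in
the source file.
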
Because Mailman/Cgi/private.py told browsers to expect UTF-8 text (so
KOI8-R Russian text was unreadable), we corrected that script to
announce the charset of the list's preferred language instead.

Mailman/Cgi/listinfo.py and Mailman/Cgi/admin.py each contain an
"overview" function that lists the mailing lists and their
descriptions. Those functions use the charset of the default server
language, while the charsets of the descriptions may differ. For
instance, if a description is written in Russian and encoded in KOI8-R,
and the charset of the default server language is UTF-8, then the
description is unreadable. Therefore we changed those scripts to recode
descriptions into the charset of the default server language whenever
mailing list language charset != document (default server) language
charset.

This approach may be useful for those who want to use single-byte
charsets (usually bilingual: English plus another language). In
particular it may be useful for Russian users, because the information
density of Russian (information per character) is a bit lower than
that of English [1] while each Russian character takes twice as much
memory in UTF-8 as in KOI8-R, and because of the extra computation
when data are compressed, copied, stored, etc.

[1] http://www.kwintessential.co.uk/blog/translation/translation-text-expansion-how-it-affects-design

Sergey Maslennikov
Moscow
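
P.S. The idea behind the change to the "overview" functions can be
sketched as follows. This is not the actual patch: recode_description
and its parameters are hypothetical names, and how the two charsets
are obtained inside Mailman is left out. It only shows the rule
"recode when the list charset differs from the document charset".

```python
import codecs


def recode_description(raw, list_charset, doc_charset):
    """Return the description bytes converted to doc_charset.

    raw          -- description bytes as stored for the list
    list_charset -- charset of the list's preferred language
    doc_charset  -- charset of the page (default server language)
    """
    # Normalise aliases such as 'KOI8-R' / 'koi8_r' before comparing.
    same = codecs.lookup(list_charset).name == codecs.lookup(doc_charset).name
    if same:
        return raw
    # 'replace' keeps the page renderable even if a byte or character
    # cannot be converted; it is substituted rather than raising.
    text = raw.decode(list_charset, "replace")
    return text.encode(doc_charset, "replace")
```

With a KOI8-R description and a UTF-8 page, the bytes are converted to
UTF-8; when both charsets are the same, the description is returned
untouched.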