
Hi,
My wife is trying to set up a list for a small group of mainly Hebrew speakers. We've configured it to make the Hebrew locale available, and this seems to work. (Apparently my wife doesn't think much of the Hebrew translation, but that's another issue)
The problem is that when she tried entering in Hebrew descriptions and welcome messages, she got text like this:
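&#1511;&#1489;&#1493;&#1510;&#1514; &#1506;&#1489;&#1493;&#1491;&#1492; &#1493;&#1504;&#1497;&#1492;&#1493;&#1500;
- &#1499;&#1500;&#1489;&#1497; &#1504;&#1495;&#1497;&#1497;&#1492;.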
I'm guessing this is an issue with how UTF-8 characters typed into the Mailman form are handled.
Is there anything I can do about this, or is this a Mailman or browser bug?
Browser is IE8. Mailman is 2.1.11 running on Debian 5.0.
Note that the default list language was set to English, if that makes any difference. Would the default language need to be set to Hebrew for these fields to be stored correctly when the form is submitted?
Thanks, Geoff.

Geoff Shang wrote:
The problem is that when she tried entering in Hebrew descriptions and welcome messages, she got text like this:
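&#1511;&#1489;&#1493;&#1510;&#1514; &#1506;&#1489;&#1493;&#1491;&#1492; &#1493;&#1504;&#1497;&#1492;&#1493;&#1500;
- &#1499;&#1500;&#1489;&#1497; &#1504;&#1495;&#1497;&#1497;&#1492;.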
I'm guessing this is an issue with how UTF-8 characters typed into the Mailman form are handled.
Is there anything I can do about this, or is this a Mailman or browser bug?
Browser is IE8. Mailman is 2.1.11 running on Debian 5.0.
Note that the default list language was set to English, if that makes any difference. Would the default language need to be set to Hebrew for these fields to be stored correctly when the form is submitted?
Either that or set the character set for English to UTF-8 if you/she has that ability. It requires putting
add_language('en', 'English (USA)', 'utf-8', 'ltr')
in mm_cfg.py and restarting Mailman.
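To illustrate why the character set matters, here is a minimal sketch in plain Python 3. It is not Mailman's own code (Mailman 2.1 runs on Python 2), and the helper name `to_charset` is made up for this example. It assumes the Hebrew text gets forced into us-ascii somewhere between the browser and the list configuration; characters that us-ascii cannot represent then survive only as numeric character references, while utf-8 holds them directly.

```python
def to_charset(text, charset):
    """Encode text; characters the charset lacks become &#NNNN; references."""
    return text.encode(charset, errors="xmlcharrefreplace")


# The first Hebrew word from the list description, written as escapes
# so the example does not depend on the encoding of this file.
hebrew = "\u05e7\u05d1\u05d5\u05e6\u05ea"

# us-ascii has no Hebrew letters, so each one turns into a reference.
print(to_charset(hebrew, "us-ascii"))  # b'&#1511;&#1489;&#1493;&#1510;&#1514;'

# utf-8 can represent the letters, so the text round-trips unchanged.
print(to_charset(hebrew, "utf-8").decode("utf-8") == hebrew)  # True
```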
There is another possibility. In recent versions prior to 2.1.13, HTML entities like &lt; or &#1511; were escaped as &amp;lt; or &amp;#1511; for the web interface. If this is the issue, the underlying data are correct; they just display incorrectly in the web admin interface.
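The double-escaping effect can be sketched with Python 3's standard `html` module. This only demonstrates the general mechanism and is not taken from Mailman's source: a value that already contains a character reference gets its ampersand escaped a second time, so the browser shows the reference literally instead of the character.

```python
import html

# A stored value that already holds a numeric character reference
# (1511 is the code point of the Hebrew letter qof).
stored = "&#1511;"

# Escaping it again turns "&" into "&amp;", so a browser would render
# the literal text "&#1511;" rather than the letter.
escaped_again = html.escape(stored)
print(escaped_again)  # &amp;#1511;

# The stored value itself is still fine: resolving the reference once
# yields the letter.
print(html.unescape(stored) == "\u05e7")  # True
```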
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

On Tue, 17 Aug 2010, Mark Sapiro wrote:
Geoff Shang wrote:
The problem is that when she tried entering in Hebrew descriptions and welcome messages, she got text like this:
&#1511;&#1489;&#1493;&#1510;&#1514;
[snip]
Note that the default list language was set to English, if that makes any difference. Would the default language need to be set to Hebrew for these fields to be stored correctly when the form is submitted?
Either that or set the character set for English to UTF-8 if you/she has that ability. It requires putting
add_language('en', 'English (USA)', 'utf-8', 'ltr')
in mm_cfg.py and restarting Mailman.
I can do this, as I am an admin on the system where this is hosted.
I see now that there are entries like this in Defaults.py, so the entry in mm_cfg.py will override the default, as it's meant to.
Maybe a stupid question, but why not have it utf-8 for all of them? I mean, isn't that the whole point of unicode, that it just works everywhere? Or does it need to match the character set of the messages/templates?
There is another possibility. In recent versions prior to 2.1.13, HTML entities like &lt; or &#1511; were escaped as &amp;lt; or &amp;#1511; for the web interface. If this is the issue, the underlying data are correct; they just display incorrectly in the web admin interface.
No, this was wrong in email messages too. But the default English character set is us-ascii, so that's hardly surprising.
All this has led me to look at I18N, as it's not an area I've looked at before in Mailman. And I've got a couple of questions:
There are add_language lines for every supported language in Defaults.py, but only languages located in /etc/mailman/<languagecode> are actually available. Does Mailman check for them here, or is there some other Mailman glue that says which languages are actually available for use?
Further to this, I can only find the template files and not the messages file. Where should these be located on a Debian system?
Thanks, Geoff.

Geoff Shang wrote:
Maybe a stupid question, but why not have it utf-8 for all of them? I mean, isn't that the whole point of unicode, that it just works everywhere? Or does it need to match the character set of the messages/templates?
Yes, it does need to match the character set of the messages and templates. It is still us-ascii for English just because of superstition. The character set for the translations is/was chosen by the translator, in many cases before unicode/utf-8 became generally accepted.
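As a rough illustration of why the configured character set has to match the files, the following Python 3 sketch checks whether some bytes decode cleanly under a given charset. The helper `decodes_as` and the sample bytes are hypothetical; the sample stands in for a template written in a legacy Hebrew encoding and is not taken from any real Mailman file.

```python
def decodes_as(data, charset):
    """Return True if the bytes decode without error in the given charset."""
    try:
        data.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical template content: a Hebrew word stored as iso-8859-8 bytes.
template_bytes = "\u05e9\u05dc\u05d5\u05dd".encode("iso-8859-8")

print(decodes_as(template_bytes, "iso-8859-8"))  # True
print(decodes_as(template_bytes, "utf-8"))       # False
```

A clean decode is only a necessary condition: some single-byte charsets accept any byte sequence, so a True result does not prove that the charset is the right one.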
All this has led me to look at I18N, as it's not an area I've looked at before in Mailman. And I've got a couple of questions:
- There are add_language lines for every supported language in Defaults.py, but only languages located in /etc/mailman/<languagecode> are actually available. Does Mailman check for them here, or is there some other Mailman glue that says which languages are actually available for use?
This is all Debian packaging, possibly based on never released upstream functionality for installing only selected languages.
In standard Mailman 2.1.x, all supported languages are available. The message catalogs are in $prefix/messages/<lang-code>/LC_MESSAGES/mailman.(po|mo) and the templates are in $prefix/templates/<lang-code>/*.
- Further to this, I can only find the template files and not the messages file. Where should these be located on a Debian system?
Look in Defaults.py/mm_cfg.py for MESSAGES_DIR.
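Once the value of MESSAGES_DIR is known, a small script can show which languages actually have a compiled catalog installed. This is only a sketch in Python 3, not part of Mailman: the function name `languages_with_catalog` is invented here, and it assumes the usual gettext layout of <lang-code>/LC_MESSAGES/mailman.mo under that directory.

```python
import os


def languages_with_catalog(messages_dir):
    """List language codes that have a compiled mailman.mo catalog."""
    found = []
    for lang in sorted(os.listdir(messages_dir)):
        # Assumed layout: <messages_dir>/<lang>/LC_MESSAGES/mailman.mo
        catalog = os.path.join(messages_dir, lang, "LC_MESSAGES", "mailman.mo")
        if os.path.isfile(catalog):
            found.append(lang)
    return found


# Usage: pass whatever MESSAGES_DIR is set to on the system, e.g.
# print(languages_with_catalog(messages_dir_from_config))
```

Directories that lack a compiled catalog are skipped, which is one way to spot languages a distribution package has left out.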
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan