
Hi all,

I have (so far) been unable to show correctly Danish characters in the names of the list members. Let me give 2 examples.

1. Say I create a list 'test' and subscribe a member like here:

    newlist -l da test kaja@cs.au.dk kaja
    echo "Kaja Jørgensen <kaja@cs.au.dk>" | add_members -r - -a n -w n test

to which Mailman responds with:

    Tilmeldt: Kaja Jørgensen <kaja@cs.au.dk>

So far so good. Now I ask for the membership list with

    list_members -f test

and receive back:

    Kaja J?rgensen <kaja@cs.au.dk>

2. I edit the membership list from the web page:

- go to Membership List
- replace member name Kai Nielsen with Kai Jørgensen Nielsen (looks good)
- submit changes
- go to another web membership page, then back to one with 'Kai...' and see: Kai Jørgensen Birger Nielsen

Relevant settings in .../mailman/Mailman/mm_cfg.py are:

    # Charset and Encoding
    DEFAULT_CHARSET = 'iso-8859-1'
    VERBATIM_ENCODING = ['iso-8859-1']

We've been using Mailman for years. We are currently running version 2.1.11 on Linux (RHEL 5.3).

Could you help? Perhaps point me to a previous thread on this list which provides a solution?

Regards
Kaja

Kaja P. Christiansen wrote:
I have (so far) been unable to show correctly Danish characters in the names of the list members. Let me give 2 examples.
1. Say I create a list 'test' and subscribe a member like here:

    newlist -l da test kaja@cs.au.dk kaja
    echo "Kaja Jørgensen <kaja@cs.au.dk>" | add_members -r - -a n -w n test

to which Mailman responds with:

    Tilmeldt: Kaja Jørgensen <kaja@cs.au.dk>

So far so good. Now I ask for the membership list with

    list_members -f test

and receive back:

    Kaja J?rgensen <kaja@cs.au.dk>
The name 'Kaja Jørgensen' is stored internally as a Python unicode object. list_members encodes this for display using the encoding given by Python's sys.getdefaultencoding(), which in turn defaults to "ascii".

If you want list_members to show non-ASCII characters as something other than '?', you have two choices. You can edit the definition of setencoding() in /usr/lib/pythonV.V/site.py to replace "ascii" with "iso-8859-1", or you can edit Mailman's bin/list_members and replace the line

    ENC = sys.getdefaultencoding()

with

    ENC = "iso-8859-1"
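The '?' substitution can be reproduced outside Mailman. What follows is a minimal sketch, not the actual code in bin/list_members: the helper name encode_for_display is made up for illustration, and it assumes the encode call uses the "replace" error handler, which is what the '?' in the output suggests.

```python
# The member name as a unicode object; \xf8 is the character 'ø'.
name = u"Kaja J\xf8rgensen"


def encode_for_display(text, encoding):
    """Hypothetical helper: encode text, substituting '?' for any
    character the target encoding cannot represent."""
    return text.encode(encoding, "replace")


# With the default "ascii" encoding the 'ø' cannot be represented.
ascii_out = encode_for_display(name, "ascii")

# With "iso-8859-1" the 'ø' survives as the single byte 0xF8.
latin1_out = encode_for_display(name, "iso-8859-1")

print(repr(ascii_out))
print(repr(latin1_out))
```

Setting ENC to "iso-8859-1" corresponds to the second call: the character is encoded rather than replaced, provided the terminal also expects iso-8859-1.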
2. I edit the membership list from the web page:

- go to Membership List
- replace member name Kai Nielsen with Kai Jørgensen Nielsen (looks good)
- submit changes
- go to another web membership page, then back to one with 'Kai...' and see: Kai Jørgensen Birger Nielsen
Again, the name is properly stored internally. This display is due to over-protection of the web interface from cross-site-scripting attacks. The attached file escape_html.patch.txt contains a patch that will allow these names to display properly in the admin Membership List.

--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
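The thread does not include the patch itself, so the following is only an illustration of one general way that escaping can go too far, not a description of what Mailman 2.1.11 does. It assumes the name is first converted to numeric HTML character references and then HTML-escaped a second time; both helper names are hypothetical.

```python
def to_charrefs(text):
    """Hypothetical helper: replace non-ASCII characters with numeric
    HTML character references such as &#248;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def escape_html(text):
    """Hypothetical helper: naive HTML escaping of &, < and >."""
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


name = u"Kai J\xf8rgensen Nielsen"

# Step 1: the browser would render this correctly as 'ø'.
refs = to_charrefs(name)

# Step 2: escaping again turns '&' into '&amp;', so the browser now
# shows the character reference as literal text.
double = escape_html(refs)

print(refs)
print(double)
```

Under that assumption, a fix has to escape the raw name once and leave already generated character references alone.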

Mark Sapiro wrote:
You can edit the definition of setencoding() in /usr/lib/pythonV.V/site.py to replace "ascii" with "iso-8859-1", ...
Actually, the above is not the proper way to do this. The proper way is to create /usr/lib/pythonV.V/site-packages/sitecustomize.py containing the two lines

    import sys
    sys.setdefaultencoding('iso-8859-1')

--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

Hi Mark,

Thanks much for the escape_html.patch.txt and the advice below: they have solved the problem with Danish characters in list_members (also sync_members!) and web membership page.

This brings me to another question. When I define DEFAULT_CHARSET in the mm_cfg.py, and mm_cfg.py is imported in other modules, the encoding is actually known to Mailman. Would it be possible to use this value instead of patching Python?

Kaja

Mark Sapiro wrote:
You can edit the definition of setencoding() in /usr/lib/pythonV.V/site.py to replace "ascii" with "iso-8859-1", ...
Actually, the above is not the proper way to do this. The proper way is to create /usr/lib/pythonV.V/site-packages/sitecustomize.py containing the two lines
    import sys
    sys.setdefaultencoding('iso-8859-1')

Kaja P. Christiansen wrote:
This brings me to another question. When I define DEFAULT_CHARSET in the mm_cfg.py, and mm_cfg.py is imported in other modules, the encoding is actually known to Mailman. Would it be possible to use this value instead of patching Python?
DEFAULT_CHARSET was used only by the pipermail archiver and hasn't actually been used by anything in Mailman since before Mailman 2.1.0.

As for changing the encoding for the command line tools that currently use Python's getdefaultencoding(), they should probably be using something like locale.getdefaultlocale()[1] instead. This is something we should look into for Mailman 2.2/3.0.

--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
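A rough sketch of what the locale-based approach could look like follows. It is not code from Mailman: the function name tool_encoding and the fallback behaviour are assumptions made for illustration. Note also that locale.getdefaultlocale() is deprecated in recent Python 3 releases, so the sketch guards against it being unavailable.

```python
import locale


def tool_encoding(fallback="ascii"):
    """Hypothetical helper: pick an output encoding from the user's
    locale settings, falling back when none can be determined."""
    try:
        # getdefaultlocale() returns a (language code, encoding) tuple;
        # either element may be None.
        encoding = locale.getdefaultlocale()[1]
    except (AttributeError, ValueError):
        # AttributeError: function not available in this Python.
        # ValueError: the locale settings could not be parsed.
        encoding = None
    return encoding or fallback


print(tool_encoding())
```

With a Danish iso-8859-1 locale set in the environment, a helper like this would be expected to return an iso-8859-1 encoding name without any change to Python's own default encoding.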