[Mailman-Users] UTF-8 question

Mark Sapiro mark at msapiro.net
Thu Jan 10 22:22:37 CET 2008


Eva Isaksson wrote:
>
>The changes made for utf-8 included:
>- changing the template files and the Finnish translation file
>into utf-8 with iconv.
>- fixing the mm_cfg.py by making DEFAULT_CHARSET = 'UTF-8'
>and copying the relevant litany of LC_DESCRIPTIONS from
>Defaults.py into mm_cfg.py and setting Finnish into utf-8.


You really only needed to copy the one add_language() call that you
changed (leaving off the _() around the name), as in:

add_language('fi', 'Finnish', 'utf-8')
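
To illustrate why the single call is enough, here is a minimal sketch,
assuming add_language() simply stores its entry in the LC_DESCRIPTIONS
dictionary keyed by language code. The function body and the two "default"
entries below are simplified stand-ins written for this example, not a copy
of Defaults.py.

```python
# Simplified stand-in for the relevant part of Defaults.py (assumption).
LC_DESCRIPTIONS = {}


def add_language(code, description, charset):
    # A later call with the same code replaces the earlier entry.
    LC_DESCRIPTIONS[code] = (description, charset)


# Entries assumed to be registered already by the defaults.
add_language('en', 'English (USA)', 'us-ascii')
add_language('fi', 'Finnish', 'iso-8859-1')

# The one line mm_cfg.py would need after "from Defaults import *".
add_language('fi', 'Finnish', 'utf-8')

print(LC_DESCRIPTIONS['fi'])
```

Under that assumption, only the 'fi' entry changes and every other language
keeps the value it got from the defaults.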

>Now, the problem that keeps puzzling me:
>
>1. Before the change, mails with charset=iso-8859-1 and
>charset=utf-8 were being distributed with the charset
>untouched. I figured that this is how it should be - right?


Yes, but not necessarily in all cases.


>2. As our server hosts a lot of lists (almost 400 of them)
>I decided to try utf-8 out on a smaller scale first, on an 
>Ubuntu server, running only a couple of lists with its standard 
>mailman package, version 2.1.5. The utf-8 change was a success. 
>The webpages and archive were all okay, and the charset of 
>mails was untouched too.
>
>3. As things looked promising, I decided to proceed with our
>real list server. The result: 
>Web pages, archive, all okay, now in utf-8.
>Mails... all of them in utf-8. And I mean ALL mails.
>
>My question: was this to be expected? Is everything meant
>to be in utf-8 from now on, including the forcing of
>charset=utf-8 into all list mail headers? And why didn't this
>happen with either the iso-8859-1 settings or the
>Ubuntu server?


I'm not certain about all of this, but there are places where the
character set of a message can be coerced, including Scrubber (which
removes attachments and flattens a message to plain text) and the step
that adds msg_header and msg_footer.
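
As a rough illustration of how adding a footer could change a message's
charset, here is a sketch using Python's standard email package. It is not
Mailman's code: append_footer() is a hypothetical helper, and the order in
which it tries charsets (the list's charset first, then the message's own)
is an assumption made for the example.

```python
from email.message import Message


def append_footer(msg, footer, list_charset):
    """Append footer text to a single-part text message and return the
    charset finally used. Hypothetical helper, not Mailman's own code."""
    msg_charset = msg.get_content_charset() or 'us-ascii'
    body = msg.get_payload(decode=True).decode(msg_charset)
    combined = body + footer
    # Assumed order: prefer the list's charset, fall back to the message's.
    for charset in (list_charset, msg_charset):
        try:
            combined.encode(charset)
        except UnicodeError:
            continue
        # Drop the old encoding header so set_payload() picks a fresh one.
        del msg['Content-Transfer-Encoding']
        msg.set_payload(combined, charset)
        return charset
    raise ValueError('text fits neither the list nor the message charset')


# A Latin-1 message posted to a list whose language charset is utf-8.
post = Message()
post.set_payload('Hyvää päivää\n', 'iso-8859-1')
used = append_footer(post, '\n-- \nfooter text\n', 'utf-8')
print(used, post['Content-Type'])
```

With an order like this, a utf-8 list charset can represent any text, so the
fallback never triggers and every decorated message ends up as utf-8. With
an iso-8859-1 list charset, a utf-8 message containing characters outside
Latin-1 would keep its own charset.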

I did a very simple test and did not see the problem.

Can you post an example of a test message as sent to a list and the
corresponding message as received from the list with the character set
coerced to UTF-8?
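
If it helps when collecting such an example, the following sketch reports
the content type, charset and transfer encoding of each part of a raw
message, using only Python's standard email package. describe_parts() is a
name invented here, and the two sample messages are made up for
illustration; real input would be the saved raw messages.

```python
import email


def describe_parts(raw):
    """Return (content type, charset, transfer encoding) for each leaf part."""
    msg = email.message_from_bytes(raw)
    return [
        (part.get_content_type(),
         part.get_content_charset(),
         part.get('Content-Transfer-Encoding'))
        for part in msg.walk()
        if not part.is_multipart()
    ]


# Made-up stand-ins for a message as sent and as received from the list.
sent = (b'Content-Type: text/plain; charset=iso-8859-1\n'
        b'Content-Transfer-Encoding: 8bit\n\nHyv\xe4\xe4 p\xe4iv\xe4\xe4\n')
received = (b'Content-Type: text/plain; charset="utf-8"\n'
            b'Content-Transfer-Encoding: base64\n\nSHl2w6TDpCBww6RpdsOkw6QK\n')

for label, raw in (('sent', sent), ('received', received)):
    print(label, describe_parts(raw))
```

Comparing the two reports side by side shows whether the charset or the
transfer encoding changed on the way through the list.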


-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


