[Mailman-Developers] 2.1b5 pipermail archives assume all posts are in list language charset?

Ben Gertzfield che@debian.org
Thu Nov 21 17:25:12 2002

Barry, i18n folks,

I installed 2.1b5 to see how the status of pipermail archives and 
multilanguage posts was coming along.

It's starting to look good on the main index page.  Here's a test I made 
with Japanese, Chinese, and German all on the same page (note: I don't 
know Chinese or German so I just made up something that looked good for 
the test ;)  The list is configured to allow all languages, but its 
charset is set to us-ascii.


(Ignore the duplicate Chinese test).  The multilingual Subject lines are 
all converted to Unicode entities perfectly.

But if you click on the second post (posted in iso-2022-jp), you can see 
that the HTML specifies us-ascii as the character set:


resulting in major mojibake (FYI, Barry, "mojibake" is the Japanese word 
for "messed up characters", perfectly applicable here since there's no 
word for that in English).

Where is pipermail getting the us-ascii from?  I assume it's from the 
list config being set to English as the default language.  But shouldn't 
it get the charset of the post from the post itself?
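
For illustration, here is a minimal sketch of reading the charset from the 
post itself and falling back to the list default only when the post 
declares none.  It uses the Python 3 standard-library email package, not 
pipermail's actual code; post_charset and list_default are hypothetical 
names.

```python
from email import message_from_bytes


def post_charset(raw_bytes, list_default="us-ascii"):
    """Return the charset declared by the post, else the list default.

    Hypothetical helper for illustration; not part of Mailman or pipermail.
    """
    msg = message_from_bytes(raw_bytes)
    # walk() yields the message itself first, then any MIME subparts,
    # so this covers both simple and multipart posts.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # get_content_charset() returns the lowercased charset
            # parameter, or the fallback if the part declares none.
            return part.get_content_charset(list_default)
    return list_default
```

With something like this, a post sent as iso-2022-jp would be archived with 
that charset in its HTML page even when the list itself is set to us-ascii.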

Interestingly, the Subject on the above url is Unicode escaped, but the 
title of the page is:

 <title> [Mailman]  =?iso-2022-jp?b?GyRCRnxLXDhsJUYlOSVIGyhC?=

Literally, in base64!  Barry, you want I should work on this some this 
week, or do you have time to look at it?
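
As a rough sketch of the fix for the title, the same decoding that the 
Subject line gets could be applied before the text goes into <title>: 
decode the RFC 2047 encoded words, then emit anything non-ASCII as numeric 
entities so the result is safe whatever charset the page declares.  This 
uses the Python 3 standard library; header_to_html is a hypothetical name, 
not a pipermail function.

```python
from email.header import decode_header
from html import escape


def header_to_html(value, fallback="us-ascii"):
    """Decode an RFC 2047 header and return pure-ASCII HTML text.

    Hypothetical helper for illustration; not part of Mailman or pipermail.
    """
    chunks = []
    for data, charset in decode_header(value):
        # decode_header() returns bytes for encoded words (and for plain
        # runs that sit next to them), but str for a fully plain header.
        if isinstance(data, bytes):
            data = data.decode(charset or fallback, "replace")
        chunks.append(data)
    text = "".join(chunks)
    # Escape HTML metacharacters, then turn every non-ASCII character
    # into a numeric character reference such as &#26085;.
    return escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Fed the title string quoted above, this should yield entities for the 
decoded Japanese text (which reads "日本語テスト") instead of the raw base64.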


