[Mailman-Users] Archives and International Character Sets (Mailman2.1.11)
mark at msapiro.net
Fri Mar 27 21:09:09 CET 2009
Drew Tenenholz wrote:
>The list is an announce-only type in Russian (Cyrillic), but the
>default language is set to English (so I can read the admin pages and
>complete the necessary tasks). As I believe Mark mentioned before,
>this means that the messages themselves (sent by the Russian
>Moderator team using Outlook Express or webmail and either
>windows-1251 or KOI8-R encodings) arrive at Mailman and are
>distributed in email with their original encodings. However, the
>mailman archive in this configuration seems to save the messages as
>HTML entity codes which display fine in the Mailman archive as single
>messages, but are unreadable once they get to the monthly archive
By monthly archive, I assume you mean the .txt and/or .txt.gz files. Is
>1) What can be done to get the monthly archive in a readable format?
Either set the list's preferred language to Russian (and navigate
through the admin pages by position), or set Mailman's character set
for English to UTF-8 by putting the following line in mm_cfg.py:
add_language('en', 'English (USA)', 'utf-8', 'ltr')
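To illustrate why the charset setting matters, here is a rough sketch in plain Python 3. It is not Mailman's archiver code, and the helper name and sample string are made up for the example. It assumes the list language's charset is one that cannot represent Cyrillic (us-ascii, which I believe is the stock setting for English), in which case unencodable characters are written as numeric character references, whereas utf-8 holds the text directly.

```python
# Illustration only: this is not Mailman's archiver code.
# to_archive_bytes is a hypothetical helper for this example.

def to_archive_bytes(text, charset):
    """Encode text in the given charset, substituting numeric
    character references for anything the charset cannot represent."""
    return text.encode(charset, errors="xmlcharrefreplace")

sample = "Привет"  # made-up Cyrillic sample text

# us-ascii cannot represent Cyrillic, so every letter becomes an entity.
as_ascii = to_archive_bytes(sample, "us-ascii")

# utf-8 can represent the text directly, so no entities are needed.
as_utf8 = to_archive_bytes(sample, "utf-8")

print(as_ascii)
print(as_utf8.decode("utf-8"))
```

A browser renders the entity form as Cyrillic in the HTML pages, but in a plain .txt file the same bytes show up as literal &#NNNN; sequences.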
>2) Is there any way to correct the existing monthly archives?
The messages in the cumulative
archives/private/LISTNAME.mbox/LISTNAME.mbox file are all in their
original charset and encoding, so if you do 1), you can then rebuild
the archive with bin/arch --wipe and that will rebuild the .txt files
with the new charset.
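Before rebuilding, it may be worth confirming which charsets the messages in the cumulative mbox actually declare. A minimal sketch using Python 3's standard mailbox module follows. The function name is my own, and the script is independent of Mailman itself.

```python
import mailbox
from collections import Counter

def count_charsets(mbox_path):
    """Count the charset declared on the top-level Content-Type
    header of each message in an mbox file."""
    counts = Counter()
    # create=False: raise an error instead of creating a missing file
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # get_content_charset() returns None when no charset is
            # declared at the top level (e.g. multipart messages)
            counts[msg.get_content_charset() or "(none declared)"] += 1
    finally:
        box.close()
    return counts

# Example call, with LISTNAME replaced by the real list name:
# print(count_charsets("archives/private/LISTNAME.mbox/LISTNAME.mbox"))
```

This only inspects the top-level header of each message. Multipart messages declare their charsets on the individual parts, so they are counted as "(none declared)" here.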
One thing to be aware of though is that although the monthly .txt files
look like .mbox files, they don't contain complete message headers. In
particular, even though the character set may now be utf-8 or koi8-r,
there are no content-type or other headers in the file to so indicate.
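In practice this means that anything reading a monthly .txt file has to be told the charset out of band. A small Python 3 sketch of that idea follows. The function name is hypothetical, and the charset passed in is whatever the list's language charset was when the archive was built.

```python
def read_archive_text(path, charset):
    """Decode a monthly .txt archive using a charset supplied by
    the caller, since the file itself does not declare one."""
    with open(path, "rb") as f:
        raw = f.read()
    # errors="replace" substitutes U+FFFD for undecodable bytes
    # instead of raising, so a wrong guess yields visibly bad text
    return raw.decode(charset, errors="replace")

# Example call (the file name here is only an illustration):
# text = read_archive_text("2009-March.txt", "utf-8")
```

If the decoded text is full of replacement characters, the charset passed in is probably not the one the file was written in.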
Mark Sapiro <mark at msapiro.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan