[Mailman-Developers] Patch for Archiver/HyperArch.py

Tokio Kikuchi tkikuchi@is.kochi-u.ac.jp
Sat, 06 Oct 2001 13:19:08 +0900


Hi,

Some time ago, someone complained that pipermail does not put the
proper charset in the Content-Type header. Here is a patch against the
latest CVS (2.1a).
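
As a rough illustration of the pattern change (a standalone sketch, not
part of the patch), the snippet below compares the old and new charset
patterns. The helper name extract_charset is hypothetical; it mirrors
what the patch intends in Article.__init__: find the charset parameter,
lowercase it, and drop any surrounding quotes.

```python
import re

# Old pattern: needs double quotes and only allows \w characters,
# so unquoted or hyphenated charset names never match.
old_rx = re.compile(r'charset="(\w+)"')

# New pattern from the patch: quotes optional, case-insensitive.
new_rx = re.compile(r'charset=(\S+)', re.IGNORECASE)


def extract_charset(ctype):
    """Return the lowercased charset from a Content-Type value, or None.

    Hypothetical helper for illustration only.
    """
    mo = new_rx.search(ctype)
    if mo is None:
        return None
    # Lowercase the value and strip the optional double quotes.
    return mo.group(1).lower().replace('"', '')


if __name__ == '__main__':
    for value in ('text/plain; charset=ISO-2022-JP',
                  'text/plain; Charset="EUC-JP"',
                  'text/plain'):
        print(repr(value), '->', extract_charset(value))
```

One caveat with this sketch: because \S+ matches any run of
non-whitespace, a value such as "charset=us-ascii; format=flowed" would
yield "us-ascii;" with the trailing semicolon still attached.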

I am not sure which is the better choice for the default language:
mm_cfg.DEFAULT_SERVER_LANGUAGE or maillist.preferred_language.
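
One possible way to combine the two, sketched here under assumptions:
use the list's preferred language when it is set and fall back to the
server default otherwise. The table below is a hypothetical stand-in for
mm_cfg.LC_DESCRIPTIONS with made-up entries; the only thing taken from
the patch is that each entry unpacks into two values, the second being
the charset. The function name archive_charset is also hypothetical.

```python
# Hypothetical stand-in for mm_cfg.LC_DESCRIPTIONS; the entries are
# illustrative only. Each value is (description, charset).
LC_DESCRIPTIONS = {
    'en': ('English (USA)', 'us-ascii'),
    'ja': ('Japanese', 'euc-jp'),
}

# Hypothetical stand-in for mm_cfg.DEFAULT_SERVER_LANGUAGE.
DEFAULT_SERVER_LANGUAGE = 'en'


def archive_charset(preferred_language=None):
    """Pick the archive charset for a list.

    Uses the list's preferred language when given, otherwise the
    server-wide default language.
    """
    lang = preferred_language or DEFAULT_SERVER_LANGUAGE
    description, charset = LC_DESCRIPTIONS[lang]
    return charset


if __name__ == '__main__':
    print(archive_charset())      # server default
    print(archive_charset('ja'))  # per-list preference
```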

Tokio

-------- Original Message --------
Date: Sat, 6 Oct 2001 13:09:35 +0900 (JST)
From: Mailman Admin <mailman@seppyo.org>
To: tkikuchi@is.kochi-u.ac.jp

--- HyperArch.py.orig	Thu Jul 26 14:26:48 2001
+++ HyperArch.py	Sat Oct  6 12:50:39 2001
@@ -104,7 +104,7 @@
 blankpat = re.compile(r'^\s*$')
 
 # content-type charset
-rx_charset = re.compile('charset="(\w+)"')
+rx_charset = re.compile('charset=(\S+)',re.IGNORECASE)
 
 # 
 # Starting <html> directive
@@ -140,7 +140,7 @@
     _last_article_time = time.time()
 
     # for compatibility with old archives loaded via pickle
-    charset = mm_cfg.DEFAULT_CHARSET
+    x, charset = mm_cfg.LC_DESCRIPTIONS[mm_cfg.DEFAULT_SERVER_LANGUAGE]
     cenc = None
     decoded = {}
 
@@ -172,7 +172,9 @@
         self.decoded = {}
         mo = rx_charset.search(self.ctype)
         if mo:
-            self.check_header_charsets(string.lower(mo.group(1)))
+            cset = string.lower(mo.group(1))
+            cset = re.sub('"','',cset,2)
+            self.check_header_charsets(cset)
         else:
             self.check_header_charsets()
         if self.charset and self.charset in mm_cfg.VERBATIM_ENCODING:
@@ -194,6 +196,7 @@
         header, then an arbitrary charset is chosen.  Only those
         values that match the chosen charset are decoded.
         """
+        self.charset = msg_charset
         author, a_charset = self.decode_charset(self.author)
         subject, s_charset = self.decode_charset(self.subject)
         if author is not None or subject is not None:
@@ -527,7 +530,7 @@
         self._unlocklist = unlock
         self._lock_file = None
         self._charsets = {}
-        self.charset = None
+        x, self.charset = mm_cfg.LC_DESCRIPTIONS[maillist.preferred_language]
 
         if hasattr(self.maillist,'archive_volume_frequency'):
             if self.maillist.archive_volume_frequency == 0: