[Mailman-Developers] Patch for Archiver/HyperArch.py
Tokio Kikuchi
tkikuchi@is.kochi-u.ac.jp
Sat, 06 Oct 2001 13:19:08 +0900
Hi,
Some time ago, someone complained that pipermail does not report the
proper charset in the Content-Type header. Here is a patch against the
latest CVS (2.1a).
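
To make the intent concrete, here is a minimal standalone sketch of the
charset extraction the patch is after, written in current Python rather
than the Python of the patch. The pattern is the relaxed one from the
patch (case-insensitive, quotes optional). The function name
extract_charset and the stripping of a trailing ';' are my own
assumptions for illustration and are not part of HyperArch.py.

```python
import re

# Relaxed pattern from the patch: matches quoted and unquoted values,
# and "Charset=" as well as "charset=".
rx_charset = re.compile(r'charset=(\S+)', re.IGNORECASE)


def extract_charset(ctype, default=None):
    """Return the lowercased charset parameter of a Content-Type value.

    Hypothetical helper, not a Mailman API. Returns `default` when the
    header carries no charset parameter.
    """
    mo = rx_charset.search(ctype)
    if not mo:
        return default
    cset = mo.group(1).lower()
    # \S+ can swallow a trailing ';' when another parameter follows
    # (assumption: we want to drop it), and the value may be quoted.
    return cset.rstrip(';').replace('"', '')
```

For example, extract_charset('text/plain; charset="ISO-2022-JP"') yields
'iso-2022-jp', and a header without a charset parameter falls back to
the supplied default.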
I am not sure which is the better source for the default language:
mm_cfg.DEFAULT_SERVER_LANGUAGE or maillist.preferred_language.
The patch uses the former for the article default and the latter for
the archive.
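
One way to combine the two candidates is to prefer the list's language
and fall back to the server default. The sketch below assumes, as the
patch does, that each LC_DESCRIPTIONS entry is a (description, charset)
pair. The table contents and the function default_charset are
hypothetical stand-ins for illustration, not the real mm_cfg values.

```python
# Hypothetical stand-in for mm_cfg.LC_DESCRIPTIONS:
# language code -> (description, charset). Values are illustrative.
LC_DESCRIPTIONS = {
    'en': ('English (USA)', 'us-ascii'),
    'ja': ('Japanese', 'euc-jp'),
}

# Hypothetical stand-in for mm_cfg.DEFAULT_SERVER_LANGUAGE.
DEFAULT_SERVER_LANGUAGE = 'en'


def default_charset(list_language=None):
    """Pick a charset from the list language, else the server language."""
    if list_language in LC_DESCRIPTIONS:
        lang = list_language
    else:
        lang = DEFAULT_SERVER_LANGUAGE
    _description, charset = LC_DESCRIPTIONS[lang]
    return charset
```

With this table, default_charset('ja') gives 'euc-jp', while an unknown
or missing list language gives the server default 'us-ascii'.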
Tokio
-------- Original Message --------
Date: Sat, 6 Oct 2001 13:09:35 +0900 (JST)
From: Mailman Admin <mailman@seppyo.org>
To: tkikuchi@is.kochi-u.ac.jp
--- HyperArch.py.orig Thu Jul 26 14:26:48 2001
+++ HyperArch.py Sat Oct 6 12:50:39 2001
@@ -104,7 +104,7 @@
blankpat = re.compile(r'^\s*$')
# content-type charset
-rx_charset = re.compile('charset="(\w+)"')
+rx_charset = re.compile('charset=(\S+)',re.IGNORECASE)
#
# Starting <html> directive
@@ -140,7 +140,7 @@
_last_article_time = time.time()
# for compatibility with old archives loaded via pickle
- charset = mm_cfg.DEFAULT_CHARSET
+ x, charset = mm_cfg.LC_DESCRIPTIONS[mm_cfg.DEFAULT_SERVER_LANGUAGE]
cenc = None
decoded = {}
@@ -172,7 +172,9 @@
self.decoded = {}
mo = rx_charset.search(self.ctype)
if mo:
- self.check_header_charsets(string.lower(mo.group(1)))
+ cset = string.lower(mo.group(1))
+ cset = re.sub('"','',cset,2)
+ self.check_header_charsets(cset)
else:
self.check_header_charsets()
if self.charset and self.charset in mm_cfg.VERBATIM_ENCODING:
@@ -194,6 +196,7 @@
header, then an arbitrary charset is chosen. Only those
values that match the chosen charset are decoded.
"""
+ self.charset = msg_charset
author, a_charset = self.decode_charset(self.author)
subject, s_charset = self.decode_charset(self.subject)
if author is not None or subject is not None:
@@ -527,7 +530,7 @@
self._unlocklist = unlock
self._lock_file = None
self._charsets = {}
- self.charset = None
+ x, self.charset = mm_cfg.LC_DESCRIPTIONS[maillist.preferred_language]
if hasattr(self.maillist,'archive_volume_frequency'):
if self.maillist.archive_volume_frequency == 0: