[ mailman-Bugs-1246004 ] 2.1.6 bin/arch bombs out on unicodeerror

Bugs item #1246004, was opened at 2005-07-27 15:19
Message generated for change (Comment added) made by schoinobates
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=1246004&group_id=103

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: mail delivery
Group: 2.1 (stable)
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Auke Kok (sofar)
Assigned to: Nobody/Anonymous (nobody)
Summary: 2.1.6 bin/arch bombs out on unicodeerror

Initial Comment:
Running bin/arch on my i18n-development list I got this beauty:

Updating HTML for article 2390
Updating HTML for article 2391
Pickling archive state into /var/mailman/archives/private/xfce-i18n/pipermail.pck
Traceback (most recent call last):
  File "bin/arch", line 200, in ?
    main()
  File "bin/arch", line 188, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 573, in processUnixMailbox
    self.add_article(a)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 625, in add_article
    article.parentID = parentID = self.get_parent_info(arch, article)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 657, in get_parent_info
    article.subject)
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 311, in getOldestArticle
    self.__openIndices(archive)
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 251, in __openIndices
    t = DumbBTree(os.path.join(arcdir, archive + '-' + i))
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 65, in __init__
    self.load()
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 179, in load
    self.__sort(dirty=1)
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 73, in __sort
    self.sorted.sort()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
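
Editorial note: the traceback above ends in a sort over index keys, and the
error text says a byte string starting with 0xc3 (a common UTF-8 lead byte)
was decoded as ASCII. Under Python 2, ordering a str against a unicode object
triggers exactly that implicit ASCII decode. The sketch below is written for
Python 3, so it is only an analogy and not the Mailman code path; the
variable names and sample keys are made up for illustration.

```python
# Python 3 sketch; the failure in the report happened under Python 2.
# All names and sample values here are hypothetical.

raw_subject = "é".encode("utf-8")  # b'\xc3\xa9', first byte is 0xc3

# Decoding non-ASCII bytes as ASCII gives the same message as the report.
try:
    raw_subject.decode("ascii")
    decode_message = None
except UnicodeDecodeError as exc:
    decode_message = str(exc)

print(decode_message)

# An index whose keys mix byte strings and text cannot be sorted.
# Python 2 attempted an implicit ASCII decode during the comparison;
# Python 3 refuses to compare the two types at all.
mixed_keys = [(raw_subject, "msg-1"), ("plain subject", "msg-2")]
try:
    sorted(mixed_keys)
    sort_failed = False
except TypeError:
    sort_failed = True

print("sorting mixed keys failed:", sort_failed)
```

Either way the remedy is the same: make every key the same text type before
the index is sorted.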
----------------------------------------------------------------------

Comment By: Schoinobates Volans (schoinobates)
Date: 2007-02-28 22:16

Message:
Logged In: YES
user_id=41822
Originator: NO

The script needs to actually upgrade all archive volumes, not only the
current one, because if a post to a ML comes with a date in the past, it
will be added to the old volume. New script:

#! /usr/bin/python
#
# Copyright (C) 2007 Lionel Elie Mamane <lmamane@debian.org>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

"""Convert a list's archive databases to unicode where appropriate

This script is intended to be run as a bin/withlist script, i.e.

% bin/withlist -l -r unicodify_archives <mylist>
"""

import paths
import time
from Mailman.i18n import _
from Mailman import mm_cfg

def unicodify_string(s):
    if isinstance(s,unicode):
        return s
    elif isinstance(s,str):
        try:
            return s.decode()
        except UnicodeDecodeError:
            pass
        try:
            return s.decode('utf-8')
        except UnicodeDecodeError:
            pass
        return s.decode('windows-1252', 'replace')

def unicodify_fst(t):
    l = list(t[1:])
    l.insert(0, unicodify_string(t[0]))
    return tuple(l)

def unicodify_archives(mlist):
    # Only act if we are using the internal archiver
    if mm_cfg.PUBLIC_EXTERNAL_ARCHIVER:
        return
    else:
        from Mailman.Archiver import HyperArch
        h = HyperArch.HyperArchive(mlist)
        for archive in h.archives:
            for hdr in ('subject', 'author'):
                h.database.mapKeys(unicodify_fst, archive, hdr)
        h.close()

if __name__ == '__main__':
    print _(__doc__.replace('%', '%%'))

----------------------------------------------------------------------

Comment By: Schoinobates Volans (schoinobates)
Date: 2007-02-27 23:45

Message:
Logged In: YES
user_id=41822
Originator: NO

In Debian, we fixed that problem with the following patch and running the
following withlist script on all mailing list on upgrade.

Let me note in passing that the code for clearIndex looks very suspicious:
it takes an "index" argument but completely ignores it. It should probably
clear its argument and not be hardcoded to clear the thread index.

--- Mailman/Archiver/HyperDatabase.py 2005-08-27 03:40:17.000000000 +0200
+++ Mailman/Archiver/HyperDatabase.py 2007-02-27 20:33:41.103527160 +0100
@@ -324,15 +343,22 @@
     def clearIndex(self, archive, index):
         self.__openIndices(archive)
         if hasattr(self.threadIndex, 'clear'):
             self.threadIndex.clear()
             return
         finished=0
         try:
             key, msgid=self.threadIndex.first()
         except KeyError: finished=1
         while not finished:
             del self.threadIndex[key]
             try:
                 key, msgid=self.threadIndex.next()
             except KeyError: finished=1
+
+    def mapKeys(self, f, archive, index):
+        self.__openIndices(archive)
+        index = getattr(self, index + 'Index')
+        d = index.dict
+        index.dict = dict(zip(map(f, d.keys()), d.values()))
+        index.__dirty = 1

#! /usr/bin/python
#
# Copyright (C) 2007 Lionel Elie Mamane <lmamane@debian.org>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

"""Convert a list's archive databases to unicode where appropriate

This script is intended to be run as a bin/withlist script, i.e.

% bin/withlist -l -r unicodify_archives <mylist>
"""

import paths
import time
from Mailman.i18n import _
from Mailman import mm_cfg

def unicodify_string(s):
    if isinstance(s,unicode):
        return s
    elif isinstance(s,str):
        try:
            return s.decode()
        except UnicodeDecodeError:
            pass
        try:
            return s.decode('utf-8')
        except UnicodeDecodeError:
            pass
        return s.decode('windows-1252', 'replace')

def unicodify_fst(t):
    l = list(t[1:])
    l.insert(0, unicodify_string(t[0]))
    return tuple(l)

def unicodify_archives(mlist):
    # Only act if we are using the internal archiver
    if mm_cfg.PUBLIC_EXTERNAL_ARCHIVER:
        return
    else:
        from Mailman.Archiver import HyperArch
        h = HyperArch.HyperArchive(mlist)
        currentVolume = h.dateToVolName(time.time())
        if currentVolume in h.archives:
            for hdr in ('subject', 'author'):
                h.database.mapKeys(unicodify_fst, currentVolume, hdr)
        h.close()

if __name__ == '__main__':
    print _(__doc__.replace('%', '%%'))

----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2006-12-15 19:10

Message:
Logged In: YES
user_id=1123998
Originator: NO

See the threads at
http://mail.python.org/pipermail/mailman-developers/2006-February/018587.htm...
and
http://mail.python.org/pipermail/mailman-users/2006-February/049345.html

----------------------------------------------------------------------

Comment By: Eugene Crosser (crosser)
Date: 2006-12-15 11:37

Message:
Logged In: YES
user_id=124141
Originator: NO

I can see the same thing (line numbers are different) with 2.1.9 version.
Can it be that no one else got bitten by it yet? How to fix it?
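
Editorial note: the two ideas in the Debian fix quoted in this thread (decode
each key with progressively looser codecs, then rebuild the index dictionary
under the converted keys) can be tried in isolation. What follows is a rough
Python 3 sketch, not the Mailman code: to_text, text_first, map_keys and the
sample index are hypothetical stand-ins for unicodify_string, unicodify_fst
and mapKeys, and it assumes the default codec used by the original
s.decode() call was ASCII.

```python
# Python 3 sketch of the conversion idea; all names and data are hypothetical.

def to_text(value):
    """Return value as str, trying ASCII, then UTF-8, then windows-1252."""
    if isinstance(value, str):
        return value
    for codec in ("ascii", "utf-8"):
        try:
            return value.decode(codec)
        except UnicodeDecodeError:
            continue
    # Last resort: bytes undefined in windows-1252 become U+FFFD.
    return value.decode("windows-1252", "replace")


def text_first(key):
    """Convert only the first element of a tuple key, keep the rest as is."""
    return (to_text(key[0]),) + tuple(key[1:])


def map_keys(func, table):
    """Build a new dict whose keys are func(old_key)."""
    return {func(key): value for key, value in table.items()}


# A toy subject index with keys in mixed encodings.
index = {
    (b"plain subject", "200507"): "msg1",
    ("Grüße".encode("utf-8"), "200507"): "msg2",
    ("Grüße".encode("windows-1252"), "200508"): "msg3",
    ("already text", "200508"): "msg4",
}

converted = map_keys(text_first, index)

# With uniform text keys the index sorts without a decode error.
for key in sorted(converted):
    print(key, converted[key])
```

One caveat with any such re-keying: if two distinct byte strings decode to
the same text and the rest of the key is identical, the rebuilt dictionary
keeps only one of the two entries.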