[ mailman-Bugs-1246004 ] 2.1.6 bin/arch bombs out on unicodeerror

SourceForge.net noreply at sourceforge.net
Wed Feb 28 22:16:05 CET 2007


Bugs item #1246004, was opened at 2005-07-27 15:19
Message generated for change (Comment added) made by schoinobates
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=1246004&group_id=103

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: mail delivery
Group: 2.1 (stable)
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Auke Kok (sofar)
Assigned to: Nobody/Anonymous (nobody)
Summary: 2.1.6 bin/arch bombs out on unicodeerror

Initial Comment:

Running bin/arch on my i18n development list I got
this beauty:

Updating HTML for article 2390
Updating HTML for article 2391
Pickling archive state into /var/mailman/archives/private/xfce-i18n/pipermail.pck
Traceback (most recent call last):
  File "bin/arch", line 200, in ?
    main()
  File "bin/arch", line 188, in main
    archiver.processUnixMailbox(fp, start, end)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 573, in processUnixMailbox
    self.add_article(a)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 625, in add_article
    article.parentID = parentID = self.get_parent_info(arch, article)
  File "/var/mailman/Mailman/Archiver/pipermail.py", line 657, in get_parent_info
    article.subject)
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 311, in getOldestArticle
    self.__openIndices(archive)
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 251, in __openIndices
    t = DumbBTree(os.path.join(arcdir, archive + '-' + i))
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 65, in __init__
    self.load()
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 179, in load
    self.__sort(dirty=1)
  File "/var/mailman/Mailman/Archiver/HyperDatabase.py", line 73, in __sort
    self.sorted.sort()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
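
For readers unfamiliar with this error: in Python 2, comparing a byte string with a unicode string implicitly decodes the byte string as ASCII, so sorting a list that mixes the two fails on the first non-ASCII byte. The byte 0xc3 is the lead byte of many two-byte UTF-8 sequences (for example "é"). The sketch below reproduces the same message with an explicit decode. It is written for Python 3, where the decode has to be spelled out, and the sample bytes are an assumption, not data from the affected archive.

```python
# Python 3 sketch: decoding UTF-8 bytes as ASCII fails on the lead byte 0xc3.
raw = b"\xc3\xa9"  # UTF-8 encoding of "é" (sample data, not from the report)

try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```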


----------------------------------------------------------------------

Comment By: Schoinobates Volans (schoinobates)
Date: 2007-02-28 22:16

Message:
Logged In: YES 
user_id=41822
Originator: NO

The script needs to actually upgrade all archive volumes, not only the
current one: if a post to a mailing list arrives with a date in the past,
it will be added to the corresponding old volume.
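
To illustrate why converting only the current volume is not enough, here is a small sketch that maps timestamps to monthly volume names. The function volume_name and the "YYYY-Month" naming are assumptions that stand in for h.dateToVolName with a monthly archive frequency; they are not Mailman's actual code.

```python
import calendar
import time

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def volume_name(timestamp):
    """Hypothetical stand-in for a monthly volume naming scheme."""
    t = time.gmtime(timestamp)
    return "%04d-%s" % (t.tm_year, MONTHS[t.tm_mon - 1])

# Time at which the upgrade script runs, and the Date of a backdated post.
now = calendar.timegm((2007, 2, 28, 22, 16, 5, 0, 0, 0))
backdated = calendar.timegm((2005, 7, 27, 15, 19, 0, 0, 0, 0))

current = volume_name(now)       # the only volume the older script converts
target = volume_name(backdated)  # the volume the backdated post lands in
print(current, target)
```

If the two names differ, the post is filed in a volume that the current-volume-only script never touched.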


New script:

#! /usr/bin/python
#
# Copyright (C) 2007 Lionel Elie Mamane <lmamane at debian.org>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software 
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301, USA.

"""Convert a list's archive databases to unicode where appropriate

This script is intended to be run as a bin/withlist script, i.e.

% bin/withlist -l -r unicodify_archives <mylist>
"""

import paths
import time
from Mailman.i18n import _
from Mailman import mm_cfg

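def unicodify_string(s):
    # Return s as a unicode object. Unicode input passes through unchanged;
    # input that is neither str nor unicode falls through and yields None.
    if isinstance(s,unicode):
        return s
    elif isinstance(s,str):
        # Try the default codec first (normally ASCII), then UTF-8.
        try:
            return s.decode()
        except UnicodeDecodeError:
            pass
        try:
            return s.decode('utf-8')
        except UnicodeDecodeError:
            pass
        # Last resort: windows-1252, replacing bytes it does not define.
        return s.decode('windows-1252', 'replace')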

def unicodify_fst(t):
    l = list(t[1:])
    l.insert(0, unicodify_string(t[0]))
    return tuple(l)

def unicodify_archives(mlist):
    # Only act if we are using the internal archiver
    if mm_cfg.PUBLIC_EXTERNAL_ARCHIVER:
        return
    else:
        from Mailman.Archiver import HyperArch
        h = HyperArch.HyperArchive(mlist)
        for archive in h.archives:
            for hdr in ('subject', 'author'):
                h.database.mapKeys(unicodify_fst, archive, hdr)
        h.close()



if __name__ == '__main__':
    print _(__doc__.replace('%', '%%'))
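
The decoding fallback used by unicodify_string can be tried in isolation. The following is a Python 3 rendering, offered as a sketch rather than a drop-in replacement: the name unicodify_bytes is an assumption, and the explicit "ascii" codec stands in for Python 2's default encoding. UTF-8 is tried before windows-1252 because a single-byte codec accepts almost any input, so it only makes sense as the last resort.

```python
def unicodify_bytes(raw):
    """Decode raw bytes using ASCII, then UTF-8, then windows-1252."""
    if isinstance(raw, str):
        return raw  # already text
    for codec in ("ascii", "utf-8"):
        try:
            return raw.decode(codec)
        except UnicodeDecodeError:
            continue
    # windows-1252 leaves a few byte values undefined; 'replace' maps
    # those to U+FFFD instead of raising.
    return raw.decode("windows-1252", "replace")

samples = [b"plain subject", b"caf\xc3\xa9", b"caf\xe9"]
decoded = [unicodify_bytes(s) for s in samples]
print(decoded)
```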


----------------------------------------------------------------------

Comment By: Schoinobates Volans (schoinobates)
Date: 2007-02-27 23:45

Message:
Logged In: YES 
user_id=41822
Originator: NO

In Debian, we fixed that problem with the following patch and by running the
following withlist script on all mailing lists on upgrade. Let me note in
passing that the code for clearIndex looks very suspicious: it takes an
"index" argument but completely ignores it. It should probably clear the
index named by its argument instead of being hardcoded to clear the thread
index.

--- Mailman/Archiver/HyperDatabase.py	2005-08-27 03:40:17.000000000 +0200
+++ Mailman/Archiver/HyperDatabase.py	2007-02-27 20:33:41.103527160 +0100
@@ -324,15 +343,22 @@
 
     def clearIndex(self, archive, index):
         self.__openIndices(archive)
         if hasattr(self.threadIndex, 'clear'):
             self.threadIndex.clear()
             return
         finished=0
         try:
             key, msgid=self.threadIndex.first()
         except KeyError: finished=1
         while not finished:
             del self.threadIndex[key]
             try:
                 key, msgid=self.threadIndex.next()
             except KeyError: finished=1
+
+    def mapKeys(self, f, archive, index):
+        self.__openIndices(archive)
+        index = getattr(self, index + 'Index')
+        d = index.dict
+        index.dict = dict(zip(map(f, d.keys()), d.values()))
+        index.__dirty = 1
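
The mapKeys method in the patch rebuilds an index dictionary by applying f to every key. Below is a minimal Python 3 sketch of the same idea on a plain dict. The key shape (subject, message id), the helper names map_keys and decode_first, and the sample entries are assumptions for illustration, not Mailman's actual structures.

```python
def map_keys(f, d):
    """Return a new dict whose keys are f(key); values are kept as they are."""
    return dict(zip(map(f, d.keys()), d.values()))

def decode_first(key):
    """Decode only the first element of a tuple key (assumed UTF-8 here)."""
    head = key[0]
    if isinstance(head, bytes):
        head = head.decode("utf-8")
    return (head,) + tuple(key[1:])

# Hypothetical index mixing byte-string and text keys.
index = {
    (b"re: caf\xc3\xa9", "msg-2"): "msg-2",
    ("hello", "msg-1"): "msg-1",
}

converted = map_keys(decode_first, index)
sorted_keys = sorted(converted)  # sortable once every key is text
print(sorted_keys)
```

One thing to keep in mind with this approach: if f maps two different keys to the same result, the later entry silently overwrites the earlier one.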



#! /usr/bin/python
#
# Copyright (C) 2007 Lionel Elie Mamane <lmamane at debian.org>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software 
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301, USA.

"""Convert a list's archive databases to unicode where appropriate

This script is intended to be run as a bin/withlist script, i.e.

% bin/withlist -l -r unicodify_archives <mylist>
"""

import paths
import time
from Mailman.i18n import _
from Mailman import mm_cfg

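def unicodify_string(s):
    # Return s as a unicode object. Unicode input passes through unchanged;
    # input that is neither str nor unicode falls through and yields None.
    if isinstance(s,unicode):
        return s
    elif isinstance(s,str):
        # Try the default codec first (normally ASCII), then UTF-8.
        try:
            return s.decode()
        except UnicodeDecodeError:
            pass
        try:
            return s.decode('utf-8')
        except UnicodeDecodeError:
            pass
        # Last resort: windows-1252, replacing bytes it does not define.
        return s.decode('windows-1252', 'replace')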

def unicodify_fst(t):
    l = list(t[1:])
    l.insert(0, unicodify_string(t[0]))
    return tuple(l)

def unicodify_archives(mlist):
    # Only act if we are using the internal archiver
    if mm_cfg.PUBLIC_EXTERNAL_ARCHIVER:
        return
    else:
        from Mailman.Archiver import HyperArch
        h = HyperArch.HyperArchive(mlist)
        currentVolume = h.dateToVolName(time.time())
        if currentVolume in h.archives:
            for hdr in ('subject', 'author'):
                h.database.mapKeys(unicodify_fst, currentVolume, hdr)
        h.close()



if __name__ == '__main__':
    print _(__doc__.replace('%', '%%'))


----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2006-12-15 19:10

Message:
Logged In: YES 
user_id=1123998
Originator: NO

See the threads at
http://mail.python.org/pipermail/mailman-developers/2006-February/018587.html
and
http://mail.python.org/pipermail/mailman-users/2006-February/049345.html

----------------------------------------------------------------------

Comment By: Eugene Crosser (crosser)
Date: 2006-12-15 11:37

Message:
Logged In: YES 
user_id=124141
Originator: NO

I can see the same thing (line numbers are different) with 2.1.9 version.
Can it be that no one else got bitten by it yet?  How to fix it?

----------------------------------------------------------------------
