[Mailman-Users] Help with ONE broken list (Mailman 2.1.9)

Drew Tenenholz drew.tenenholz at isid.org
Tue May 5 21:22:28 CEST 2015

Hi all --

I'm hoping you can make some suggestions on how to get a broken list working again. Here are the basic details.

Mailman 2.1.9
RHEL 5.x
I have root access to the VM running our lists.

The problem was caused by an IT Dept. standard RHEL update on April 21. Somehow, this update managed to revert some files back to an earlier state, breaking just ONE of my lists. This was a list I migrated from a different mailman server on March 31 (v2.1.14 installed from source on Mac OS X, so not the Apple distribution). All of the other lists, which were in place before March 31, are just fine.

For certain, /var/lib/mailman/data/aliases was replaced with an earlier version (I can't tell you how long it took to find this out....), and the newest aliases from the migration were removed.  So, I replaced them, and restarted both mailman and postfix, but no joy.
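
For reference, a minimal sketch of how the restored aliases file could be checked for one list's entries. It assumes the usual set of ten aliases that Mailman 2.1 generates per list (the posting address plus -admin, -bounces, -confirm, -join, -leave, -owner, -request, -subscribe and -unsubscribe); the function name and the sample text are hypothetical, not part of Mailman.

```python
# Suffixes of the per-list aliases Mailman 2.1 normally generates
# (assumption: stock alias set, no site-specific additions).
SUFFIXES = ["", "-admin", "-bounces", "-confirm", "-join",
            "-leave", "-owner", "-request", "-subscribe", "-unsubscribe"]


def missing_aliases(aliases_text, listname):
    """Return the expected aliases for `listname` not found in `aliases_text`."""
    present = set()
    for line in aliases_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, _ = line.partition(":")
        if sep:  # only lines of the form "name: target" count
            present.add(name.strip().lower())
    expected = [listname.lower() + suffix for suffix in SUFFIXES]
    return [alias for alias in expected if alias not in present]


if __name__ == "__main__":
    # Hypothetical usage: read the file named in this message.
    with open("/var/lib/mailman/data/aliases") as fh:
        print(missing_aliases(fh.read(), "list-name"))
```

An empty result only says the entries are present in the text file; it says nothing about whether the MTA's compiled alias database was rebuilt from it.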

Now, I've been struggling through trying to get the rest of the configuration to work.  Here are some sample problems:

1) http://list.name.tld/mailman/admin/list-name/general page will not load, giving the "Oops, we've hit a bug" mailman/python error.  Oddly enough, I can manually type in the correct URL for things like the list members or language options and those pages DO load.

2) I've probably broken a lot of common sense rules, but I've replaced the /var/lib/mailman/lists/list-name/config.pck file with the one I used during the migration. That not only fails to fix the list, it also seems to reset the list admin password (which I can now fix with change_pw) and the URL (which I can fix with bin/withlist -l -r fix_url list-name -v). But, as I said, the list is still showing all the same problems as before.

3) For unknown reasons, bin/config_list -o will not complete.

So, I got the IT folks to actually restore a snapshot of the machine to a DIFFERENT VM, and frustratingly enough, that version of the list seems to work fine.  I can even copy over the config.pck from the damaged list into this April 4 snapshot, and it WORKS!  (Grrr....)  I can also export the configuration from this newly restored and munged setup back onto the production machine, and it FAILS.

On the production machine, bin/check_db and bin/check_perms both run without reporting any problems.

If it makes any difference at all, this list is set to run with RUSSIAN CYRILLIC (KOI8-R) as the default language, so when I export the members list with full names, all I get are ???.  
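
A small sketch of one way question marks like these can arise, offered only as a possible explanation: Cyrillic text survives a round trip through KOI8-R, but forcing it into ASCII with replacement turns every character into "?". The sample name is made up, and whether the export actually takes an ASCII path like this is an assumption.

```python
# Hypothetical subscriber name; not taken from the real list.
name = "Иван Петров"

# Round trip through KOI8-R keeps the text intact.
koi8_bytes = name.encode("koi8_r")
round_trip = koi8_bytes.decode("koi8_r")

# Forcing the same text into ASCII replaces each Cyrillic
# character with "?", while the space (plain ASCII) survives.
ascii_lossy = name.encode("ascii", errors="replace").decode("ascii")

print(round_trip)
print(ascii_lossy)
```

If the exported names look like the second line, the data in the list may still be fine and only the output encoding is lossy.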

I'd prefer to not only have the working list, but working archives, and the correct subscriber list as of the date things started to go wonky.

Then, I hope to restore all of the shunted messages that were clogging up the entire system, most of which were pending messages for the damaged list, and clear out that queue.

If you have gotten this far:
2) As you can see, this is a VERY ACTIVE mailman install.  It probably sends 500,000 mails/day to 70,000 subscribers on 12-15 lists, and also has to handle 50-100 confirmation requests/day with people constantly adding/removing themselves.  Sort of like trying to sip out of a fire hose....

Thanks in Advance,
Drew Tenenholz
