Mailman DOA after power outage crash.
![](https://secure.gravatar.com/avatar/9aac8dfb2443fd797bb8acbcb2322fca.jpg?s=120&d=mm&r=g)
Greetings,
We had a series of longer-than-battery outages on Wednesday, and
when we came back online, mailman refused to run. Here are the messages:
mailman# tail -f error
Oct 30 02:28:06 2010 (11862) couldn't load config file /usr/mailman/lists/[listname]/config.pck.last 1778451844
Oct 30 02:28:06 2010 (11862) couldn't load config file /usr/mailman/lists/[listname]/config.db [Errno 2] No such file or directory: '/usr/mailman/lists/[listname]/config.db'
Oct 30 02:28:06 2010 (11862) couldn't load config file /usr/mailman/lists/[listname]/config.db.last [Errno 2] No such file or directory: '/usr/mailman/lists/[listname]/config.db.last'
Oct 30 02:28:06 2010 (11862) All [listname]B fallbacks were corrupt, giving up
Oct 30 02:28:06 2010 (11862) error opening list: [listname] [Errno 2] No such file or directory: '/usr/mailman/lists/[listname]/config.db.last'
Please tell me there's a reasonable recovery for this on 2.1.21 (plus a patch)?
Thanks!
//Alif
-- "Never belong to any party, always oppose privileged classes and public plunderers, never lack sympathy with the poor, always remain devoted to the public welfare, never be satisfied with merely printing news, always be drastically independent, never be afraid to attack wrong, whether by predatory plutocracy or predatory poverty."
Joseph Pulitzer, 1907 Speech
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
J.A. Terranson wrote:
2.1.12?
The recovery is to restore the corrupt /usr/mailman/lists/[listname]/config.pck from the most recent good backup.
Mailman does the best it can by trying to first write config.pck.tmp.<hostname>.<pid> and then removing config.pck.last, moving config.pck to config.pck.last and finally moving config.pck.tmp.<hostname>.<pid> to config.pck.
You could check for a /usr/mailman/lists/[listname]/config.pck.tmp.<hostname>.<pid> file and try moving that to /usr/mailman/lists/[listname]/config.pck if it exists, but even if it does, it may be bad too.
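The save sequence described above can be sketched roughly as follows. This is a minimal illustration, not Mailman's actual code: the function name `save_config`, its parameters, and the optional `sync_after_write` flush step are assumptions made for the example, and it is written in Python 3 syntax, whereas Mailman 2.1 runs on Python 2.

```python
import os
import pickle
import socket


def save_config(list_dir, data, sync_after_write=False):
    """Write data to config.pck via a temp file, keeping config.pck.last.

    Hypothetical helper for illustration only; not taken from Mailman.
    """
    fname = os.path.join(list_dir, "config.pck")
    fname_last = fname + ".last"
    fname_tmp = "%s.tmp.%s.%d" % (fname, socket.gethostname(), os.getpid())

    # 1. Write the new state to a uniquely named temporary file.
    with open(fname_tmp, "wb") as fp:
        pickle.dump(data, fp, protocol=1)
        if sync_after_write:
            # Assumed behaviour: force the bytes out of the OS cache
            # before any of the renames below take place.
            fp.flush()
            os.fsync(fp.fileno())

    # 2. Remove the old fallback, if there is one.
    if os.path.exists(fname_last):
        os.unlink(fname_last)

    # 3. The current file becomes the fallback.
    if os.path.exists(fname):
        os.rename(fname, fname_last)

    # 4. The temporary file becomes the current file.
    os.rename(fname_tmp, fname)
    return fname
```

If the write in step 1 is still sitting in a cache when power is lost, the renames in steps 3 and 4 can already be on disk while the data is not, which is one way both config.pck and config.pck.last can end up incomplete.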
It appears that your system is caching disk writes to the extent that both the config.pck and config.pck.last were incompletely written when the power failed. You might look into that, and also consider setting
SYNC_AFTER_WRITE = Yes
in mm_cfg.py (see the documentation of this in Defaults.py).
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/9aac8dfb2443fd797bb8acbcb2322fca.jpg?s=120&d=mm&r=g)
On Sat, 30 Oct 2010, Mark Sapiro wrote:
Thanks for the terrible news :-/ Is my [completely off the cuff] understanding of the config file correct that it just holds the settings from the config pages? If so, is there any reason I cannot "create" a new list (after moving the existing one) with the same name and then sub the new config.pck for the old corrupted one?
Thanks for the flush information: even though this was a unique situation (every box in the rack was damaged, several beyond repair [including the backup server]), living in the middle of the tornado capital of the world means it could realistically happen again, so that's a really good one to know.
//Alif
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
J.A. Terranson wrote:
Assuming you don't use some kind of custom member adaptor, the config.pck also contains all the list membership information.
You can try running 'strings' on the various config.pck* files to see if you can extract useful information that way.
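Where `strings` is not available, the same idea can be approximated in a few lines of Python. This is only a sketch: the function name `printable_runs` and the `min_len` default are assumptions for the example, and it reads raw bytes without understanding the pickle format at all.

```python
import re


def printable_runs(raw, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in raw.

    Roughly what strings(1) prints; hypothetical helper for illustration.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group(0).decode("ascii") for m in pattern.finditer(raw)]
```

A typical use would be to read a copy of the damaged file with `open(path, "rb").read()` and pass the bytes in. Some pickle opcode bytes are themselves printable letters, so a run may carry a stray character at either end; anything recovered this way needs to be reviewed by eye.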
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/9aac8dfb2443fd797bb8acbcb2322fca.jpg?s=120&d=mm&r=g)
On Sat, 30 Oct 2010, Mark Sapiro wrote:
Everything seems to be in there: I see all the settings, plus it looks like all the subscribers are there as well. It's going to be *really* painful manually restoring all ~1200 addresses and their information, but I suppose it will have to do.
What's really odd is that only one (of approx 30 lists) was corrupted, and when I compare the output of the corrupted vs known good files, they look roughly identical. Clearly they aren't, but... Is there any way to tell "where" mailman thinks the corruption begins or is it just the absence of a clean flag somewhere that I am hosed on?
//Alif
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
J.A. Terranson wrote:
Probably that one list was the only one that was being or had 'recently' been updated when the power was removed.
Mailman doesn't have a clue as to what the problem is. The file is a Python pickle and all Mailman knows is its attempt to cPickle.load() the file threw an exception or didn't return a Python dictionary.
See <http://docs.python.org/library/pickle.html>.
Also, you may be able to use some of the pickletools functions to help determine what might be wrong with the file and how to fix it. See <http://docs.python.org/library/pickletools.html>. Also, see the text at the beginning of the /usr/lib/pythonx.x/pickletools.py file for documentation of the pickle file format. Note that config.pck is a "protocol 1" pickle.
The first step would be to run the following Python commands. It is best to run them under withlist as the unpickling process may need access to Mailman.
$ /usr/mailman/bin/withlist -i
(some output)
Then, depending on the error, you might be able to use pickletools or just a dump of the file to see what the problem is.
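One way to use pickletools for this is to walk the opcode stream and note where parsing stops. The sketch below is an illustration under assumptions: the function name `scan_pickle` and its return shape are made up for the example, it is written for Python 3 and run on a copy of the file, and it only checks that the opcodes parse, not that the data they describe makes sense. It uses `pickletools.genops`, which reads opcodes without executing them or importing any classes, so it does not need Mailman to be importable.

```python
import pickletools


def scan_pickle(raw):
    """Walk the opcodes in raw pickle bytes without unpickling anything.

    Returns (complete, last_pos, error):
      complete - True if a STOP opcode was reached
      last_pos - byte offset of the last opcode that parsed cleanly
      error    - message from pickletools where parsing broke, else None
    Hypothetical helper for illustration only.
    """
    last_pos = None
    try:
        for opcode, arg, pos in pickletools.genops(raw):
            last_pos = pos
            if opcode.name == "STOP":
                return True, last_pos, None
    except Exception as exc:  # pickletools mostly raises ValueError here
        return False, last_pos, "%s: %s" % (type(exc).__name__, exc)
    return False, last_pos, "no STOP opcode found"
```

Comparing `last_pos` with the file size gives a rough idea of how much of the file is structurally intact; a hex dump around that offset is then the place to start looking.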
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
participants (2)
- J.A. Terranson
- Mark Sapiro