[Mailman-Users] Moving lists with large archives to new server

Mark Sapiro mark at msapiro.net
Sun Feb 1 02:57:29 CET 2015


On 01/30/2015 03:09 PM, Andrew Hodgson wrote:
> 
> My question revolves around the list archives.  If I start Mailman up before they are copied across, it is likely there will be waiting messages and these will get archived to a new mbox file.  I want to reduce downtime in moving the lists, but the mbox files are around 25GB, and there are a couple of lists which take the majority of that.  I was going to index the messages once the mbox files are copied over, but what is the smartest way to get these moved without downing Mailman for too long?


It seems like you are planning to just move the
archives/private/LIST.mbox/LIST.mbox files and rebuild the archives with
bin/arch --wipe after the .mbox files are moved.

This in fact is the recommended procedure. One consideration is whether
you want to preserve the message numbers in the new archive. If you
don't do this, and someone has a saved URL to an archived message, that
URL will return a different message from the new archive. So let's
assume you do want to do this.

The second consideration is whether your .mbox files are "good". They
probably are, but if they go back several years and you haven't
previously checked, they may contain unescaped "From " lines in message
bodies. bin/arch is very lax about validating "From " separators. Any
line that begins with those 5 characters is considered a message
separator and can result in trailing parts of messages being in the
archive without headers and separately from the leading parts.

This is normally not an issue with .mbox files written only by 2.1.x
versions of Mailman.  Mailman's bin/cleanarch script can help detect and
fix such problems.
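
Before running bin/cleanarch, a rough pre-check along the following lines
can show whether an .mbox has any suspicious "From " lines at all. This is
only a sketch: the function name suspect_from_lines is made up for
illustration, and the pattern is an assumption about what a well-formed
separator looks like ("From addr Day Mon DD HH:MM:SS YYYY"). It is not what
bin/cleanarch or bin/arch actually test for.

```shell
#!/bin/sh
# Hypothetical helper: print "lineno:line" for every line that begins with
# "From " but does not look like a typical mbox separator line.
# The separator pattern is an assumption, not Mailman's own rule.
suspect_from_lines() {
    grep -n '^From ' "$1" |
        grep -Ev '^[0-9]+:From [^ ]+ +[A-Z][a-z]{2} [A-Z][a-z]{2} [ 0-9][0-9] [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}'
}

# Usage (path is a placeholder):
#   suspect_from_lines archives/private/LIST.mbox/LIST.mbox
# No output means no line was flagged by this particular pattern.
```

Anything it prints is worth a look by hand before deciding whether the file
needs to go through bin/cleanarch.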

OK. Now we'll assume you have a bunch of good LIST.mbox files on the old
server which you just stopped. I think there are two ways to proceed.
You'll have to decide which works better for you. While typing, I
decided Way 2 is better.

Way 1. Bring up the new server and start Mailman. Let the lists operate
and messages archive. These messages will ultimately be renumbered, but
this is a relatively short window so it may not be an issue.

Bring the LIST.mbox files over as, say, LIST.mbox/List.mbox.old

After you have the files moved, find the PID of ArchRunner via something
like

ps -fwwC python

and kill it with SIGTERM or SIGINT. Do not use SIGKILL or the master
will just restart it.

Then do something to the effect of

cat LIST.mbox/List.mbox >> LIST.mbox/List.mbox.old
mv LIST.mbox/List.mbox LIST.mbox/List.mbox.new
mv LIST.mbox/List.mbox.old LIST.mbox/List.mbox

And verify the ownership and permissions on LIST.mbox/List.mbox which
now should have all the messages with the new ones at the end.
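
The same append-and-swap sequence could be wrapped so that it stops at the
first failure, roughly as below. The function name merge_mbox and its two
arguments are hypothetical, and it assumes ArchRunner is already stopped so
nothing is writing to the live mbox while it runs.

```shell
#!/bin/sh
# Hypothetical wrapper around the cat/mv steps described above.
# $1 = directory holding the files, $2 = mbox file name (both placeholders).
# Assumes the copied old mbox is already present as "$2.old".
merge_mbox() {
    dir=$1
    name=$2
    [ -f "$dir/$name.old" ] || { echo "missing $dir/$name.old" >&2; return 1; }
    # Append the messages archived on the new server to the old mbox,
    # keep the new-only file as .new, then move the merged file into place.
    cat "$dir/$name" >> "$dir/$name.old" &&
    mv "$dir/$name" "$dir/$name.new" &&
    mv "$dir/$name.old" "$dir/$name" &&
    # Show owner, group and mode so they can be checked by eye.
    ls -l "$dir/$name"
}

# Usage (names are placeholders):
#   merge_mbox archives/private/LIST.mbox List.mbox
```

Each step runs only if the previous one succeeded, so a failed append never
leads to the live mbox being moved.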

Then copy the ArchRunner command from the previous ps command output and
issue that command with sudo -u mailman's_uid and a trailing & to
restart ArchRunner. If you don't start it in the background, it will
remain attached to your terminal and breaking out will SIGINT it and it
will die again. The command is like

sudo -u mailman /usr/bin/python /var/MM/21/bin/qrunner \
--runner=ArchRunner:0:1 -s&

Now your mbox is good and new messages are being archived and you can
run bin/arch --wipe LIST to build the pipermail archive. This should be
OK as bin/arch will lock the archive to prevent conflicts with incoming
messages.

Way 2. I think this is better.

Bring up the new server, start Mailman and immediately stop ArchRunner
as above before any new messages arrive, perhaps before starting the MTA.

Now you can just bring the LIST.mbox/List.mbox files over and check
permissions and rebuild the archives with bin/arch with --wipe or not as
the pipermail archive should be empty anyway. Meanwhile lists are
running and messages for the archive are queued.

Then when the pipermail archives have been rebuilt, start ArchRunner as
above.

Now there are potentially some cleanup things to do. It is possible,
depending on how many posts arrived while ArchRunner was not running,
that the qfiles/archive directory grew large. This can be a performance
issue. See the last paragraph in the FAQ at
<http://wiki.list.org/x/4030638>. You can shrink the queue directory by
waiting until it's empty, stopping Mailman, moving the directory aside,
recreating it with the same ownership and permissions, and restarting
Mailman. If by chance one or two entries snuck into the old, moved-aside
directory, move them individually back to the new directory.
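
The move-aside-and-recreate step might look something like this once
Mailman is stopped. It is a sketch only: the function name
recreate_queue_dir and the ".old" suffix are made up, and the --reference
options assume GNU chown and chmod.

```shell
#!/bin/sh
# Hypothetical helper: move a queue directory aside and recreate it empty
# with the same owner, group and mode. Run only while Mailman is stopped.
# Assumes GNU coreutils for chown/chmod --reference.
recreate_queue_dir() {
    q=$1
    aside="$q.old"
    [ -d "$q" ] || { echo "$q is not a directory" >&2; return 1; }
    [ ! -e "$aside" ] || { echo "$aside already exists" >&2; return 1; }
    mv "$q" "$aside" &&
    mkdir "$q" &&
    chown --reference="$aside" "$q" &&
    chmod --reference="$aside" "$q"
}

# Usage (path is a placeholder):
#   recreate_queue_dir qfiles/archive
# Any stray entries left in qfiles/archive.old can then be moved back
# individually before the old directory is removed.
```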

Finally, see the script at
<http://www.msapiro.net/scripts/update_archive_mtime> (mirrored at
<http://fog.ccsf.edu/~msapiro/scripts/update_archive_mtime>) if you care
about the modification times of the archived message files.

Caveat: I just typed this out. I'm *sure* it will work, but I've been
wrong before.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

