[Mailman-Users] Load-balancing mailman between two servers

Wed Nov 29 00:30:22 CET 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Nov 28, 2006, at 6:08 PM, Guy Waugh wrote:

> I'm still wondering whether I should be NFS-sharing the qfiles
> directory. I haven't delved into the Mailman source code to try to
> figure this out, but...

You should be able to NFS share the qfiles directory, but you want to  
be careful about how you set up your qrunners.  However, this  
probably won't help you with what I think you really want to do  
(IIUC), which is load balance the web interface.

First, pending messages are not kept in qfiles -- that's only for  
messages that are being processed by the mail delivery subsystem.  A  
message that's waiting for moderation will get dequeued until it's  
approved, at which point it will be re-queued into the appropriate  
qfile directory.

Access to the "databases" which manage these pending files is  
protected by Mailman's lockfile implementation, which has had a long  
stable history and a high probability of being NFS-safe, modulo bugs  
in specific NFS implementations of course.  So as long as your web  
requests can be completed within the lock timeouts, you should be  
able to load-balance admindb management across multiple web servers.   
Of course, while one server is accessing a list, no other processing  
for that list will occur on any other machine, as those other  
machines wait for the first machine's list lock to be released.   
However, processing involving other lists can still occur, as can  
outgoing mail delivery, which does not need to acquire a list lock.
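
To make the locking idea concrete, here is a minimal sketch of the classic hard-link technique for taking a lock on a shared (possibly NFS) filesystem. It illustrates the general approach and is not Mailman's own lockfile code. The names `acquire`, `release`, `owner`, `timeout` and `poll` are made up for this example, and stale-lock handling (lock lifetimes, breaking dead locks) is deliberately left out.

```python
import os
import socket
import time


def acquire(lockfile, owner=None, timeout=5.0, poll=0.1):
    """Try to take `lockfile`; return the claim-file path, or None on timeout."""
    # Assumption: host name plus pid is unique enough to identify a claimant.
    if owner is None:
        owner = "%s.%d" % (socket.gethostname(), os.getpid())
    claim = "%s.%s" % (lockfile, owner)
    with open(claim, "w") as fp:
        fp.write(owner)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # link() either creates the lock name or fails; it never overwrites.
            os.link(claim, lockfile)
        except FileExistsError:
            pass
        # Over NFS the reply to link() can be lost, so check the link count
        # of our own claim file instead of trusting the call's result.
        if os.stat(claim).st_nlink == 2:
            return claim
        if time.monotonic() >= deadline:
            os.unlink(claim)
            return None
        time.sleep(poll)


def release(lockfile, claim):
    """Drop the lock name first, then our private claim file."""
    os.unlink(lockfile)
    os.unlink(claim)
```

A web request that cannot get the list lock before `timeout` expires simply gives up here, which mirrors the point above: requests have to complete within the lock timeouts for this to work across several web servers.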

The story with qfiles is this: every qfile lives in its own little  
slice of sha1 hash space and each hash slice is (supposed to be)  
owned by exactly one qrunner process.  This allows the qrunner to  
process the messages in its hash slice without having to deal with  
pesky locks, which slow things down as contention is serialized (a  
good thing when dealing with databases, a bad thing when you're  
trying to churn out a stream of messages).  Thus, if you're looking  
to load balance qfile directory processing, you can still do that if  
you assign each qrunner process on each machine a unique slice of the  
hash space -- it must be unique across all machines.  IOW, machine 1  
could handle the odd slices of qfiles/in while machine 2 could handle  
the even slices.  Or you could have 8 qrunners on each machine and  
slice up qfiles/in 16 ways (the implementation requires the number  
of hash slices to be a power of 2).
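
The slicing scheme can be sketched as follows. This is an illustration of the idea, assuming each queue file is identified by a SHA-1 hex digest and that the 160-bit hash space is cut into equal contiguous ranges; it is not a copy of Mailman's code. `slice_bounds`, `owning_slice` and the `ASSIGNMENT` table are hypothetical names for this example.

```python
import hashlib

SHAMAX = (1 << 160) - 1  # largest possible SHA-1 digest value


def slice_bounds(slice_no, numslices):
    """Return the inclusive (lower, upper) digest range for one slice."""
    if numslices < 1 or numslices & (numslices - 1):
        raise ValueError("numslices must be a power of 2")
    lower = ((SHAMAX + 1) * slice_no) // numslices
    upper = ((SHAMAX + 1) * (slice_no + 1)) // numslices - 1
    return lower, upper


def owning_slice(digest_hex, numslices):
    """Return the number of the slice whose range contains this digest."""
    value = int(digest_hex, 16)
    for n in range(numslices):
        lower, upper = slice_bounds(n, numslices)
        if lower <= value <= upper:
            return n


# Hypothetical layout: 4 slices, odd ones on machine 1, even ones on machine 2.
# Every slice number must appear exactly once across all machines.
ASSIGNMENT = {"machine1": [1, 3], "machine2": [0, 2]}

digest = hashlib.sha1(b"example message").hexdigest()
slice_no = owning_slice(digest, 4)
owner = [host for host, slices in ASSIGNMENT.items() if slice_no in slices]
```

Because the ranges never overlap, `owner` always holds exactly one machine, which is why no locking is needed between qrunners. In Mailman 2.1 the bin/qrunner script takes a `--runner=NAME:slice:range` argument for this kind of setup; check its `--help` output on your version before relying on that.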

Of course, if machine 1 went down, all the messages in its hash  
slices would sit unprocessed, but it would be a fairly simple matter  
to reconfigure machine 2 to handle machine 1's slices, or to bring up  
a fallback machine to handle those slices in the meantime.
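
The failover step amounts to handing the dead machine's slice numbers to another host. A small sketch, assuming the slice layout is kept as a plain mapping of host name to slice numbers (the `take_over` helper and the host names are hypothetical):

```python
def take_over(assignment, failed, survivor):
    """Return a new assignment with `failed`'s slices added to `survivor`."""
    updated = {
        host: sorted(slices)
        for host, slices in assignment.items()
        if host != failed
    }
    updated[survivor] = sorted(updated[survivor] + list(assignment[failed]))
    return updated


before = {"machine1": [1, 3], "machine2": [0, 2]}
after = take_over(before, failed="machine1", survivor="machine2")
```

The qrunners for the inherited slices would then have to be started on the surviving machine, and stopped again before machine 1 comes back, so that no slice ever has two owners.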

That's the intent anyway <wink>.  I hope this makes sense and helps  
you better plan your operational environment.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRWzGlXEjvBPtnXfVAQLBkQP9FWfWoEo7AoTkXdvpoj5pdeX+OWMbJ8kX
n7oTthTmkULjmtqMjhKL0XT7wdy/5iYNaFRCJrCq2YYmwQBok4VyBZA0vQ/aHJKN
9RN6lxWQKIzBvm7nBRgIdGq4gw9THRCbjg2H9HpJjy5KunLbdE1Zi6MVzH5ag05J
VncWRKYCCPU=
=mejc
-----END PGP SIGNATURE-----