[Mailman-Developers] Re: [Mailman-Users] Problem with qrunner and too much incoming mail

J C Lawrence claw@kanga.nu
Fri, 03 Nov 2000 18:13:32 -0800


<<This was originally off-list, moved back on list with mutual permission>>

On Fri, 3 Nov 2000 17:01:01 -0800 
Marc MERLIN <marc@merlins.org> wrote:

> On Fri, Nov 03, 2000 at 04:56:33PM -0800, J C Lawrence wrote:

>> Yup, that's what I mean by pass-thru.  

> Ok, I wasn't sure.

No worries.

>> of having Exim do that a loooong time back but I don't know if
>> anything ever happened there (there are obvious timeout and
>> performance issues when delivering to slow/unresponsive sites
>> (think about behavioural changes mid-negotiation)).

> I see. So the message gets spooled as a backup and erased right
> afterwards if the current process was able to deliver it I
> suppose.

Quite.  Note that this leaves you with the same disk IO as not doing
pass-thru.  The only time pass-thru gains you anything is when
you can hand the message off very close to as fast as, or faster
than, you can receive it.  In many cases I could see this being a
gain given free queue runners and minimal queue runner negotiation
overhead or complexity.

>> To deal with the physical IO problems, the very large volume mail
>> server companies I've dealt with and talked to (eg my last
>> client, Critical Path) use solid state disks for their mail
>> queues (in CP's case under a modified QMail, soon to be replaced
>> by NPLex).  It's a bloody expensive solution to be sure, but it's
>> also a fast one.

> It can come down to that, but that's throwing money at hardware
> instead of fixing mailman's deficiencies.

True.  Mailman is the real bottleneck here.  A simple solution,
cheap both computationally and code-wise, would be to split the
queues per list.  Give each list a private outbound queue area
(this may be done already, I haven't looked at v2 in detail and am
short of time right now).  Then, rather than threading, simply have
per-list queue locks.  The qrunner then forks N children
(configurable) which proceed to attempt to empty each queue in
order, each one locking its queue as it gets it (such that the
others pass over it) etc.  Thus deliveries to the MTA are
parallelised.

It's a fairly lightweight code change (private queues and a
child-forking qrunner) but it skips all the overhead of threading and
fancy locking schemes.  Further, as the number of posts per list
queue is likely to be low, the time per iteration, and therefore the
lock period (for other posts to join a given list's queue), will be
small.
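
To make the idea concrete, here is a minimal sketch of a
child-forking qrunner with per-list queue locks.  It is not Mailman
code: the directory layout (one subdirectory per list under a queue
root), the ".lock" file name, the use of flock(), and the names
drain_queue, run_children and deliver are all assumptions made for
illustration.

```python
import fcntl
import os


def drain_queue(list_dir, deliver):
    """Empty one list's queue if nobody else holds its lock.

    Returns the number of entries delivered, or None if another
    process already holds the lock (the caller passes over it).
    """
    # Hypothetical lock file name; one lock per list queue.
    lock_fd = os.open(os.path.join(list_dir, ".lock"),
                      os.O_CREAT | os.O_RDWR)
    try:
        try:
            # Non-blocking: fail at once rather than wait for the holder.
            fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        count = 0
        for name in sorted(os.listdir(list_dir)):
            if name.startswith("."):
                continue  # skip the lock file itself
            path = os.path.join(list_dir, name)
            deliver(path)    # hand the message to the MTA
            os.unlink(path)  # delivered, so drop it from the queue
            count += 1
        return count
    finally:
        os.close(lock_fd)  # closing the descriptor releases the lock


def run_children(queue_root, n_children, deliver):
    """Fork n_children workers; each walks every list queue in order."""
    pids = []
    for _ in range(n_children):
        pid = os.fork()
        if pid == 0:
            # Child: try each queue, skipping any a sibling has locked.
            status = 0
            try:
                for name in sorted(os.listdir(queue_root)):
                    drain_queue(os.path.join(queue_root, name), deliver)
            except Exception:
                status = 1
            finally:
                os._exit(status)  # never fall back into the parent's code
        pids.append(pid)
    for pid in pids:
        os.waitpid(pid, 0)  # parent reaps every child
```

A call such as run_children("/path/to/qfiles", 4, deliver), with
deliver being whatever function hands one queue file to the MTA,
would then run four deliveries in parallel.  The lock is only held
for as long as it takes to walk one list's queue, which is the short
lock period argued for above.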

If Mailman v2 currently uses private queues (dunno) then your likely
problem is twofold: the thrash rate of walking 10K private queues
plus the MTA receipt problem.  This problem is mostly unique to the
very large number of lists you run.  A possibly better tack there
(given a reasonably high frequency qrunner polling rate) would
be to store the queue entries in hash trees (hash by list
or hash by post) and then go from there.  Size the hash for, say, N
buckets (where N is the total number of parallel qrunners you want).
This has the advantage of minimising disk and directory thrashing.
The disadvantage is that the code impact would be correspondingly
large.
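
As a rough sketch of the hash-by-list variant: each queue entry is
assigned to one of N buckets by hashing its list name, and each of
the N qrunners then only ever walks its own bucket.  The names
bucket_for and partition, and the choice of SHA-1 as the hash, are
assumptions for illustration only and not anything Mailman provides.

```python
import hashlib


def bucket_for(key, n_buckets):
    """Map a list name (or post id) to a bucket number in [0, n_buckets)."""
    # A digest is used rather than the built-in hash() so that every
    # qrunner process computes the same bucket for the same key.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_buckets


def partition(entries, n_buckets):
    """Split (listname, message) pairs into n_buckets work lists."""
    buckets = [[] for _ in range(n_buckets)]
    for listname, message in entries:
        buckets[bucket_for(listname, n_buckets)].append((listname, message))
    return buckets
```

Hashing by list keeps all of a list's posts in one bucket (and so
with one qrunner); hashing by post id instead would spread a busy
list across all the qrunners.  On disk, the bucket number would
simply name a subdirectory of the queue area.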

> On Fri, Nov 03, 2000 at 04:57:11PM -0800, J C Lawrence wrote:
>> > Yeah, that's just what I was saying here.  ...etc
>> 
>> BTW: Did you intend this to be off list?

> Yeah, it's not mailman related, it's exim stuff, thus off topic
> :-)

<nod>

This, conversely, should probably go back to the list.  Mind if I
cross it?

-- 
J C Lawrence                                 Home: claw@kanga.nu
---------(*)                               Other: coder@kanga.nu
http://www.kanga.nu/~claw/        Keys etc: finger claw@kanga.nu
--=| A man is as sane as he is dangerous to his environment |=--