[Mailman-Developers] Re: [Mailman-Users] Problem with qrunner and too much incoming mail
Chuq Von Rospach
chuqui at plaidworks.com
Sat Nov 4 19:29:15 CET 2000
>> Once a connection to the MX is established, bulkmail would then
>> just start delivering messages to it until the bin was emptied.
>> Any i/o blocks in any of the processes will allow async* to switch
>> to a different delivery channel. We may need to do some explicit
>> channel management to make sure some are not starved.
>Ouch. I really don't like this idea.
Neither do I. This is actually something that I've looked at long and
hard in my non-mailman server work. After a fair amount of work and
research, I finally came to the conclusion that you are MUCH better
off letting the MTA do the MTA's work, and letting the MLM do the MLM
work, and once you make the decision that the MLM has to *also*
become an MTA, you're going down a road you don't want to travel.
Sendmail, for instance, has many years experience optimizing delivery
as an MTA. It's a complex, nasty business with lots of subtleties. If
you're building a list manager, how much work would you need to do to
get a private delivery system that's as well tuned and efficient as
sendmail already is? Ditto all of the other MTAs.
I've built some prototype systems to test this. Even though (in
theory) you're adding a layer of delivery and other overhead, it's
very difficult to come even close to the performance a tuned MTA can
give you -- and you're writing a lot of code to do it.
One of the systems I've been investigating, for instance, would do
100% customized mail driven by a template document and pulling data
out of a database -- with a design parameter of up to 10 million
deliveries. The goal is at least 500K deliveries an hour, preferably
double that. Right now, on a system with a sendmail 8.9 base and a
non-optimized delivery tool, I'm doing 400-450K/hour. I expect to see
a nice addition when I move to sendmail 8.10.x in a week or so. This
is on a Sun E250, FWIW, with the sendmail queues living in a RAM
disk. Good-sized hardware, but not particularly big or fast hardware.
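
To put those figures side by side, here is a back-of-the-envelope calculation. It assumes each hourly rate is sustained for the whole run, which real deliveries will not quite manage; the function name is made up for illustration.

```python
def hours_needed(deliveries, per_hour):
    """Wall-clock hours to send `deliveries` at a sustained hourly rate."""
    return deliveries / per_hour


TOTAL = 10_000_000  # the 10 million delivery design parameter quoted above

# Current measured range, the stated goal, and double the goal.
for rate in (400_000, 450_000, 500_000, 1_000_000):
    print("%9d/hour: %.1f hours" % (rate, hours_needed(TOTAL, rate)))
```

Under that assumption, a full 10 million delivery run takes roughly 22 to 25 hours at the current rate, 20 hours at the 500K goal, and 10 hours at double the goal.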
Instead of reinventing the MTA wheel, I think we're much better off
coming up with an MTA -> MLM interface that's very flexible and
highly configurable (most especially in how to deliver and how much
to parallelize the infeed to the MLM), and then focus on how to tune
the MTA and MLM through documentation.
Splitting the inbound and outbound queue would be my first thing
here, and probably split bounces into a third queue. That's a pretty
quick, easy optimization that makes sure the end user sees fast
response without being bogged down by deliveries, and that's a huge
perception issue. Then focus on parallelizing the delivery from
mailman into the MTA, and make that configurable so each admin can
tune it to their system and needs.
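
A minimal sketch of the three-queue split, assuming one spool directory per queue. The directory names, the ".msg" suffix and the helper functions are all hypothetical; this is not Mailman's actual qfiles layout or API.

```python
import os
import uuid

QUEUES = ("inbound", "outbound", "bounce")  # hypothetical directory names


def make_queues(root):
    """Create one spool directory per queue under root."""
    for name in QUEUES:
        os.makedirs(os.path.join(root, name), exist_ok=True)


def enqueue(root, queue, payload):
    """Write payload (bytes) into the named queue and return its path.

    The entry is written under a temporary name and renamed into place,
    so a runner scanning the directory never sees a partial file.
    """
    if queue not in QUEUES:
        raise ValueError("unknown queue: %r" % queue)
    qdir = os.path.join(root, queue)
    name = uuid.uuid4().hex
    tmp = os.path.join(qdir, "." + name + ".tmp")
    final = os.path.join(qdir, name + ".msg")
    with open(tmp, "wb") as f:
        f.write(payload)
    os.rename(tmp, final)
    return final


def pending(root, queue):
    """List the complete entries waiting in one queue."""
    qdir = os.path.join(root, queue)
    return sorted(n for n in os.listdir(qdir) if n.endswith(".msg"))
```

With the queues separated like this, a runner working through a large outbound backlog never stands between an incoming post or a bounce and its own processing, which is the responsiveness point made above.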
>As discussed previously amongst Chuq, Nigel and I, the needs of
>large list server systems are rather different from the normal home
>hobbyist requirements, but are not completely alien. However, the
>needs of very large list installations (cf ListServ, Egroups, or
>SourceForge) are rather different yet again.
This is a basic reality -- things don't scale. Or worse, they scale
for a while, and then you need to switch paradigms. I found that one
out the hard way. If someone wants a lecture on how to scale mail
list servers infinitely, I'd be happy to explain how, since I've had
to develop an architecture to do so. The nice thing is, it can be
done without exceptional engineering hassles -- but it's not just
adding another daemon or a faster CPU (although those are solutions
for parts of it, just not ALWAYS the solutions).
> I'm not convinced of
>the value in beating on Mailman to support the (comparatively rare
>if high profile) very large installations when the current (much
>larger and more common Mailman-wise) mid-size realm still needs
>attention. Certainly, such changes should not detract from
>Mailman's current level of suitability for smaller installations.
I think we can build a Mailman that does this, at least for, oh, 95%
of the universe out there, and the other 5% are going to have custom
solutions anyway (or should!). What we don't want to do is screw up
Mailman for the "typical" user to make it work for the big site; but
we also don't want Mailman to get a reputation as a "small server
only" system, because it'll cause people to reject it in
implementations. Fortunately, I don't think you need to do that. It
just needs some tweaking.
>support for intermittently connected nodes. Say something like:
> Cron launches the bulkmailer.
> The bulkmailer forks N children processing the queue.
> The bulkmailer exits upon an empty queue.
> Should cron launch a new bulkmailer when the previous incarnation
> hasn't exited yet, the new instance merely exits immediately.
>Locking for the above is fairly simple. Standard IPCs can be used
>for the instance collision checks. Locking on the hash queues could
>be a bit interesting from a portability and performance vantage given
>the fact that the list side will be attempting to deliver into the
>same tree at the same time that deliveries are happening (no more
>lock collisions please) -- which pretty much requires that locking
>be on the queue-entry level rather than the hash bucket level. Not
>rocket science, just a bit finicky.
>Will this handle SourceForge? Probably.
On reasonable hardware, definitely. That's basically how my current
custom system works. Right now, the number of parallel infeeds from
mailman is 1. I'm willing to bet the delivery MTA is basically idle
and bored. By moving to parallel infeeds, you can stoke the MTA up to
speed, and the trick is for each site to figure out what number of
parallel infeeds keeps the queues full enough for the MTA to stay
busy without overloading them and causing the MTA to thrash.
That's simply a case of tuning and modelling. And simply allowing "N"
infeed threads to the MLM will solve SourceForge's problem and pretty
much everyone else's, without having to get into the MTA business,
where the best we can really hope is to be "as good" as the real MTA.
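
The cron-launched bulkmailer with "N" infeeds and queue-entry level locking can be sketched roughly as below. Several things here are assumptions rather than anything from Mailman: the instance check uses flock() on a lock file (the quoted proposal only says "standard IPCs"), workers are threads rather than forked children, an entry is claimed by renaming it, and deliver is a hypothetical callback standing in for the hand-off to the MTA. It needs a Unix system for fcntl.

```python
import fcntl
import os
from concurrent.futures import ThreadPoolExecutor


def acquire_instance_lock(path):
    """Return an open, locked file, or None if another instance holds it."""
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    return f


def claim(qdir, name):
    """Claim one queue entry by renaming it; only one claimant can win."""
    src = os.path.join(qdir, name)
    dst = src + ".work"
    try:
        os.rename(src, dst)
    except FileNotFoundError:
        return None  # another worker got there first
    return dst


def drain(qdir, deliver):
    """Claim and deliver entries until no unclaimed ones remain."""
    done = 0
    while True:
        names = [n for n in os.listdir(qdir) if n.endswith(".msg")]
        if not names:
            return done
        for name in names:
            path = claim(qdir, name)
            if path is None:
                continue
            with open(path, "rb") as f:
                deliver(f.read())  # hand the message to the MTA here
            os.remove(path)
            done += 1


def run_bulkmailer(qdir, lockfile, deliver, workers=4):
    """Run N workers over qdir; return None if another instance is running."""
    lock = acquire_instance_lock(lockfile)
    if lock is None:
        return None  # previous incarnation still busy: exit immediately
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(drain, qdir, deliver)
                       for _ in range(workers)]
            return sum(f.result() for f in futures)
    finally:
        lock.close()  # closing the file releases the flock
```

Because the claim is a rename of a single entry, the list side can keep dropping new files into the same directory while deliveries are in progress, and the workers value is the per-site tuning knob discussed above. The sketch does not recover ".work" files left behind by a crashed worker; a real runner would need a policy for those.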
So my recommendation is:
1) split the current qfiles into three queues: inbound, outbound and bounce
2) parallelize the outbound queue into "N" configurable delivery threads.
3) work on documentation on how to tune this for maximum performance
with major MTAs, and how to tune MTAs for maximum performance.
That's a set of pretty easy updates, no technological miracles or
black boxes, and solves all but the worst problems someone running
Mailman is likely to see. And for sites this *doesn't* solve, it's
either because they're doing the 5 pounds in a 1 pound bag thing, or
they probably need to start hiring people like us to custom build
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui at plaidworks.com)
Apple Mail List Gnome (mailto:chuq at apple.com)
Be just, and fear not.