[Mailman-Developers] A thought on SMTPHOST

Mon, 04 Mar 2002 10:09:24 -0800

Just wanted to bluesky this a bit, and make a suggestion to Barry.

I'm starting to see stress on my E250, mostly due to sheer volume (14
million pieces of email last month, more or less) and disk I/O issues (only
2 SCSI drives in the beast). I've pretty much maxed out capacity on that
thing unless I wanted to stuff in more drives, and the cost of a good, fast
hard drive for a Sun box and the cost (to me) for a good, fast MacOS X box
aren't much different... And for some reason, my bosses like the idea of
MacOS X boxes. 

So I'm getting ready to start what's known as my "smurf army" in my design
documents: a farm of small, inexpensive, fast machines set up specifically
to deliver mail as quickly as possible. (The first three are Ultra 5s,
mostly because another project went away and we got them for free.)

And I've been looking at the best way to interface all of these with the
delivery boxes. With Mailman, that means SMTPHOST.

The way SMTPHOST is set up, implementing a smurf army would require setting
up a round robin DNS of the various hosts. That'll work, but... If one of
the boxes goes down for some reason, some percentage of Mailman deliveries
would fail when they hit that part of the round robin, unless I tweak the
round robin constantly. That would require setting up a DNS box among the
smurfs to delegate the round robin to instead of the corporate DNS, which is
not a good idea on any number of levels for us...

Which got me thinking. You could put the smurf army behind a load balancing
box (mac.com's SMTP is that way), but... I don't want to spend the money on
that, administer that, or do it that way.

I looked at what it'd take to hack multiple SMTP host support into Mailman.
Not a huge deal, actually, but then unless Barry buys the hack back, I'm
forked. I'd rather not, although it'd be a nice setup for larger
installations.

So... I suddenly realized we have a process to handle this. MX records.

What if Mailman were made MX aware?

You could then set up SMTPHOST to point to a DNS name. If that DNS name has
an MX definition for it, Mailman would use that instead of the host's IP
address. That would make this backwards compatible, since under almost all
circumstances the MX record, if it exists, will be pointing to the machine
itself. (If it points elsewhere but you're using the name as an SMTP host,
what in the heck are you doing? I can only think of one scenario where that
makes sense, a dedicated incoming SMTP beastie, and if you're that large, you
probably want this hack.)
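
To make the backwards-compatibility point concrete, here's a rough sketch of
the lookup rule: use the MX records if there are any, otherwise fall back to
the SMTPHOST name itself. Everything here is hypothetical, not existing
Mailman code. In particular, lookup_mx is a stand-in for whatever DNS
resolver would actually be used (Python's standard socket module doesn't do
MX lookups), and it's assumed to return a list of (preference, hostname)
pairs.

```python
def resolve_targets(smtphost, lookup_mx):
    """Return (preference, hostname) pairs to try for SMTPHOST.

    lookup_mx is a hypothetical callable supplied by the caller; it takes a
    DNS name and returns a list of (preference, hostname) tuples, or an
    empty list if the name has no MX records.
    """
    records = lookup_mx(smtphost)
    if not records:
        # No MX records: behave exactly as today and use the host itself.
        return [(0, smtphost)]
    # Lowest preference value first, as with normal MX handling.
    return sorted(records)
```

With no MX records defined, this hands back just the original host, so an
existing single-host setup wouldn't notice the difference.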

This would allow you to define your "smurf army" any way you want, as big as
you want, and it's self-healing to failures. Imagine the following scenario:

A record: smtp-out.lists.apple.com
MX  5 smurf1.lists.apple.com
MX  5 smurf2.lists.apple.com
MX  5 smurf3.lists.apple.com
MX  5 smurf4.lists.apple.com
MX 20 lists.apple.com
MX 30 applenews.lists.apple.com

As a forinstance. When qrunner starts up, it grabs the MX data and stores
it. Every time it has to deliver something, it picks one of the MX hosts at
preference 5 and delivers to it. If that delivery fails, it grabs another 5.
If all four fail, it backs off to the 20, then the 30. Only if all six boxes
fail does qrunner drop out and retry later.
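
The selection order described above might look something like this. It's
only a sketch under my own assumptions, not anything in Mailman today: the
function names are made up, the MX data is assumed to be a list of
(preference, hostname) pairs already fetched at startup, and try_deliver is a
hypothetical callable that attempts one delivery to one host and returns
True on success.

```python
import random
from itertools import groupby


def delivery_order(mx_records, rng=random):
    """Order hosts by ascending preference, shuffled within each tier."""
    order = []
    # Sort so equal preferences are adjacent, then walk tier by tier.
    for _pref, group in groupby(sorted(mx_records), key=lambda rec: rec[0]):
        tier = [host for _p, host in group]
        rng.shuffle(tier)  # spread load across hosts of equal preference
        order.extend(tier)
    return order


def deliver_with_failover(mx_records, try_deliver, rng=random):
    """Try each host in order; return the one that accepted, else None.

    try_deliver is a hypothetical callable: host -> True on success.
    A None result means every host failed and the caller should requeue.
    """
    for host in delivery_order(mx_records, rng):
        if try_deliver(host):
            return host
    return None
```

Shuffling within a tier on every delivery is one way to get the "picks one of
the 5s" behavior; a rotating pointer would work too, and would spread load
more evenly at the cost of keeping state in qrunner.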

This allows each smurf to load-balance itself. If it gets busy, it stops
accepting mail and lets someone else handle it.

You could hack this kind of configuration into Mailman itself, but one
advantage of the MX setup is if you run multiple servers feeding into that
farm, configuration is in one place for the farm.

It doesn't seem like supporting MX (perhaps as a configurable option?) would
be difficult, but I haven't looked into the Python library code yet. Barry,
what do you think about this?

-- 
Chuq Von Rospach, Architech
chuqui@plaidworks.com -- http://www.chuqui.com/

Someday, we'll look back on this, laugh
nervously and change the subject.