[Mailman-Developers] Re: To VERP or not to VERP?
Chuq Von Rospach
chuqui@plaidworks.com
Sun, 17 Jun 2001 14:23:32 -0700
On Sunday, June 17, 2001, at 11:58 AM, Fil wrote:
> On my setup (postfix) there's heavy use of the /etc/postfix/transport
> feature, which redirects all domains with which I have a problem
That's what's known as the "bog" (a term I ripped off from Infobeat's
architecture): try to deliver it once, then exile it to the bog for
non-real-time delivery.
Infobeat's actually kinda nice, if you want to get really hard-core
about it. They deliver right out of the database into the client SMTP
port, effectively sucking up the outgoing MTA into the application
system. I've looked hard at doing that as well, but you start getting
into the grimy details of dealing with all the MX weirdness and the
like, and I've decided not to write an MTA right now, but instead to let
the MTA writers figure out how to best do that.
> problems (mostly the mailqueue becoming very heavy) -- maybe I should
> continue this way with more domains being handed off to slave
> machine(s). Is that the option you're describing?
Not really, but it's a good, legitimate option. When I was under
sendmail 8.9, I implemented special slow queues as bogs and moved things
there after 3 delivery tries. Under sendmail 8.11, I don't, because the
sub-directory stuff allowed me to do away with having to deal with that
from a performance standpoint.
What I'm designing is a system that will cause a daemon on a central
server to connect to a bunch of machines in a farm and use them to
simultaneously generate messages out into multiple SMTP queues -- you
effectively hand a remote machine (an SMTP smurf) a message template,
then feed it data as fast as it can send. You get a very compact
protocol between the two (since all you send over is the data to fill
the template), and each one is nothing but a mail-creation process
feeding an MTA on a dedicated machine. And the rest, as they say, is
simply tuning for maximum performance without thrashing.
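The template-plus-data idea above can be sketched in a few lines of
Python. This is just an illustration of the compact protocol, not my
production code; the template fields, addresses, and the expand()
helper are all hypothetical:

```python
from string import Template

# The controller sends the smurf one message template up front...
template = Template(
    "From: list@example.com\n"
    "To: $addr\n"
    "Subject: Your digest, $name\n"
    "\n"
    "Hi $name, here is your customized digest...\n"
)

# ...then streams only the per-recipient fill-in data, which is far
# smaller on the wire than resending the whole 40K body each time.
rows = [
    {"addr": "alice@example.org", "name": "Alice"},
    {"addr": "bob@example.net", "name": "Bob"},
]

def expand(row):
    # Each smurf is nothing but this expansion step feeding a local
    # MTA (e.g. handing the result to smtplib against localhost).
    return template.substitute(row)

for row in rows:
    msg = expand(row)
    # smtplib.SMTP("localhost").sendmail("list@example.com",
    #                                    [row["addr"]], msg)
```

The point is that the controller-to-smurf traffic is only the rows,
while the smurf's local MTA absorbs all the delivery weirdness.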
This isn't a discussion-list system, but the delivery setup could be
adapted to that fairly easily. If you were going to do something similar
on Mailman, you could do it by having a farm of SMTP outgoing machines
under a round-robin with a short time-out, and making sure qrunner
re-looks up the IP after every message. Do that, and then extend to
allow for parallel qrunners, and you can build a heck of a mail list
farm. 2.1 will do most of that fairly easily; the rest is figuring out
and configuring the SMTP smurf farm and its round-robin.
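The "re-look up the IP after every message" piece is simple to sketch:
resolve the pool name fresh for each delivery, so a short-TTL
round-robin record spreads the load across the smurfs. The pool name
here is a placeholder, not a real host:

```python
import socket

SMURF_POOL = "smtp-out.example.com"  # hypothetical round-robin name

def pick_smurf(pool=SMURF_POOL):
    # Re-resolve on every message; with a short DNS TTL the
    # round-robin record rotates, so successive deliveries land on
    # different smurfs instead of sticking to one cached address.
    infos = socket.getaddrinfo(pool, 25, proto=socket.IPPROTO_TCP)
    family, _, _, _, sockaddr = infos[0]
    return sockaddr[0]
```

qrunner would call pick_smurf() once per message before opening its
SMTP connection, rather than resolving once at startup.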
In my case, I need to be able to take 40K-length messages and build a
system that'll ship out ~1,000,000 addresses an hour, with full
customization (i.e. no piggybacking of addresses). Which might
explain why I did 3+ hours of math last night on the bandwidth question;
I needed it anyway, I'd been meaning to get to it, and as long as we
were talking about it, it was a great excuse to build a model for my
'real' work. My current system only does ~350-400,000 an hour with
limited piggybacking of addresses, and doesn't do the full
customization. But it also depends on a monolithic machine architecture,
which simply can't scale infinitely. Right now, my big bottleneck on
that machine is that my 100BaseT is full. I'm going to be bringing up a
quad ethernet soon, but beyond that, unless you start talking about fiber
and/or trunking solutions, you're done. So rather than trying to eke out
another 2% at ever greater engineering cost, we're moving to a smurf
model, because you can buy a bunch of small, fast machines for the cost
of a big honker with gigabit ethernet and a trunking system.
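A quick sanity check on why that 100BaseT pipe is the wall, using the
numbers above (assuming "40K" means 40 KiB of payload and ignoring
SMTP/TCP overhead):

```python
# Back-of-the-envelope bandwidth for full-customization delivery:
# ~1,000,000 messages/hour at ~40 KiB each.
msgs_per_hour = 1_000_000
bytes_per_msg = 40 * 1024

bytes_per_sec = msgs_per_hour * bytes_per_msg / 3600
mbits_per_sec = bytes_per_sec * 8 / 1_000_000

print(round(mbits_per_sec, 1))  # ~91 Mbit/s -- essentially a full
                                # 100BaseT link, before protocol overhead
```

So the target rate alone saturates a single 100 Mbit interface, which
is exactly why a farm of smaller machines beats one big honker here.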
The downside is you add administrative complexity, plus you need to
engineer getting the data where it needs to be (and the security of
making sure nobody else uses your smurfs...). But those seem to be
manageable. My current design includes three flavors of smurfs
(smtp-out, smtp-in, and web), and I'm actually working on whether it
makes sense to take over a delegated sub-domain and run my own DNS, so I
can dynamically move the smurfs into whatever function I need -- it'd
make a LOT of sense, from machine-usage terms, to be able to dedicate
all but one or two smurfs to SMTP-out during the early delivery, and
then start shunting them off one at a time into SMTP-in role or web-role
to handle returned bounces, postmaster mail and user unsubscription
requests. And once the bulge is done, take them offline and turn them
into bounce-processing smurfs.... I could do more with fewer machines,
but my office would end up looking like the Johnson Space Center at
launch time... (grin).
--
Chuq Von Rospach, Internet Gnome <http://www.chuqui.com>
[<chuqui@plaidworks.com> = <me@chuqui.com> = <chuq@apple.com>]
Yes, yes, I've finally finished my home page. Lucky you.
USENET is a lot better after two or three eggnogs. We shouldn't allow
anyone on the net without a bottle of brandy. (chuq von rospach, 1992)