[Mailman-Developers] Huge lists
Chuq Von Rospach
Wed, 24 May 2000 20:47:21 -0700
At 6:38 PM -0700 5/24/2000, J C Lawrence wrote:
>True. My curiosity however is what MTAs do MX sorting, and more
>particularly, MX collapsing (eg for two different targets that share
>an MX among their lowest level). The potential gains there are
>likely not huge, but could be (guesstimate) noticeable for high
>volume servers with broad standard deviations in their target lists.
>I'll have to check into that some time.
But, as the experts say, the first $500 buys you 90% of the stereo
response, and the rest of the money goes into getting you as close to
100% as you can get. MX sorting is definitely far up in that last
10%: expensive in both computation and time, and lots of other stuff
can be done first, with more gain and less effort. For most lists,
the differential in performance between domain sorting and MX sorting
is probably not statistically meaningful.
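To make the distinction concrete, here is a minimal Python sketch of the two approaches, not anything taken from Mailman or from an MTA. Domain sorting needs only string handling. MX collapsing needs a DNS answer per domain, which is where the extra cost comes from. The function names are made up for illustration, and `lookup_best_mx` is a hypothetical callable (domain in, preferred MX host name out) that the caller would have to supply. It also simplifies the idea to "same best MX" rather than comparing full MX sets.

```python
from collections import defaultdict


def group_by_domain(recipients):
    """Group addresses by lowercased domain (plain domain sorting)."""
    groups = defaultdict(list)
    for addr in recipients:
        _local, _, domain = addr.rpartition("@")
        groups[domain.lower()].append(addr)
    return dict(groups)


def collapse_by_mx(domain_groups, lookup_best_mx):
    """Merge domain groups whose preferred MX host is the same.

    lookup_best_mx is a hypothetical callable: domain -> MX host name.
    A real one would perform a DNS query for every domain.
    """
    merged = defaultdict(list)
    for domain, addrs in domain_groups.items():
        merged[lookup_best_mx(domain).lower()].extend(addrs)
    return dict(merged)
```

With a stubbed lookup, two domains that point at the same MX end up in one batch, while domain sorting alone would keep them apart.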
Maybe one thing we need is a definition of what Mailman is and what
it isn't. Some kind of target for the size of lists it wants to
reasonably support. If it's 5,000 users, it doesn't matter what you
do. If it's 50,000, or 500,000, you definitely have different
So defining what mailman wants to solve can help us clear these
things up. "Every list in the universe" is a laudable goal, but it'll
probably delay shipping 2.0 for a decade or so... So I'd like to
suggest some performance goals be defined, and then program to those,
so we're all on the same page.
(being able to handle a moderately busy 25,000 user list, say 15-30
messages a day, would probably cover 95% of the mailing lists in the
universe, and is still technologically well within reach... It'd be nice
to be able to say "5 million subscribers in 2 minutes!", but focus on
a solid "do most things for most folks" now, and add the high
performance/huge list support in 2.5. But leave the hooks in, so we
don't have to rewrite later....)
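As a rough back-of-the-envelope check on what that target means in deliveries, the arithmetic can be sketched in Python. The only inputs are the figures suggested above (25,000 subscribers, 15-30 messages a day). The function names are made up for illustration, and the per-second figure is a flat daily average that ignores bursts, retries and digests.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_deliveries(subscribers, messages_per_day):
    """One delivery per subscriber per posted message."""
    return subscribers * messages_per_day


def average_per_second(deliveries_per_day):
    """Flat average rate; real traffic is bursty, so peaks will be higher."""
    return deliveries_per_day / SECONDS_PER_DAY


low = daily_deliveries(25_000, 15)   # 375,000 deliveries a day
high = daily_deliveries(25_000, 30)  # 750,000 deliveries a day
```

Averaged over a day that is roughly 4 to 9 deliveries a second, though what actually matters is how fast each individual posting drains.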
> > Definitely. Since most of the "performance" issues involve the
>> MTA, and the MLM only affects it based on how it stuffs things
>> into the MTA.
>There gets to be a point however where it really exceeds Mailman's
True, but a one page README.<mlm> page in the distro for each
reasonably supported MLM isn't a bad thing, and better than what
anyone has. Because one reality is that most MLMs are configured
(especially out of the box) to manage incoming mail, and efficient
handling of outgoing mail is very different. Some hints on dealing
with those optimizations and tradeoffs can't hurt, and wouldn't have
to be significant or huge efforts.
> Mailman is a list server, not a training course on how to
>build and configure a high volume mail system. While I don't think
>we've crossed or even approached that line, in general I'd rather
>spend time on Mailman than high end server considerations which are
>adequately (?) documented elsewhere.
I tend to agree -- but performance of mailman is inextricably tied to
performance and interface with the MTA. If you ignore the MTA, your
chances of making mailman work well are very small, and users will
tend to blame mailman, because "sendmail worked fine before we
installed mailman, so...."
> > Right now, I generally recommend sites doing a lot of mail-list
>I generally recommend heartily against Sendmail for such sites. I
>just don't see it as worth the extra effort (or obscurity) when
>newer MTAs such as Exim (wot I use currently), QMail or Postfix in
>general offer the same or better performance and configurability
>with the added benefit of human readable/auditable config files.
>While it's a cheap logic, it's easy to note that none of the very high
>volume commercial email sites out there are based on Sendmail
>(Critical Path, Hotmail, Onelist, EGroups, etc).
Valid points. But sendmail is a default-install in many
installations, and so it's going to be what's available. So helping
people figure out how to best make use of it is important; ignoring
it would be sort of like refusing to let AOL users on a list. Yes,
some AOL users can be problems, but AOL users also tend to be a huge
part of an audience (on my machines, 15% isn't uncommon).
Postfix looks like a *real* win, but until I run it through its
paces, I won't use it. But the people I know who do use it love it.
And I've got other fish to fry before moving to postfix (and right
now, I'm doing 400,000-500,000 an hour out of my mail system without
trying too hard, using sendmail 8.9.3, with peaks approaching 900K.
So eking out more performance by swapping MTAs is not a priority).
> > As someone who deals with email for a living...
>I should probably note at this point that I'm working for Critical
>Path on their mail systems.
As long as we're into disclosure, I run a bunch of hobby lists at
plaidworks.com, but I also do most of the mail list stuff at Apple,
where there's a combination of off the shelf (or actually, heavily
hacked) majordomo and custom jobs, so my lists range from really tiny
(10-12) to very, very large. The large system is custom coded, with
the exception of the last remnant, which is bulk_mailer. I've
completely replaced everything else, and bulk_mailer's replacement is
going into test as soon as I finish it (and it'll fully VERP;
although I had a bit of a scare last week when I was doing some
throughput estimates and got some zeros wrong, and thought for a
while that my total delivery was going to range into the terabytes. I
was wrong, thank ghu, and it's merely in the range of 40-60 gigabytes
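For readers who haven't met VERP (variable envelope return path): each copy goes out with an envelope sender that encodes the recipient, so a bounce identifies the subscriber no matter how mangled the bounce body is. It also means one envelope per recipient, which is why delivery volume becomes a concern. Below is a minimal Python sketch of the idea. The `+` and `=` separators are one common convention and are assumed here, the addresses are placeholders, and this is not the replacement code described above.

```python
def verp_encode(list_bounce_addr, recipient):
    """Build a per-recipient envelope sender.

    Example shape: list-bounces+user=example.com@lists.example.org
    (the '+' and '=' separators are an assumed convention).
    """
    local, _, domain = list_bounce_addr.rpartition("@")
    rcpt_local, _, rcpt_domain = recipient.rpartition("@")
    return f"{local}+{rcpt_local}={rcpt_domain}@{domain}"


def verp_decode(envelope_sender):
    """Recover the subscriber address from a VERP sender, or None."""
    local, _, _domain = envelope_sender.rpartition("@")
    _prefix, plus, encoded = local.partition("+")
    if not plus or "=" not in encoded:
        return None  # not a VERP-style address
    rcpt_local, _, rcpt_domain = encoded.rpartition("=")
    return f"{rcpt_local}@{rcpt_domain}"
```

A bounce handler would call `verp_decode` on the address the bounce was delivered to and never needs to parse the bounce text itself.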
>Sorry, entirely different orders of magnitude there. Notes is bad,
>certainly, and there are few things even close to being as bad as Notes
>or CC Mail (tho they've gotten a lot better in recent years (which
>isn't saying much)), but Exchange/Outlook make them look positively
>angelic in comparison.
Notes is obnoxious, especially since return-receipt is an
administrator controlled option, and not smart enough to NOT r-r
mailing lists (or anything else), and I've found Notes administrators
about as obnoxious as their software when you point things like that
out. The only word I can use for Exchange is brutal. There are
Exchange sites out there whose idea of a bounce message is to return
the mail to the "to:" line with only the Message-ID changed. You can
imagine how much fun THAT is.
Those sites (fortunately rare, all broken, but at least two of them
have been broken that way for four bloody years) my site simply
Chuq Von Rospach - Plaidworks Consulting (mailto:email@example.com)
Apple Mail List Gnome (mailto:firstname.lastname@example.org)
And they sit at the bar and put bread in my jar
and say 'Man, what are you doing here?'"