[Mailman-Developers] Huge lists
Chuq Von Rospach
chuqui@plaidworks.com
Thu, 25 May 2000 11:05:57 -0700
At 10:32 AM +0100 5/25/2000, Nigel Metheringham wrote:
>Wietse had some figures on MTA performance analysis which he used as
>part of the design process for Postfix. He concluded that disk I/O was
>*the* limiting factor for an MTA
That matches my experience. I've had good results using RAM disks to
minimize this, which is sort of cheating, but worth it. And many of
the structural changes in Sendmail 8.10 are aimed at reducing this
impact, so they've finally figured this out, too.
>If we have a million user list... and a message of a few K, I'm not
>sure I want to have a few GB of queue space taken up.
That's one reason in favor of having the MLM monitor MTA backlogs and
throttle itself. On my systems, I try to tune things so that I give
the MTA enough to chew on and get up to speed, but not so much that
it starts thrashing trying to deal with queue overhead issues. I'm
hoping sendmail 8.10 and, down the road, postfix will let me stop
worrying about this (or worry about it less...). But for large lists,
it's another issue.
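
A minimal sketch of the self-throttling idea described above, assuming the MLM can ask the MTA for its current queue depth. The names `next_batch_size`, `deliver`, `get_queue_depth`, and `hand_off`, and the water-mark numbers, are hypothetical and not part of Mailman or any MTA; real thresholds would have to be tuned per system.

```python
import time


def next_batch_size(queue_depth, low_water=500, high_water=5000, max_batch=1000):
    """Return how many messages to hand the MTA, given its current backlog."""
    if queue_depth >= high_water:
        return 0              # MTA is saturated: hand it nothing this round
    if queue_depth <= low_water:
        return max_batch      # MTA is nearly idle: give it a full batch
    # In between, shrink the batch linearly as the backlog grows.
    room = high_water - queue_depth
    span = high_water - low_water
    return max(1, max_batch * room // span)


def deliver(recipients, get_queue_depth, hand_off, pause=30.0, sleep=time.sleep):
    """Feed recipients to the MTA in batches, backing off while it is busy.

    get_queue_depth and hand_off are caller-supplied callables (assumptions):
    one reports the MTA backlog, the other submits a batch of addresses.
    """
    pending = list(recipients)
    while pending:
        n = next_batch_size(get_queue_depth())
        if n == 0:
            sleep(pause)      # wait for the MTA to drain before asking again
            continue
        batch, pending = pending[:n], pending[n:]
        hand_off(batch)
```

The two water marks are the "enough to chew on" and "starts thrashing" points; everything between them is a judgment call.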
>Having said that I *really* would like the possibility of the
>occasional message (maybe even just the password reminders.. although
>I'd prefer a method where some messages if the list was in a state
>where it has recently seen bounces that it cannot tie to a particular
>subscriber) be sent out using VERP.
It depends on the type of list. If you're a busy list, VERPing every
message could well be overkill, but again, it gets back to being able
to do other things as well, like pre-loading addresses into the
unsubscribe URL. There are some nice user-interface improvements you
can make once you have VERP to make the whole user experience much
less painful...
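
As a rough illustration of what VERP buys you, here is a sketch that encodes the recipient into the envelope sender, recovers it from a bounce, and pre-loads it into an unsubscribe link. The `list-bounces+user=domain@host` layout is one common VERP convention, used here as an assumption; the function names, the `address` query parameter, and the example URL are hypothetical.

```python
from urllib.parse import urlencode


def verp_sender(list_name, list_host, recipient):
    """Build a per-recipient envelope sender for one list member."""
    local, _, domain = recipient.rpartition("@")
    return f"{list_name}-bounces+{local}={domain}@{list_host}"


def verp_recipient(bounce_address, list_name):
    """Recover the subscriber address from a VERP bounce address, or None."""
    local, _, _host = bounce_address.rpartition("@")
    prefix = f"{list_name}-bounces+"
    if not local.startswith(prefix):
        return None
    # The domain cannot contain '=', so split on the last one.
    user, sep, domain = local[len(prefix):].rpartition("=")
    if not sep or not user or not domain:
        return None
    return f"{user}@{domain}"


def unsubscribe_url(base_url, recipient):
    """Pre-load the subscriber's address into an unsubscribe link."""
    return f"{base_url}?{urlencode({'address': recipient})}"


sender = verp_sender("mylist", "lists.example.org", "jane@example.com")
print(sender)
print(verp_recipient(sender, "mylist"))
print(unsubscribe_url("https://lists.example.org/unsubscribe", "jane@example.com"))
```

Because each copy of the message is already unique per recipient, the personalized link costs nothing extra once VERP is in place.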
But if you're an announce list that only comes out occasionally,
sending out a monthly "here's your password" update on a list that
averages 2 messages a month seems to be the wrong thing, at least to
me, because the hassle factor of the noise generated by the
administrative postings starts to overwhelm the content. Your users
won't like that (this brings up a sub-discussion, that of the
"monthly reminder message", but we won't go there now... )
Again, I think we have to remember that there are lots of different
USES for mail lists, and different usage forms for those types. How
you handle a twice-a-month enewsletter is going to be much different
than a 40 message a day discussion list. So there's no single right
answer, and configurable options to support the different flavors are
a really Good Thing....
> However then we also need to
>recode the MTA incoming handling to take that - aliases don't cut it
>any more.
I was thinking last night that what would REALLY, REALLY be useful
here is an extended SMTP protocol that allows the VERPing to be
introduced by the receiving SMTP server, rather than the delivery
server or MLM. And after thinking about it, I went and lay down in a
dark room until I got over it... (snicker). But if you think about
it, the downside to VERP is you lose the efficiency of batching
multiple addresses into a single transaction, so the solution is to
extend SMTP to allow us to maintain that efficiency while building in
the VERPing data at time of delivery...
And if you think that's feasible, you need to lie down in a dark
room... But it's an intriguing thought....
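
To put a rough number on the batching cost mentioned above, this sketch compares how many SMTP transactions a delivery needs with and without VERP. It assumes recipients are batched per destination domain with a cap on recipients per transaction; the function names and the cap of 100 are illustrative assumptions, not measurements from any MTA.

```python
from collections import defaultdict


def transactions_batched(recipients, max_rcpts=100):
    """Count transactions when addresses at the same domain share one."""
    per_domain = defaultdict(int)
    for addr in recipients:
        per_domain[addr.rpartition("@")[2].lower()] += 1
    # Ceiling division: each domain needs ceil(n / max_rcpts) transactions.
    return sum(-(-n // max_rcpts) for n in per_domain.values())


def transactions_verp(recipients):
    """With VERP every recipient has a unique envelope sender: one each."""
    return len(recipients)


members = [f"u{i}@a.example" for i in range(250)] + \
          [f"u{i}@b.example" for i in range(10)]
print(transactions_batched(members), transactions_verp(members))
```

The gap between the two counts is the efficiency a receiver-side VERP extension would be trying to win back.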
>Counter examples are always problems.... The biggest UK ISP group
>(several "virtual" ISPs use the same bulk ISP service set) has a few
>million users each of whom have their own domain name - so you will
>find that *.freeserve.co.uk (around 2 million domains) all goes to the
>same batch of MXes.
Yeah. And until I realized that you need to worry about domain names
out to the fourth sub-domain, it was driving my database stuff crazy,
because lookups sludged out badly. Here in the states, you get used
to worrying about 2nd level domains, and maybe third, but when your
audience internationalizes, the rules change... I now track domain
name uniqueness out to the fourth part, just to handle places like
freeserve cleanly. Otherwise, all heck breaks loose.
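
A small sketch of tracking domain uniqueness out to the fourth part, assuming the key is simply the last four dot-separated labels of the address's domain. `domain_key` and its `depth` parameter are hypothetical names for illustration, not the actual database code described above.

```python
def domain_key(address, depth=4):
    """Return the last `depth` labels of the address's domain, lowercased."""
    domain = address.rpartition("@")[2].lower().rstrip(".")
    labels = domain.split(".")
    return ".".join(labels[-depth:])


# At depth 2 every freeserve customer collapses into "co.uk";
# at depth 4 each customer's own domain stays distinct.
print(domain_key("bob@bobsplace.freeserve.co.uk", depth=2))
print(domain_key("bob@bobsplace.freeserve.co.uk"))
print(domain_key("amy@amysplace.freeserve.co.uk"))
```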
>[On per-MTA documentation]
>Let's start bullying^Wpersuading people to contribute some documentation
>on this stuff or pointers to existing MTA documentation that addresses
>this.
As I find stuff out, I'll definitely make it available, and should be
able to at least help collate.
>Big lists are a different issue - you need to *choose* your MTA and
>hardware within your constraints for that. Tuning is probably a
>consultancy job for those.
I wouldn't worry too much about big stuff, either. And to be honest,
once I catch up on some other stuff, I'll be setting up a mail list
and other resources for big list admins.
>That particular one you mention should be blocked from the net -
>presumably their upstream is clueless too.
Yup. They effectively ARE blocked from all of my sites. One is in
Italy, for instance, although another that I've had problems with is
a school site down in Scottsdale. The problem seems to be sites that
are running really downrev versions of things that nobody's watching
or upgrading.
--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)
And they sit at the bar and put bread in my jar
and say 'Man, what are you doing here?'"