Re: Efficient final message disposition (was Re: [Mailman-Developers] Requirements for a new archiver)
On Thu, 30 Oct 2003 18:20:56 +0100 Brad Knowles <brad.knowles@skynet.be> wrote:
At 8:41 AM -0800 2003/10/30, Chuq Von Rospach wrote:
One of them is recipient sorting by average delivery time over the past week (probably want a decaying geometric mean), which would require tracking log data on a per-recipient basis.
While I don't disagree, this is really an MTA's job, not Mailman's. This is why I've been doing log analysis of MXes and routing mail to customised outbound MTAs on the basis of responsiveness, since early 2000. Adaptive MX routing is great stuff.
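For readers who want to see what per-recipient tracking with a decaying geometric mean might look like, here is a minimal sketch. It is not code from Mailman or any MTA; the class name, the `alpha` smoothing weight, and the 30-second default for unknown recipients are all assumptions made for illustration. The geometric mean is kept as an exponentially weighted average of log delivery times.

```python
import math


class DecayingGeoMean:
    """Exponentially decaying geometric mean of positive samples (hypothetical helper)."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha      # weight given to the newest sample (assumed value)
        self.log_mean = None    # running weighted mean of log(sample)

    def update(self, seconds):
        # Clamp to a small positive value so log() is always defined.
        log_s = math.log(max(seconds, 1e-3))
        if self.log_mean is None:
            self.log_mean = log_s
        else:
            self.log_mean = (1 - self.alpha) * self.log_mean + self.alpha * log_s
        return math.exp(self.log_mean)

    @property
    def value(self):
        return math.exp(self.log_mean)


def sort_by_speed(recipients, stats, default=30.0):
    """Order recipients fastest-first; recipients with no history get `default` seconds."""
    def key(addr):
        s = stats.get(addr)
        return s.value if s is not None and s.log_mean is not None else default
    return sorted(recipients, key=key)
```

Feeding each observed delivery time (parsed from the MTA logs) into `update()` keeps one small number per recipient, and old observations fade out without storing a week of history.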
Another is two-level message handling, by configuring the MTA for the initial delivery attempt to use very low timeouts, but then to fall back to a secondary MTA (or MTA pool) that uses more standard timeouts for those sites that are slower.
Yup. I did it at the first level with an initial SMTP proxy which routed based on MX response records pulled from a DB.
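The two-level idea can be sketched in a few lines. This is only an illustration of the control flow, not how any particular MTA or proxy implements it: `attempt` is a hypothetical callable standing in for "try to deliver to this destination with this timeout", and the 10-second and 300-second timeouts are assumed values.

```python
def two_level_delivery(domains, attempt, fast_timeout=10, slow_timeout=300):
    """First pass with a short timeout; stragglers get a second pass with a long one.

    `attempt(domain, timeout)` is a hypothetical callable that raises
    TimeoutError when the destination does not respond in time.
    """
    delivered, deferred, failed = [], [], []

    # Level 1: aggressive timeouts so responsive sites are never held up.
    for domain in domains:
        try:
            attempt(domain, fast_timeout)
            delivered.append(domain)
        except TimeoutError:
            deferred.append(domain)

    # Level 2: the slow pool retries the leftovers with standard timeouts.
    for domain in deferred:
        try:
            attempt(domain, slow_timeout)
            delivered.append(domain)
        except TimeoutError:
            failed.append(domain)

    return delivered, failed
```

In a real deployment the second loop would be a separate MTA or MTA pool with its own queue, so the slow destinations never block the fast ones.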
Perhaps in its current form, that is true. However, not all sites are using sendmail 8.12, and of the ones that are, most are probably not using it in a manner that is more suitable for mailing lists.
I'm generally of the view that Mailman should do opportunistic domain sorting and per-MTA customised VERP handoffs (because nobody has standardised VERP across MTAs), and beyond that to back off. Mailman's job is to get the outbound mail into the MTA's spool as quickly as possible, wrapped in transactions (ie RCPT TO bundles) that are friendly to efficient processing, and that's it.
We're not in the game of second guessing the MTAs. That way lies wasted time and madness.
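As a rough illustration of opportunistic domain sorting and RCPT TO bundling, something along these lines would do. This is a sketch, not Mailman's actual delivery code; the function name and the `max_rcpts` limit of 50 are assumptions.

```python
from collections import defaultdict


def bundle_recipients(recipients, max_rcpts=50):
    """Group recipients by domain, then split each group into RCPT TO bundles.

    Each returned bundle is meant for one SMTP transaction, so the MTA sees
    recipients that share a destination together.
    """
    by_domain = defaultdict(list)
    for addr in recipients:
        # Domains are case-insensitive, so normalise before grouping.
        domain = addr.rsplit("@", 1)[-1].lower()
        by_domain[domain].append(addr)

    bundles = []
    for domain in sorted(by_domain):
        addrs = by_domain[domain]
        for i in range(0, len(addrs), max_rcpts):
            bundles.append(addrs[i:i + max_rcpts])
    return bundles
```

For example, `bundle_recipients(["a@x.org", "b@y.org", "c@x.org"], max_rcpts=2)` yields one bundle for x.org with both of its addresses and a separate bundle for y.org.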
However, given the issues you've mentioned, it would probably be a good idea to be able to turn off selected "bulk_mailer" type features, so that you can let the MTA do more of its job better -- if it is configured to do so.
There are thresholds for covering up for broken software. There are also thresholds for covering up for SysAdm negligence or oversight. You've got to pick where you stop accepting the problem. Ideally we should be resilient and friendly to both. Realistically we need to do something reasonable and not worry too hard about the rest.
Priorities.
Mailman's primary performance problems are not at the MTA hand off. MTA configuration and tuning for mailing lists is only a minor art. There is not-inconsiderable documentation and understanding of the field. A US$2K commodity box subjected to moderate tuning efforts using readily available documentation can sustain 2,400 outbound deliveries per minute. You do the arithmetic. In a perfect world that maps out to roughly 3.4 million per day. Cut that to under half for queue injection overhead and other crap and you're still talking a million deliveries per day for a US$2K host.[1] A million messages a day already puts us above the 99th percentile for list server audiences. I'm not really concerned about that problem.
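For anyone who wants the arithmetic spelled out, here it is as a few lines of Python. The only inputs are the 2,400 per minute and one million per day figures from the paragraph above; the variable names are just for illustration.

```python
PER_MINUTE = 2400                      # sustained outbound deliveries per minute
per_day = PER_MINUTE * 60 * 24         # theoretical ceiling for a full day
target = 1_000_000                     # the "million deliveries per day" figure

fraction = target / per_day            # share of the ceiling actually needed

print(per_day)                         # 3456000
print(round(fraction, 3))              # 0.289
```

So a million deliveries a day needs under 30% of the theoretical ceiling, which leaves plenty of room for queue injection overhead.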
Where Mailman's performance hurts is in the handling of the list configs, especially for lists with very large membership rosters, and in queue runner performance and overhead (try watching queue runner's system resource profile in v2.1 for lists with > 50,000 members). For me those are the obvious low hanging fruit, and those are the points that will help not just the performance hounds, but also the lower 80% who are running under-provisioned under-configured under-admined multi-purpose boxes who want Mailman to be a bit more reasonable and forgiving about their not-so-brilliant systems.
[1] That's of course assuming reasonable sustained queue size and responsive MXes. However, those are separate problems and, ignoring MTA-specific behaviours (like Exim's active hatred of large queues), the methods and systems to segment and tame those problems are fairly well known.
-- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. claw@kanga.nu He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
At 8:20 PM -0500 2003/10/30, J C Lawrence wrote:
While I don't disagree, this is really an MTA's job, not Mailman's. This is why I've been doing log analysis of MXes and routing mail to customised outbound MTAs on the basis of responsiveness, since early 2000. Adaptive MX routing is great stuff.
There is a need for this function, and no MTA available today does it. MLMs throughout the history of the Internet have incorporated a variety of features for SMTP performance enhancement that are unique to mailing lists or are usually found primarily in mailing lists, and this is no different.
If you want to externalize all these functions outside of mailman, that's fine. But then someone has to pick up the ball and start hacking on bulk_mailer or some other program to provide these features.
Yup. I did it at the first level with an initial SMTP proxy which routed based on MX response records pulled from a DB.
Again, this is a feature which is not found on any MTA available today, and which is known to have a huge impact on mailing list performance. This feature needs to be provided somewhere, by someone.
I'm generally of the view that Mailman should do opportunistic domain sorting and per-MTA customised VERP handoffs (because nobody has standardised VERP across MTAs), and beyond that to back off. Mailman's job is to get the outbound mail into the MTA's spool as quickly as possible, wrapped in transactions (ie RCPT TO bundles) that are friendly to efficient processing, and that's it.
If you go back to Barry's message, he was talking about getting even further involved, by doing a mail-merge process. Since there is no MMTP (something that Bryan Costales, Eric Allman, and I had worked on for a while, before we realized that it would just make the spam problem worse and then dropped all further efforts), there is a need for an intermediate program that is called by mailman and then hands the messages off to the MTA.
Either that intermediate program can be provided by mailman itself, or it can come from a third party. But it needs to come from somewhere.
We're not in the game of second guessing the MTAs. That way lies wasted time and madness.
If there were MLTAs which were optimized for this function, I would agree with you. Since we're trying to take standard MTAs which may have only some optimizations that might be generally applicable to most situations (including mailing lists), I must disagree.
For the mailing list specific optimizations that we know are not provided by many common MTAs or MTA versions, we need to perform those optimizations before the message gets to the MTA.
We also need to be able to selectively turn them off, in the case that there are MTAs that can do that specific job themselves and don't need our interference.
Where Mailman's performance hurts is in the handling of the list configs, especially for lists with very large membership rosters, and in queue runner performance and overhead (try watching queue runner's system resource profile in v2.1 for lists with > 50,000 members). For me those are the obvious low hanging fruit,
You should definitely go after the low-hanging fruit when you can. However, you also have to consider how much work would go into fixing those problems.
A high priority item that would require re-engineering the entire system is something that should be planned for the long term, perhaps in conjunction with other things that would likewise require significant re-engineering efforts.
Meanwhile, if there are other performance issues that can be addressed which do not require such significant re-engineering, those should be given serious consideration in the shorter term.
-- Brad Knowles, <brad.knowles@skynet.be>
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)