On Wed, 24 May 2000 22:13:08 -0700 Chuq Von Rospach <chuqui@plaidworks.com> wrote:
At 9:52 PM -0700 5/24/2000, J C Lawrence wrote:
Umm, true. Looking at it again, and doing a quick check of my user base's MXing, I suspect we're dealing with a less than 1% gain. Bigger fish are available. Methinks my brain was farting.
Nope, just getting a bit ahead and thinking of something that's fun and technically challenging. Been there, done that... (grin).
Well, yes.
I don't believe that a list server has any business handling MX sorting unless it is also taking responsibility for being the list MTA. As Mailman isn't, it's a moot point.
And that's an issue I've been wrestling with a lot -- do I do a specialized MTA? Or do I let the MTA do its job? After going back and forth on this for weeks, given my current delivery rates, I've decided to let the MTA do its job, and wait on writing a specialized MTA until I need that last couple of percent of performance. Moving to 8.10.1 seems like an easy performance bump, postfix looks like it'll buy me even more, and so while doing all the MXing and stuff would be fun, it can wait.
I'm a big believer in being lazy. I don't like handling problems that other people seem to have already done a decent enough job on unless they're *really* interesting. There are already enough interesting things out there to do to keep me busy, so that doesn't happen often.
Ergo, if the MTA looks like it can do it, can be persuaded to do it, and that means I really don't have to, then well, who am I to argue? QMail, with which I'm becoming uncomfortably familiar, is bloody fast. I hope/expect that Postfix is very similar. It's tough to imagine a situation where my time and effort in replacing them (as a solo effort) would actually be worth it as versus throwing hardware at the problem or chatting up Wietse & co. I've written list servers and mini-MTAs before. There's a fair bit of hidden complexity and brain hurt in there I don't mind avoiding.
queue management is another issue. that's one place majordomo is weak at, because it doesn't. Everything is delivered as it comes in, so bursts can take a system to its knees.
True. Were Mailman asynchronous, a pattern as below would seem useful:
There is never more than a single "queued message handler" process (maybe multi-threaded, or not). That process guarantees not to feed messages to the MTA any faster than XXX messages per second/minute, and to stop such feeding were system load to rise above ZZZ. The single instance rule prevents multiple handler processes for multiple mailing lists maxing out the MTA as they all dump simultaneously. The problem of multiple list servers (boxes) dumping simultaneously to a remote MTA is properly, I believe, outside of Mailman's purview.
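A minimal sketch of that handler pattern in Python, not anything Mailman actually ships. `ThrottledFeeder`, `send`, `max_per_second` and `max_load` are hypothetical names standing in for the handler, the MTA hand-off, XXX and ZZZ; the default load check assumes a Unix host where `os.getloadavg()` is available.

```python
import os
import time
from collections import deque


class ThrottledFeeder:
    """Feed queued messages to the MTA at a bounded rate, pausing under load."""

    def __init__(self, send, max_per_second, max_load,
                 get_load=lambda: os.getloadavg()[0],
                 clock=time.monotonic, sleep=time.sleep):
        self.send = send                      # hypothetical: hands one message to the MTA
        self.min_interval = 1.0 / max_per_second
        self.max_load = max_load              # the "ZZZ" threshold
        self.get_load = get_load
        self.clock = clock
        self.sleep = sleep
        self.queue = deque()

    def enqueue(self, message):
        self.queue.append(message)

    def run_once(self):
        """Drain the queue no faster than the rate cap; stop if load is too high.

        Returns the number of messages handed to the MTA on this pass.
        """
        sent = 0
        next_allowed = self.clock()
        while self.queue:
            if self.get_load() > self.max_load:
                break                         # leave the rest queued for a later pass
            now = self.clock()
            if now < next_allowed:
                self.sleep(next_allowed - now)
            self.send(self.queue.popleft())
            sent += 1
            next_allowed = max(now, next_allowed) + self.min_interval
        return sent
```

The single-instance rule is not shown here; one way to get it would be an exclusive lock file taken at startup, but that choice is an assumption rather than something stated in the thread.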
I don't see a value in trying to monitor MTA queue size. Too MTA specific. Monitoring list server queue size and implementing fairness algorithms in emptying the queue across multiple lists is worth looking at tho (arguably round-robin is "good enough" and is very simple to do, but it ain't pretty when things get ugly).
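For reference, the "good enough" round-robin case can be sketched in a few lines of Python. `round_robin` and the shape of `queues` (list name mapped to pending messages) are hypothetical; this only illustrates taking one message per list per pass, with no weighting or priorities.

```python
from collections import deque


def round_robin(queues):
    """Yield (list_name, message) pairs, one message per list per pass."""
    # Copy each list's pending messages so the caller's data is not mutated.
    active = deque((name, deque(msgs)) for name, msgs in queues.items() if msgs)
    while active:
        name, pending = active.popleft()
        yield name, pending.popleft()
        if pending:
            # This list still has mail waiting; send it to the back of the line.
            active.append((name, pending))
```

A busy list cannot starve a quiet one this way, but a list with a huge backlog still takes many passes to drain, which is where the "ain't pretty" part comes in.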
Another thing to worry about... On my big system, I only do a few mailings a week, but they bunch together. So I've had to do a bunch of work on making sure the system deals with this rationally... when we were doing one mailing on a given day, that was easy, but we're doing both a text and an HTML variant going out together, and that really complicates life.
Yeah, volume, as versus number of messages, can be another problem if only for disk IO.
Well, this is probably preaching to the choir, but I've gotten quite convinced that you isolate every piece you can from every other piece, and document the interfaces. that makes it quite easy to swap out a new piece without affecting the rest of the system ...
This is often called "programming by contract". It's a Good Thing.
-- Allows archived messages to be replied to on the web via the archive page (replies post to the list).
Nice! does it restrict posting access to registered users or is it open?
I let the list server handle that. As I insert a special header in messages coming from the web interface, it's very easy to configure Mailman to hold such postings for moderator approval (privacy options page).
I used to use it, and then switched my web archives to a full forum system (web crossing) and crosslinked everything. that has its advantages and disadvantages.
One of my list members has been advocating WebCrossing. What do you think of it? It seemed excessively constraining to me, especially since I'm heading toward a massively WikiWiki-fied setup (every page can be commented on Wiki-style, all comments are free-standing Wiki entities etc etc etc).
-- Supports archive searching by MessageID. I've an MTA hack that inserts a MessageID-based URL into all outgoing Mailman list traffic so the user can just hit the URL and be taken to that message in the archives (searches the MHonArc DB, useful for thread reference etc).
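The actual MTA hack is not shown in this thread. As a rough illustration of the idea only, here is a Python sketch that derives an archive URL from the Message-ID and attaches it to the message; `ARCHIVE_BASE`, `add_archive_header` and the `X-Archive-URL` header name are all hypothetical.

```python
from email.message import EmailMessage
from urllib.parse import quote

# Hypothetical archive lookup endpoint; not the real kanga.nu URL scheme.
ARCHIVE_BASE = "https://lists.example.org/archive/msgid/"


def add_archive_header(msg, base=ARCHIVE_BASE):
    """Attach a Message-ID-based archive URL to msg, if it has a Message-ID."""
    msgid = msg["Message-ID"]
    if msgid is None:
        return msg
    # Drop the surrounding angle brackets and percent-encode the rest.
    bare = str(msgid).strip().strip("<>")
    msg["X-Archive-URL"] = base + quote(bare, safe="")
    return msg
```

On the archive side, the handler behind that URL would decode the ID and look it up in the archive's index.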
Interesting hack. Very interesting hack.
<bow> Wish it were original to me. One of my list members came up with the idea and then went and implemented it. It's somewhere in Keystone under Tasks...
you could do something really nice with PHP and MySQL, too, and do away with MHonarc, and parse/templatize the text on the fly. that's sort of where I'm headed down the road....
Yeah, I've thought about that but I really just don't see enough advantage to justify the time it would take to get something better than I have now. Eric Hood (MHonArc author) has been also threatening to do something here for ages.
It's awfully tempting tho just on a "cool!" factor.
--
J C Lawrence                                 Home: claw@kanga.nu
----------(*)                               Other: coder@kanga.nu
--=| A man is as sane as he is dangerous to his environment |=--
At 12:05 AM -0700 5/25/2000, J C Lawrence wrote:
It's tough to imagine a situation where my time and effort in replacing them (as a solo effort) would actually be worth it as versus throwing hardware at the problem or chatting up Wietse & co.
throwing hardware at a problem isn't always possible. but the place where rolling your own internal MTA starts becoming useful is when the list is big enough that the disk I/O involving the MTA starts becoming the significant limiter. With sendmail 8.9.x, that's fairly easy to run into. With sendmail 8.10, it seems to be better, and the multiple queue stuff solves a multitude of problems involving huge directory structures.
VERP exacerbates the problem, since # of batches sent to the MTA equals the # of addresses, which explodes the number of control files, which... So at some point, it makes sense to deliver direct to recipient rather than build batches into the MTA, and completely avoid the disk I/O and deliver right out of the database to the receiving SMTP server. You could strongly parallelize the delivery setup because you'd do away with all of the MTA overhead, and do all sorts of fun things, like prioritize your delivery sorting and the like.
Which, if you're trying to deliver 5,000,000 emails a day and do so within a time-sensitive time period gets important -- and for the other 99.5% of the universe, just doesn't matter that much (snork).
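To make the VERP cost concrete: each recipient gets its own envelope sender, so no two recipients can share a batch. A minimal Python sketch, assuming a `list-bounces+user=domain@listhost` encoding; the function names and the exact address layout are illustrative, not any particular list server's format.

```python
def verp_sender(list_local, list_domain, recipient):
    """Build a per-recipient envelope sender for bounce tracking."""
    local, _, domain = recipient.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not a usable address: {recipient!r}")
    return f"{list_local}-bounces+{local}={domain}@{list_domain}"


def verp_recipient(bounce_address, list_local):
    """Recover the original recipient from a VERP bounce address, or None."""
    local, _, _ = bounce_address.rpartition("@")
    prefix = f"{list_local}-bounces+"
    if not local.startswith(prefix):
        return None
    # A domain cannot contain "=", so split on the last one.
    user, sep, domain = local[len(prefix):].rpartition("=")
    if not sep or not user or not domain:
        return None
    return f"{user}@{domain}"
```

With N subscribers this yields N distinct envelope senders, hence N separate hand-offs to the MTA instead of a few large batches.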
I've written list servers and mini-MTAs before. There's a fair bit of hidden complexity and brain hurt in there I don't mind avoiding.
Yes, that's very true. Just dealing with MX gets gnarly.
True. Were Mailman asynchronous, a pattern as below would seem useful:
There is never more than a single "queued message handler" process (maybe multi-threaded, or not). That process guarantees not to feed messages to the MTA any faster than XXX messages per second/minute, and to stop such feeding were system load to rise above ZZZ. The single instance rule prevents multiple handler processes for multiple mailing lists maxing out the MTA as they all dump simultaneously.
That's basically how my big machine has evolved -- I'm using three queues, one to generate delivery batches (and requeue them into queue2), the 2nd queue to parallelize bulk_mailers into the MTA, and a third queue just for smartbounce and non-delivery batches, to keep them out of the way... It's nice, because my setup batch can generate a bunch of batches, and it's up to the queue system to make sure only "N" of them are running at any time, but any batch that hits slow domains doesn't back up huge numbers of addresses, the waiting batches slip into other slots. Oh, queuing theory is such fun. I got into computers to AVOID math...
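A rough Python sketch of the "only N running" behaviour, assuming a thread pool stands in for the second queue. `make_batches`, `run_batches` and `deliver` are hypothetical names; the setup described above uses separate queues and bulk_mailer processes rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def make_batches(addresses, size):
    """Split an iterable of addresses into lists of at most `size` entries."""
    it = iter(addresses)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_batches(batches, deliver, max_running):
    """Run deliver(batch) for every batch with at most max_running in flight.

    A batch stuck on a slow domain ties up one slot only; waiting batches
    take whichever slot frees up next. Results come back in batch order.
    """
    with ThreadPoolExecutor(max_workers=max_running) as pool:
        return list(pool.map(deliver, batches))
```

Keeping bounce handling on its own pool (the third queue) would follow the same shape with a separate `run_batches` call.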
The problem of multiple list servers (boxes) dumping simultaneously to a remote MTA is properly, I believe, outside of Mailman's purview.
I don't see a value in trying to monitor MTA queue size. Too MTA specific.
See the disk I/O issues above. In a perfect world, the MTA would throttle itself to avoid overload conditions. In practice, you have to be careful both to tune the MTA to maximize output and to tune the MLM to avoid blowing it out. If you have a burst that stuffs 2500 batches into a sendmail queue all at once, then sendmail has that big directory problem in a big way, and your system goes to hell.
Sendmail 8.10 goes a long way to minimizing this, but still, you can force your MTA to thrash, and when you do, everything gets really unhappy. So perhaps you don't need to have the MLM monitor the MTA constantly and throttle itself, but that's actually not a bad thing, IMHO, if it can be done reasonably -- on the other hand, I wouldn't make it a big focus, since it'd be a LOT easier to write some docs on how to tune the system and what to watch out for, and let the admin do the tuning. Once the tuning is done, it probably won't require a lot of watching...
Well, this is probably preaching to the choir, but I've gotten quite convinced that you isolate every piece you can from every other piece, and document the interfaces. that makes it quite easy to swap out a new piece without affecting the rest of the system
This is often called "programming by contract". It's a Good Thing.
Heh. It's also called "breaking a huge project down into tiny pieces so your customers don't worry nearly as much about deadlines"...
One of my list members has been advocating WebCrossing. What do you think of it?
Not appropriate for this list. Let's talk offline. I'm designing it out of my systems in favor of other things, but the reasons are complex -- and I've recommended it INTO at least one major development project at the same time. So I guess the answer is "it depends, but I'm not going to be using it myself..."
you could do something really nice with PHP and MySQL, too, and
Yeah, I've thought about that but I really just don't see enough advantage to justify the time it would take to get something better than I have now.
I wonder how much of this could be driven out of something like Midgard? But loading your entire archives into a database gives you the ability to do all sorts of interesting linking and searching and stuff, and "all" you'd need is some email->XML converter, and then...
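As a sense of scale for the "email->XML converter" piece, a bare-bones Python sketch using only the standard library. `email_to_xml` and the element names are made up for illustration; multipart messages, encodings and attachments are deliberately ignored.

```python
import xml.etree.ElementTree as ET
from email import message_from_string


def email_to_xml(raw):
    """Convert a simple single-part message into a small XML document."""
    msg = message_from_string(raw)
    root = ET.Element("message")
    headers = ET.SubElement(root, "headers")
    for name, value in msg.items():
        header = ET.SubElement(headers, "header", name=name)
        header.text = value
    body = ET.SubElement(root, "body")
    # Only plain single-part bodies are handled in this sketch.
    body.text = "" if msg.is_multipart() else msg.get_payload()
    return ET.tostring(root, encoding="unicode")
```

The hard parts (MIME, character sets, threading links) are exactly what this leaves out, which is presumably why "all" sits in quotes above.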
Oh, man. We need to at least pretend to be on topic for this list, but I need a white board and a pen... (scribbly scribble...)
--
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui@plaidworks.com)
Apple Mail List Gnome (mailto:chuq@apple.com)

"And they sit at the bar and put bread in my jar and say 'Man, what are you doing here?'"