
At 8:12 AM -0400 9/28/06, Barry Warsaw wrote:
What I find really intriguing about this approach is the ability to reject some messages immediately, presumably allowing the MTA to bounce them.
Yup.
We could reject the message then, before it entered Mailman's incoming queue.
Indeed, that's a key advantage. IIRC, procmail does this with the system-wide and user-defined rulesets.
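To make the "reject immediately" idea concrete, here is a minimal Python sketch of the per-recipient decision an LMTP front end could make at RCPT TO time. The function name, the example list address, and the reply text are all made up for illustration; nothing here is existing Mailman code, and a real server would still need the rest of the LMTP dialogue around it.

```python
# Hypothetical data: the set of list addresses this host accepts mail for.
KNOWN_LISTS = {"test-list@example.com"}


def rcpt_reply(address, known_lists=KNOWN_LISTS):
    """Return an SMTP-style reply line for a single RCPT TO address.

    A 5xx reply here means the message is refused during the protocol
    exchange, so the MTA generates the bounce and nothing is queued.
    """
    if address.strip().lower() in known_lists:
        return "250 2.1.5 Recipient OK"
    return "550 5.1.1 No such mailing list"
```

The point of doing this inside the protocol exchange is that a refused recipient never reaches the incoming queue at all.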
I did a quick Google search to see if there were any GPL'd LMTP servers we could piggyback on, the idea being that if we could find a shell of a C program we could embed Python in and talk directly to Mailman during the LMTP protocol.
Does it have to be GPL? Is a Berkeley-type license not okay? Checking the source for sendmail 8.13.8, I find that there is an official part of the package which includes the LDA "mail.local", which is LMTP-capable, among other things. It can also do user mailbox hashing, based on the username. You can either hash directly to a path like /var/mail/u/s/user or use an MD5 hash of the username in a base64 representation (changing "/" to "_"), and you can control how many levels of hashing are to be used.
Seems to me like this would be a pretty obvious candidate.
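For anyone who hasn't seen mailbox hashing before, the two schemes described above can be sketched roughly like this in Python. This is only an illustration of the idea, not a transcription of the mail.local source: the function names are invented, and the assumption that the MD5 variant takes its directory levels from the leading characters of the base64 string is mine.

```python
import base64
import hashlib
import posixpath


def direct_hash_path(user, levels=2, root="/var/mail"):
    """Hash on the username itself: 'user' -> /var/mail/u/s/user."""
    # One directory level per leading character of the username.
    return posixpath.join(root, *user[:levels], user)


def md5_hash_path(user, levels=2, root="/var/mail"):
    """Hash on the base64 form of the MD5 digest of the username."""
    digest = hashlib.md5(user.encode("utf-8")).digest()
    # "/" is a legal base64 character but not a legal file name
    # character, so it is changed to "_".
    encoded = base64.b64encode(digest).decode("ascii").replace("/", "_")
    # Assumption: the leading characters of the encoded digest
    # supply the directory levels.
    return posixpath.join(root, *encoded[:levels], user)
```

With the direct scheme, users whose names share a prefix pile up in the same directory; the MD5 variant spreads them out more evenly, at the cost of paths you can't guess by eye.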
Postfix has an lmtp server, but it seems fairly heavyweight
(being tied into the smtp server) and it's not clear to me we could combine our GPL code with Postfix's license.
Please check out the sendmail mail.local stuff and tell me if this is a better alternative. If you need a different license, please let me know -- I've known Eric for many years (since way before the company existed). While I won't make any guarantees, I will say that if we need a different license, I imagine that I can get a more sympathetic ear than you might otherwise be able to find.
ISTM that the trade-off then is rolling our own LMTP server vs. doing maildir delivery. Are we confident that we can implement a server with high enough performance to give us better throughput than maildir would? In Python?
Dunno about doing it in Python, but I will say that going to Maildir as an additional queue-on-disk mechanism on top of everything else we're already doing seems to be a big step backward in terms of potential performance issues and I don't really see any significant positive benefit.
At AOL, we used to use a queue-on-disk method for the Internet mail gateways. Sendmail would take the incoming message and hand it off to a custom LDA, which would dump it into a disk queue asynchronously; a synchronous queue runner process would then come along, pick up the messages, and send them over to Stratus. Believe me, this system sucked big time -- we had never-ending problems with disk queues building up to the point where the queue runners could never possibly catch up, etc.
And I'm not seeing any real significant operational differences between what you're talking about doing and what AOL abandoned years ago. Okay, so you're talking about using Maildir instead of a typical "linear" queue-on-disk, and you don't have to do file locking to guarantee queue entry creation. But that's still dumping everything into a single directory which we then have to scan and pull stuff out of, and you probably do still have to do some sort of file locking to make sure that the input and output queue mechanisms don't step on each other's toes.
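To show what I mean, here is a rough Python sketch of the usual Maildir pattern: the writer creates the file under tmp/ and renames it into new/, which is why no lock is needed to create a queue entry. The function names and the file naming scheme are simplified assumptions for illustration, not Mailman or MTA code, and the sketch does not settle whether the reading side needs locking of its own.

```python
import os
import socket
import time


def maildir_deliver(maildir, data, seq=0):
    """Write one message to tmp/, then rename it into new/."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    # Simplified unique name: timestamp, pid, caller-supplied counter, host.
    host = socket.gethostname().replace("/", "_")
    name = "%d.P%dQ%d.%s" % (time.time(), os.getpid(), seq, host)
    tmp_path = os.path.join(maildir, "tmp", name)
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    new_path = os.path.join(maildir, "new", name)
    # The rename makes the entry visible to readers in one step.
    os.rename(tmp_path, new_path)
    return new_path


def maildir_scan(maildir):
    """The reader still has to list the whole new/ directory."""
    return sorted(os.listdir(os.path.join(maildir, "new")))
```

Note that maildir_scan() is a full listing of one directory on every pass, which is exactly the kind of scan I'm worried about once that directory gets large.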
It might be fun to try, but OTOH it /is/ a distraction from other MM 2.2 work that needs to get done. So unless anybody has any leads on existing GPL-compatible code we could use, or feels really motivated to work on a Python version, I'm inclined to go with maildir for MM2.2. It's not like we couldn't add LMTP at some later point.
The single queue directory on disk is already one of our biggest bottlenecks. I don't see how using Maildir as a delivery mechanism from the MTA to Mailman is going to improve that.
In fact, it seems to me like we're just adding yet another bottleneck of exactly the same sort that we're trying to eliminate elsewhere, but with some additional drawbacks that are unique to Maildir and which will make our overall system performance even worse than it is today.
If we're going to make a big change, it seems to me that LMTP makes much more sense than Maildir. If we can't do LMTP, then I think we'd be much better off working on eliminating other bottlenecks in the system as opposed to adding yet another totally new source of bottlenecks that result from implementing Maildir.
It seems to me that this idea is a case of:
1. We have to do Something.
2. This is something.
3. Therefore, we have to do This.
I think we want to think long and hard about this idea and all its potential drawbacks and new bottleneck sources before we take that first step off the cliff.
-- Brad Knowles, <brad@stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
Founding Individual Sponsor of LOPSA. See <http://www.lopsa.org/>.