At 10:09 AM -0400 9/28/06, Barry Warsaw wrote:
Does it have to be GPL? Is a Berkeley-type license not okay?
GPL would be best, but Berkeley is probably okay. We'd probably want to get confirmation of that from the FSF. The key thing is that it has to be compatible with the GPL (and the Python Software License -- see below) so that we can combine the whole kit and kaboodle.
Are there any license questions or issues that we would need to have answered or confirmed by the Sendmail Consortium? Or should we wait on that until we've heard back from the FSF?
Dunno about doing it in Python, but I will say that going to Maildir as an additional queue-on-disk mechanism on top of everything else we're already doing seems to be a big step backward in terms of potential performance issues and I don't really see any significant positive benefit.
I don't think it's an additional queue-on-disk mechanism, certainly in comparison to what we're doing today.
Maildir was not designed as an efficient queue-on-disk strategy. It was designed to allow multiple simultaneous parallel deliveries to the NFS-mounted mailbox of a given user, and we know that it does a number of additional unnecessary things that seriously hurt its performance even in that relatively tightly defined context.
It does unnecessary file renames (which cause additional synchronous meta-data filesystem operations), it uses filenames that are too long and bust iname/inode caching schemes, and it doesn't make use of obvious significant performance-enhancing mechanisms like directory hashing.
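For readers who have not looked at Maildir closely, the delivery sequence in question can be sketched roughly as below. This is only an illustration, not code from Mailman or any MTA: the function name `maildir_deliver`, the `seq` parameter, and the exact filename layout are assumptions made for the example. It shows the two points raised above: the long unique filename, and the write-to-tmp/ followed by a rename into new/.

```python
import os
import socket
import time


def maildir_deliver(maildir, message_bytes, seq=0):
    """Deliver one message Maildir-style: write to tmp/, then rename into new/.

    Hypothetical helper for illustration only.
    """
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Long unique name built from timestamp, pid, a sequence number and the
    # hostname (the exact layout here is an assumption for the sketch).
    unique = "%d.P%dQ%d.%s" % (time.time(), os.getpid(), seq,
                               socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # Step 1: write the whole message into tmp/ and force it to disk.
    with open(tmp_path, "wb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())

    # Step 2: the rename that makes the message visible to readers; this is
    # the extra metadata operation on every single delivery.
    os.rename(tmp_path, new_path)
    return new_path
```

Note that every message for a given mailbox still lands in the same new/ directory, whatever else the scheme does.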
It's pretty easy to design a mechanism that is much more efficient -- and scalable -- in handling multiple simultaneous deliveries to a user mailbox on NFS.
So why would we want to abuse a bad scheme for user-mailbox-on-NFS as an alternative scheme for queue-on-disk?
If we have queue-on-disk problems, why not solve them by implementing a more efficient queue-on-disk scheme, instead of abusing a poorly designed user-mailbox-on-NFS scheme?
That way, you're not dumping all messages destined for Mailman into one directory. Not as good as directory hashing, but better than what we have today.
That would be somewhat of an improvement in some respects, but Maildir also brings along a lot of additional baggage and I'm not at all convinced that it's worth the effort.
I'll grant you that LMTP delivery has the potential to be the most efficient mechanism by which messages get from the MTA into Mailman. But it's certainly more work and more complicated than maildir; will you grant that maildir is better than what we have today? Think of it as a waystation on the road to the ultimate uber-performing list server. :)
I'm not at all convinced that Maildir would be an overall improvement over what we have today. I think that adding a directory hashing scheme to the existing fork()/exec() model would probably be a bigger improvement than switching our inbox delivery mechanism from fork()/exec() to Maildir.
At least by sticking with fork()/exec() and adding a directory hashing scheme on top of that, we wouldn't need to make any changes to the way we interface with MTAs today -- all the changes could be kept completely internal to Mailman. If we were to switch to Maildir as an inbox delivery method, not only would we have to change the way we interface with MTAs, we would also have to make internal changes to Mailman to support the use of Maildir as our queue-on-disk mechanism. That's a bigger overall change with bigger risk and relatively lower potential payoff.
If we were to work on implementing a directory hashing scheme instead of working on Maildir, we could still add LMTP at a later date.
That would allow us to go back at a later time and enhance the features that we provide to mailing list administrators, while also giving us time to look more deeply into the potential performance issues and make sure that we're not causing more problems than we're solving.
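To make "directory hashing" concrete, here is a minimal sketch in Python of the general idea: spread queue files across a tree of small subdirectories chosen by a hash of the filename, so that no single directory grows huge. This is not Mailman's actual queue layout. The names `hashed_path` and `enqueue`, the use of SHA-1, and the two-level, two-character fan-out are all assumptions chosen for illustration.

```python
import hashlib
import os


def hashed_path(queue_root, filename, levels=2, width=2):
    """Map a queue filename to a path under hashed subdirectories.

    Hypothetical helper: with levels=2 and width=2 the file lands in one of
    256 * 256 leaf directories, e.g. <root>/ab/cd/<filename>.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(queue_root, *parts, filename)


def enqueue(queue_root, filename, data):
    """Write one queue entry into its hashed directory (illustrative only)."""
    path = hashed_path(queue_root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Because the path is computed from the filename alone, a change like this could stay entirely inside the queue code, and whatever hands messages over from the MTA would not need to know about it.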
Let me just say that ideally, I think LMTP would be a great way to go. It's not my top priority though. I'm looking for ways to get more developers involved in the project, and this seems like a perfect thing for someone seeking Mailman fame and fortune <wink>.
I'm not convinced that this is an improvement.
So, anyone care to take the challenge?
I'm not a developer, but I do have experience with building large-scale mail and mailing list systems, and if you're willing to listen to me then I'm willing to give you the benefit of my experience.
IMO, Maildir is a Red Herring. The one and only reason to ever consider using Maildir is if you're implementing a large-scale IMAP mail server system and you're required to store user mailboxes on NFS.
Even then, you'd be well-served to look for better storage mechanisms, because throwing potentially hundreds of thousands of messages into a single directory is guaranteed to cause huge performance issues, even if every single mailbox operation didn't involve scanning the entire directory and doing a stat() on every single file, locking the entire directory, creating/renaming/deleting the file(s) as appropriate, and then unlocking the directory.
I think we're better off spending our resources working on trying to resolve the real bottleneck issues that we already know are present in our system as opposed to working on cool stuff that may or may not help but would require more overall changes to more parts of the system and with relatively lower potential payoff.
-- Brad Knowles, <brad@stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
Founding Individual Sponsor of LOPSA. See <http://www.lopsa.org/>.