Re: [Mailman-Developers] Braindump on new archiver

On Sat, 16 Jun 2001 13:34:02 -0400 (EDT) Dale Newfield <Dale@Newfield.org> wrote:
> On Sat, 16 Jun 2001, Thomas Wouters wrote:
>> Maildir works by making every message a file on its own. A mailbox is a directory with three subdirectories, 'cur', 'new' and 'tmp'. Messages in 'new' are unread, messages in 'cur' are read, and messages in 'tmp' are in transit.
> What about file system limitations to the maximum number of files in a directory?
Typically this is an inode limit for the whole filesystem, not a per-directory limit. Directory size is a performance issue that different filesystems handle with varying success (e.g. Ext2FS badly, ReiserFS fairly well).
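To see where a given filesystem stands on the inode side, something like the following may help. It is only a sketch: the function name `inode_report` is made up for illustration, `os.statvfs` is available on Unix-like systems only, and some filesystems report zero inodes because they allocate them dynamically.

```python
import os


def inode_report(path="."):
    """Summarise inode and space usage for the filesystem holding `path`."""
    st = os.statvfs(path)
    total_bytes = st.f_blocks * st.f_frsize
    return {
        "total_inodes": st.f_files,
        "free_inodes": st.f_ffree,
        "total_bytes": total_bytes,
        # f_bavail counts blocks available to unprivileged users.
        "free_bytes": st.f_bavail * st.f_frsize,
        # Average space per inode; None when the filesystem reports
        # no fixed inode count.
        "bytes_per_inode": total_bytes / st.f_files if st.f_files else None,
    }


if __name__ == "__main__":
    for name, value in inode_report(".").items():
        print(f"{name}: {value}")
```

If `free_inodes` approaches zero while `free_bytes` is still large, the filesystem is running out of inodes rather than space.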
--
J C Lawrence  claw@kanga.nu
---------(*)  http://www.kanga.nu/~claw/
The pressure to survive and rhetoric may make strange bedfellows

On Sat, Jun 16, 2001 at 12:35:08PM -0700, J C Lawrence wrote:
> On Sat, 16 Jun 2001 13:34:02 -0400 (EDT) Dale Newfield <Dale@Newfield.org> wrote:
>> On Sat, 16 Jun 2001, Thomas Wouters wrote:
>>> Maildir works by making every message a file on its own. A mailbox is a directory with three subdirectories, 'cur', 'new' and 'tmp'. Messages in 'new' are unread, messages in 'cur' are read, and messages in 'tmp' are in transit.
>> What about file system limitations to the maximum number of files in a directory?
> Typically this is an inode limit for the whole filesystem, not a directory size issue. Directory size is a performance issue that different filesystems handle variously well (eg Ext2FS badly, ReiserFS fairly well).
Exactly. The NetApps we use have their own filesystem, WAFL, which doesn't have an inode limit (it is a very cool thing). They also use B-tree directory indexes, so large directories are not a problem. I believe ReiserFS and XFS (SGI's journalling filesystem, which has also been ported to the Linux kernel) use B-tree directories as well.
That said, yes, for lists with a lot of small messages, inodes could become a problem. I think most UNIX filesystems default to around one inode per 8 KB of disk space nowadays, whereas the median message size in my mailbox is about 2 KB, so a filesystem full of such messages would run out of inodes well before it ran out of space. (Trust me, it's a big enough sample size :-)
I forgot to mention this, but the number of files was one of the reasons not to go for directories-with-symlinks for archive data. Still, if it turns out to be a problem, it can be avoided by removing the maildir mailbox (converting it to an mbox one) when the month (or week, or year, or whatever) ends and the archive is up to date.
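For what it's worth, the maildir-to-mbox step could look roughly like this with Python's standard `mailbox` module. This is a sketch and not the archiver's actual code: `maildir_to_mbox` and its parameters are made-up names, and it assumes nothing else is delivering into the maildir while it runs.

```python
import mailbox
import shutil


def maildir_to_mbox(maildir_path, mbox_path, remove_maildir=False):
    """Append every message in a maildir to an mbox file; return the count."""
    src = mailbox.Maildir(maildir_path, factory=None, create=False)
    dst = mailbox.mbox(mbox_path, create=True)
    count = 0
    try:
        dst.lock()
        # Maildir file names conventionally start with the delivery
        # timestamp, so sorting the keys approximates delivery order
        # (an assumption, not a guarantee).
        for key in sorted(src.keys()):
            dst.add(src[key])
            count += 1
    finally:
        dst.close()  # flushes pending writes and releases the lock
        src.close()
    if remove_maildir:
        # Frees one inode per message, plus those of the directories.
        shutil.rmtree(maildir_path)
    return count
```

Called once per finished period, e.g. `maildir_to_mbox("2001-June.maildir", "2001-June.mbox", remove_maildir=True)` (hypothetical paths), this trades thousands of inodes for a single file.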
Don't forget that the current system is also pretty inode-heavy, since every archived message is already a file on its own. And don't think I'm forgetting about non-Linux, non-NetApp-attached systems :) Our listservers run BSDI and FreeBSD, neither of which is brilliant at filesystem performance.
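To put rough numbers on the inode concern with one file per message, here is a back-of-the-envelope sketch. The 8 KB-per-inode and 2 KB message figures are the ones quoted above; the 4 KB block size and the function name `messages_until_full` are assumptions for illustration, and metadata and directory overhead are ignored.

```python
def messages_until_full(fs_bytes, bytes_per_inode=8192,
                        message_bytes=2048, block_bytes=4096):
    """Estimate how many one-file-per-message messages fit, and what runs out."""
    inodes = fs_bytes // bytes_per_inode
    blocks = fs_bytes // block_bytes
    # Each message occupies a whole number of blocks (ceiling division).
    blocks_per_message = -(-message_bytes // block_bytes)
    by_inodes = inodes                      # one inode per message
    by_blocks = blocks // blocks_per_message
    limit = "inodes" if by_inodes < by_blocks else "blocks"
    return min(by_inodes, by_blocks), limit


if __name__ == "__main__":
    print(messages_until_full(8 * 2**30))   # an 8 GiB filesystem
```

Under these assumptions small messages exhaust the inodes while about half of the data blocks are still free; with larger messages the blocks run out first.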
-- Thomas Wouters <thomas@xs4all.net>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!