[Mailman-Developers] Requirements for a new archiver
Brad Knowles
brad.knowles at skynet.be
Wed Oct 29 13:45:53 EST 2003
At 1:28 PM -0500 2003/10/29, John A. Martin wrote:
> Hmm... Maildirs.
Not.
From <http://www.washington.edu/imap/documentation/formats.txt.html>:
. mh This is supported for compatibility with the past. This is
the format used by the old mh program.
mh is very inefficient; the entire directory must be read
and each file stat()'d, and in order to determine the size
of a message, the entire file must be read and newline
conversion performed.
mh is deficient in that it does not support any permanent
flags or keywords; and has no means to store UIDs (because
the mh "compress" command renames all the files, that's
why).
[ ... deletia ... ]
The Maildir format used by qmail has all of the performance
disadvantages of mh noted above, with the additional problem that the
files are renamed in order to change their status so you end up having
to rescan the directory frequently to locate the current names (particularly in
a shared mailbox scenario). It doesn't scale, and it represents a
support nightmare;
[ ... deletia ... ]
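To make the renaming complaint concrete, here is a minimal sketch of how a Maildir-style flag change works. It assumes the usual "unique:2,FLAGS" naming in the cur/ directory and a filesystem that allows ":" in filenames; the helper names set_flag and locate are hypothetical, not taken from any real implementation.

```python
import os


def set_flag(cur_dir, filename, flag):
    """Add a flag (e.g. "S" for seen) by renaming the file.

    Returns the new filename. Any other reader that still holds the
    old name now has a stale reference.
    """
    base, _, flags = filename.partition(":2,")
    # Flags are kept in ASCII order after the ":2," marker.
    new_flags = "".join(sorted(set(flags) | {flag}))
    new_name = base + ":2," + new_flags
    os.rename(os.path.join(cur_dir, filename),
              os.path.join(cur_dir, new_name))
    return new_name


def locate(cur_dir, unique):
    """Find a message's current filename by scanning the whole directory.

    This full listing is the rescan cost the quoted text complains about.
    """
    for name in os.listdir(cur_dir):
        if name.partition(":2,")[0] == unique:
            return name
    return None
```

Every status change costs a rename, and every client that cached the old name has to pay for a directory listing to find the message again.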
So what does this all mean?
A database (such as used by Exchange) is really a much better
approach if you want to move away from flat files. mx and especially
Cyrus take a tentative step in that direction; mx failed mostly because
it didn't go anywhere near far enough. Cyrus goes much further, and
scores remarkable benefits from doing so.
However, a well-designed pure database without the overhead of
separate files would do even better.
Of course, we all know about the database problems of Exchange,
and how Exchange admins have to frequently shut everything down and
clean their databases, how often they crash, how often they
completely trash all e-mail for all their users, etc....
I submit that the reason for this is the combination of crappy
Microsoft-style programming and the fact that no database handles
BLOBs well. Even top-notch programmers have real problems with these
kinds of implementations -- I am intimately familiar with the
database implementation methods used in the AOL mail system, and
suffice it to say that this is a really, really hairy nightmare that
you do *NOT* want.
That said, storing meta-data in a real database and then using
external filesystem techniques for actually accessing the data
should give you the best of both worlds -- the speed of access of the
database, and the reliability and well-understood access and backup
mechanisms of filesystems.
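As a rough illustration of that split, here is a minimal sketch that keeps meta-data in SQLite and message bodies as ordinary files. Everything in it is an assumption made for the example: the HybridStore class, the table layout, and the choice of SQLite and content-hash filenames are hypothetical, not a proposed design for the archiver.

```python
import hashlib
import sqlite3
from email import message_from_bytes
from pathlib import Path


class HybridStore:
    """Hypothetical store: meta-data in SQLite, bodies as plain files."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.db = sqlite3.connect(str(self.root / "index.db"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " id INTEGER PRIMARY KEY,"
            " message_id TEXT, subject TEXT, sender TEXT,"
            " size INTEGER, path TEXT NOT NULL)"
        )

    def add(self, raw):
        """Write the raw message to disk and index its headers."""
        msg = message_from_bytes(raw)
        # Name the file after a content hash so it never needs renaming.
        digest = hashlib.sha1(raw).hexdigest()
        path = self.root / digest[:2] / digest
        path.parent.mkdir(exist_ok=True)
        path.write_bytes(raw)
        with self.db:  # commits on success, rolls back on error
            cur = self.db.execute(
                "INSERT INTO messages"
                " (message_id, subject, sender, size, path)"
                " VALUES (?, ?, ?, ?, ?)",
                (msg["Message-ID"], msg["Subject"], msg["From"],
                 len(raw), str(path.relative_to(self.root))),
            )
        return cur.lastrowid

    def find_by_subject(self, text):
        """Search the index only; no message file is opened."""
        rows = self.db.execute(
            "SELECT id, subject, size FROM messages WHERE subject LIKE ?",
            ("%" + text + "%",),
        )
        return rows.fetchall()

    def body(self, msg_id):
        """Fetch the raw message from the filesystem."""
        row = self.db.execute(
            "SELECT path FROM messages WHERE id = ?", (msg_id,)
        ).fetchone()
        return (self.root / row[0]).read_bytes()
```

Searches and size lookups touch only the index, so nothing has to be stat()'d or read to answer them, while the message files stay static and can be backed up with ordinary filesystem tools.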
--
Brad Knowles, <brad.knowles at skynet.be>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)