[Mailman-Developers] Requirements for a new archiver

Brad Knowles brad.knowles at skynet.be
Wed Oct 29 13:45:53 EST 2003


At 1:28 PM -0500 2003/10/29, John A. Martin wrote:

>  Hmm... Maildirs.

	Not.

	From <http://www.washington.edu/imap/documentation/formats.txt.html>:

. mh   This is supported for compatibility with the past.  This is
         the format used by the old mh program.

         mh is very inefficient; the entire directory must be read
         and each file stat()'d, and in order to determine the size
         of a message, the entire file must be read and newline
         conversion performed.

         mh is deficient in that it does not support any permanent
         flags or keywords; and has no means to store UIDs (because
         the mh "compress" command renames all the files, that's
         why).

	[ ... deletia ... ]

  The Maildir format used by qmail has all of the performance
  disadvantages of mh noted above, with the additional problem that the
  files are renamed in order to change their status so you end up having
  to rescan the directory frequently to find the current names (particularly in
  a shared mailbox scenario).  It doesn't scale, and it represents a
  support nightmare;

	[ ... deletia ... ]

So what does this all mean?

       A database (such as used by Exchange) is really a much better
  approach if you want to move away from flat files.  mx and especially
  Cyrus take a tentative step in that direction; mx failed mostly because
  it didn't go anywhere near far enough.  Cyrus goes much further, and
  scores remarkable benefits from doing so.

       However, a well-designed pure database without the overhead of
  separate files would do even better.



	Of course, we all know about the database problems of Exchange, 
and how Exchange admins have to frequently shut everything down and 
clean their databases, how often they crash, how often they 
completely trash all e-mail for all their users, etc....

	I submit that the reason for this is the combination of crappy 
Microsoft-style programming and the fact that no database handles 
BLOBs well.  Even top-notch programmers have real problems with these 
kinds of implementations -- I am intimately familiar with the 
database implementation methods used in the AOL mail system, and 
suffice it to say that this is a really, really hairy nightmare that 
you do *NOT* want.


	That said, storing meta-data in a real database and then using 
external filesystem techniques for actually accessing the data 
should give you the best of both worlds -- the speed of access of the 
database, and the reliability and well-understood access and backup 
mechanisms of filesystems.
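
	As a rough illustration of that split, here is a minimal Python 
sketch, not a proposed design.  Everything in it is an assumption made 
for the example: the use of SQLite, the table layout, the 
content-hash file naming, and the function names (open_index, 
store_message, find_by_subject, load_message) are all hypothetical.  
The index holds only headers and a path; the message bytes live in an 
ordinary file and are never stored as a BLOB.

```python
import hashlib
import sqlite3
from email import message_from_bytes
from pathlib import Path


def open_index(db_path):
    """Open (or create) the meta-data index. Hypothetical schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " msgid TEXT PRIMARY KEY,"
        " subject TEXT,"
        " sender TEXT,"
        " path TEXT NOT NULL)"
    )
    return conn


def store_message(conn, spool_dir, raw):
    """Write the raw message to a file, then index its headers."""
    msg = message_from_bytes(raw)
    digest = hashlib.sha256(raw).hexdigest()
    # The file name is derived from the content and never changes,
    # so no rename is needed to record a status change.
    path = Path(spool_dir) / digest[:2] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(raw)
    # Fall back to the digest if the message has no Message-ID header.
    msgid = str(msg.get("Message-ID", digest))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?)",
            (msgid, str(msg.get("Subject", "")),
             str(msg.get("From", "")), str(path)),
        )
    return msgid


def find_by_subject(conn, needle):
    """Search the index only; no message file is opened or stat()'d."""
    cur = conn.execute(
        "SELECT msgid, path FROM messages WHERE subject LIKE ?",
        (f"%{needle}%",),
    )
    return cur.fetchall()


def load_message(conn, msgid):
    """Fetch the raw bytes from the filesystem via the indexed path."""
    row = conn.execute(
        "SELECT path FROM messages WHERE msgid = ?", (msgid,)
    ).fetchone()
    return Path(row[0]).read_bytes() if row else None
```

	Searches and listings touch only the index, and the message 
files can still be backed up with ordinary filesystem tools.  The 
sketch does not deal with locking, or with a crash between the file 
write and the index update, both of which a real archiver would have 
to handle.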

-- 
Brad Knowles, <brad.knowles at skynet.be>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
     -Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)


