At 11:54 AM -0800 2003/10/29, Peter C. Norton wrote:
It always confounds me that people will go for database voodoo and deride filesystems when a filesystem is a highly specialised database in and of itself.
I am aware of that. I was aware of that when I first gave my
invited talk entitled "Design and Implementation of Highly Scalable E-mail Systems", which you can find at <http://www.shub-internet.org/brad/papers/dihses/>.
Note that Eric Allman (author of the original Ingres database,
among many other things) and Kirk McKusick (author of the Berkeley Fast File System) were in the audience. I did not embarrass myself.
Databases aren't meant to be storage for abstract binary data. They're meant to be a searchable index of data of types they understand.
Correct. And despite all claims to the contrary from the
vendors, no database properly "understands" binary large objects, nor do they give you another datatype they do actually understand that would be suitable for the storage of e-mail message bodies.
Assuming I had a clean slate to start a database project for a mail store, personally I'd much rather prototype it in something like postgresql where I could add data types to deal with email. I could then make header types, text types, mime types classes, etc. Then I could test to see if it was a good idea to implement it.
IMO, that would be an exercise in futility. We've been down this
road a million times before. We don't need to go down it again to know that the result is not likely to be successful, especially when we have alternatives that are proven to work well -- we store the message meta-data in the database, and then the message bodies in a separate message store akin to INN timecaf/timehash "heaps" (see <http://www.shub-internet.org/brad/papers/dihses/lisa2000/sld090.htm>).
I think using a standard sql database for doing mail operations is asking for trouble. Standard databases don't know how to parse rfc822/2822 headers and that means that you've got to either write a whole lot of stored procedures in a clunky query language (or java!?!?!) and then maintain it, or you've got to do it all in the imap/pop3/whatever server which means a whole lot of yammering traffic between the database and the I/P/W server all the time, which == slow.
You don't ask the database to understand or parse RFC 2822 headers
or messages. That's up to your application. You store the meta-data in datatypes the database already understands, and the message bodies in the separate message store described above.
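
To make the split concrete, here is a rough sketch of that arrangement in Python. It is only an illustration under assumptions of my own: SQLite stands in for whatever database you would really use, a single append-only file stands in for a timecaf-style heap, and the names `MailStore`, `deliver`, `find_by_sender`, and `fetch` are hypothetical. The application parses the headers, the database holds only plain columns plus an offset and a length, and the message itself is never handed to the database.

```python
import email
import os
import sqlite3
from email import policy


class MailStore:
    """Hypothetical sketch: meta-data in SQLite, raw messages in one heap file."""

    def __init__(self, db_path, heap_path):
        self.heap_path = heap_path
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " id INTEGER PRIMARY KEY,"
            " sender TEXT, subject TEXT,"
            " heap_offset INTEGER NOT NULL, length INTEGER NOT NULL)"
        )

    def deliver(self, raw):
        # The application parses the headers; the database never sees RFC 2822.
        msg = email.message_from_bytes(raw, policy=policy.default)
        # Append the untouched message to the heap and remember where it landed.
        with open(self.heap_path, "ab") as heap:
            heap.seek(0, os.SEEK_END)
            offset = heap.tell()
            heap.write(raw)
        cur = self.db.execute(
            "INSERT INTO messages (sender, subject, heap_offset, length)"
            " VALUES (?, ?, ?, ?)",
            (str(msg.get("From", "")), str(msg.get("Subject", "")),
             offset, len(raw)),
        )
        self.db.commit()
        return cur.lastrowid

    def find_by_sender(self, sender):
        # Searching touches only the database, never the message bodies.
        rows = self.db.execute(
            "SELECT id FROM messages WHERE sender = ? ORDER BY id", (sender,)
        )
        return [row[0] for row in rows]

    def fetch(self, msg_id):
        # One indexed lookup, then one seek and one read in the heap file.
        row = self.db.execute(
            "SELECT heap_offset, length FROM messages WHERE id = ?", (msg_id,)
        ).fetchone()
        if row is None:
            raise KeyError(msg_id)
        with open(self.heap_path, "rb") as heap:
            heap.seek(row[0])
            return heap.read(row[1])

    def close(self):
        self.db.close()
```

A real store would also need locking, expiry, and crash recovery between the heap write and the database insert, none of which is attempted here.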
-- Brad Knowles, <brad.knowles@skynet.be>
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)