On Wed, 2003-10-29 at 14:38, Chuq Von Rospach wrote:
Hint: look at what INN did when they implemented cycbufs.
Effectively, you create 1-N files, or create files as needed. Each file is N bytes long, pre-allocated on file creation. When you store messages, they're written into the file sequentially (or any other way you want; if you want to get into best-fit allocation and turn this into a malloc()-style heap, be my guest).
Metadata to access the info is then a filename, an lseek() pointer into the file, and the number of bytes to read, plus your normal identifying info. It's fast, it's an efficient use of file pointers, it avoids the worst aspects of the unix file system, and I'm amazed nobody ever thinks to use it for other purposes (or that it took usenet people that long to discover it; I suggested a simpler variant of it back in the 80s and was told inodes are our friends...).
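To make the scheme concrete, here's a rough Python sketch of the idea, not INN's actual cycbuf code; the DataBox name and the 64MB box size are just placeholders I made up:

    import os

    BOX_SIZE = 64 * 1024 * 1024   # pre-allocate 64MB per box (arbitrary choice)

    class DataBox:
        def __init__(self, path):
            self.path = path
            if not os.path.exists(path):
                # Pre-allocate the whole file up front, as described above.
                with open(path, 'wb') as f:
                    f.truncate(BOX_SIZE)
            self.write_offset = 0   # next free byte; a real store would persist this

        def store(self, data):
            """Append a message; return (path, offset, length) metadata."""
            if self.write_offset + len(data) > BOX_SIZE:
                raise IOError('box is full; allocate another file')
            with open(self.path, 'r+b') as f:
                f.seek(self.write_offset)
                f.write(data)
            meta = (self.path, self.write_offset, len(data))
            self.write_offset += len(data)
            return meta

        @staticmethod
        def fetch(path, offset, length):
            """Retrieve a message given its metadata."""
            with open(path, 'rb') as f:
                f.seek(offset)           # the lseek() pointer into the file
                return f.read(length)    # the number of bytes to read

store() hands back exactly the metadata described above (a filename, an offset, and a byte count), which you'd keep alongside your normal identifying info.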
I'm not sure if Andrew Koenig is on this list, but he described an algorithm he developed to quickly find messages in an mbox file. If he's here, maybe he can talk about it.
I really don't like mbox files, primarily because they require munging From lines in the body of the message. MMDF would be better, but I think the ideal from a philosophical point of view would be one-message-per-file, if it can be done efficiently cross-platform. Maybe the file system experts here can provide pointers or advice on exactly which file systems and operating systems make this approach feasible, even for huge message counts.
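Just to illustrate the munging: mbox delimits messages with lines that start with "From ", so any body line that happens to start with "From " has to be escaped. A toy sketch using the common ">From " quoting convention (append_to_mbox and envelope_from are made-up names, and a real separator line also carries a date):

    def append_to_mbox(mbox_path, envelope_from, body_lines):
        with open(mbox_path, 'a') as f:
            f.write('From %s\n' % envelope_from)   # message separator line
            for line in body_lines:
                if line.startswith('From '):
                    line = '>' + line              # munge the body line
                f.write(line + '\n')
            f.write('\n')                          # blank line before the next message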
You can even do expiration/purge/etc. if you want, by moving stuff around and changing the pointers.
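Concretely, purging could be as simple as copying the surviving records into a fresh box and handing back their new pointers. A rough sketch (compact() and its arguments are made-up names, and a real version would pre-allocate the new box too):

    def compact(live_records, new_box_path):
        """live_records: iterable of (key, path, offset, length) for messages to keep."""
        new_offset = 0
        updated = {}
        with open(new_box_path, 'wb') as out:
            for key, path, offset, length in live_records:
                with open(path, 'rb') as f:
                    f.seek(offset)
                    data = f.read(length)
                out.write(data)
                updated[key] = (new_box_path, new_offset, length)
                new_offset += length
        return updated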
I've even thought of using it as the backing store for a picture library. With a nice relational database and a series of these "data boxes", I think you can store data in the best and fastest possible way...
It's a very interesting idea.
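One way I picture the combination: the relational database holds only the identifying info plus the (file, offset, length) pointer, and the raw bytes live in the data boxes. A made-up table layout, just to illustrate:

    import sqlite3

    conn = sqlite3.connect('metadata.db')
    conn.execute("""
        CREATE TABLE IF NOT EXISTS blobs (
            key      TEXT PRIMARY KEY,   -- message-id, picture name, etc.
            box_path TEXT NOT NULL,      -- which data box file
            offset   INTEGER NOT NULL,   -- lseek() position within the box
            length   INTEGER NOT NULL    -- number of bytes to read
        )
    """)
    conn.commit()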
-Barry