Re: [Mailman-Developers] Speaking about kitties (or archivers)

Jeebus, finally catching up on this thread.
On Apr 24, 2012, at 01:41 PM, Stephen J. Turnbull wrote:
IMHO, premature optimization. Among other things, there isn't going to be a "the" archiver-core. Mailman should provide "a" archiver-core, and I think it should be based on maildir (which is apparently Barry's intuition, too). Theory and implementation of maildir are simple and robust, and that allows us to concentrate on the archiver interface.
In fact, the prototype archiver already implements maildir storage. This storage isn't exposed via REST yet though.
-Barry

- Barry Warsaw <barry@list.org>:
In fact, the prototype archiver already implements maildir storage. This storage isn't exposed via REST yet though.
Maildir is robust, but it doesn't scale under high load. You can add indexes, but sooner or later they hit their limits too.
Concerning mailbox formats: Timo Sirainen's current approach is to collect a limited number of messages in one file and then start a new one. This combines the best of both worlds, mbox and maildir, in mdbox <http://wiki2.dovecot.org/MailboxFormat/dbox>.
Con: It takes an index to know which file a message is located in.
Pro: An order of magnitude faster to back up, which I would keep an eye on because mailing list archives tend to be large and backing up a directory of small files is a well-known performance killer.
I can put you in contact with Timo if you like.
p@rick
-- state of mind ()
Franziskanerstraße 15 Telefon +49 89 3090 4664 81669 München Telefax +49 89 3090 4666
Amtsgericht München Partnerschaftsregister PR 563
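
To make the trade-off above concrete, here is a rough sketch of the "limited number of messages per file, plus an index" idea. It is not Dovecot's mdbox on-disk format: the class name `ChunkedStore`, the `chunk.N` file names, and the JSON index are all made up for illustration. It only shows why an index becomes necessary and why there are fewer files to back up.

```python
import json
from pathlib import Path


class ChunkedStore:
    """Toy store: several messages per data file, located through an index.

    Hypothetical layout for illustration only; NOT the real mdbox format.
    """

    def __init__(self, root, max_per_file=3):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.max_per_file = max_per_file
        self.index_path = self.root / "index.json"
        # The index maps message-id -> [data file name, byte offset, length].
        if self.index_path.exists():
            self.index = json.loads(self.index_path.read_text())
        else:
            self.index = {}

    def _current_file(self):
        # Start a new data file once the current one holds max_per_file
        # messages. (Assumes message-ids are unique.)
        number = len(self.index) // self.max_per_file
        return self.root / f"chunk.{number}"

    def add(self, message_id, raw_bytes):
        path = self._current_file()
        # The new message starts where the data file currently ends.
        offset = path.stat().st_size if path.exists() else 0
        with open(path, "ab") as fh:
            fh.write(raw_bytes)
        self.index[message_id] = [path.name, offset, len(raw_bytes)]
        self.index_path.write_text(json.dumps(self.index))

    def get(self, message_id):
        # Without the index there is no way to tell which file to open.
        name, offset, length = self.index[message_id]
        with open(self.root / name, "rb") as fh:
            fh.seek(offset)
            return fh.read(length)
```

With `max_per_file=3`, four posts end up in two data files plus the index, where a maildir would hold four separate files.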

On Jun 02, 2012, at 07:19 AM, Patrick Ben Koetter wrote:
I can put you in contact with Timo if you like.
I've chatted with him a few times (I'm a Dovecot user and fan).
Would someone like to take a crack at implementing this format either in, or on top of, the Python stdlib mailbox module:
http://docs.python.org/library/mailbox.html
I'd much rather use something standard (and maintained by someone else!) than a bunch of custom code specific to Mailman.
Cheers, -Barry
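
For reference, this is roughly what plain maildir storage looks like with the stdlib `mailbox` module mentioned above. It is a minimal sketch, not the prototype archiver's actual code; the helper names `archive_message` and `build_demo_message`, the addresses, and the directory layout are assumptions made for the example.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path


def archive_message(maildir_path, msg):
    """Store one message in a maildir and return the key it was given."""
    # create=True builds the tmp/, new/ and cur/ subdirectories if missing.
    box = mailbox.Maildir(str(maildir_path), create=True)
    try:
        return box.add(msg)
    finally:
        box.close()


def build_demo_message():
    """Build a small test post (all header values are placeholders)."""
    msg = EmailMessage()
    msg["From"] = "anne@example.com"
    msg["To"] = "test@lists.example.com"
    msg["Subject"] = "Hello archive"
    msg["Message-ID"] = "<demo-1@example.com>"
    msg.set_content("A short test post.\n")
    return msg


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "archive"
        key = archive_message(path, build_demo_message())
        box = mailbox.Maildir(str(path), create=False)
        print(key, box[key]["Subject"])
```

Each message becomes one file under `new/`, which is what makes maildir simple and robust, and also what produces the many-small-files backup problem raised earlier in the thread.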

- Barry Warsaw <barry@list.org>:
I've chatted with him a few times (I'm a Dovecot user and fan).
+1
Would someone like to take a crack at implementing this format either in, or on top of, the Python stdlib mailbox module:
http://docs.python.org/library/mailbox.html
I'd much rather use something standard (and maintained by someone else!) than a bunch of custom code specific to Mailman.
Speaking of standards: I discussed this with Timo in private mail, and he suggests not using mdbox directly. It is still under development, and he thinks it should not become a standard format like mbox or maildir.
To quote Timo from our conversation:
I'd prefer using mdbox via Dovecot itself, either via LMTP or dovecot-lda or maybe by adding some "doveadm save" command. Anything else I think would be problematic. Even using Dovecot's library for accessing mdbox would be problematic in some installations if you didn't also read several settings from dovecot.conf (e.g. lock_method).
Note: The "'doveadm save' command" above refers to an idea where Dovecot would import to-be-archived messages from an MM3 mailing list into Dovecot (and whatever format has been defined for the mailbox).
Maybe - and to pick up an idea Barry had mentioned to me a long time ago about mailing list management directly from a mail client - we would gain the most if we implemented an LMTP client as archiver (better: archive transport).
This would introduce the chance to choose among many LMTP servers and their specific, optimized storage formats (Dovecot -> mdbox, Cyrus IMAP -> ?), including those servers' various IMAP SEARCH capabilities for searches in archives.
p@rick

On Jun 03, 2012, at 11:21 PM, Patrick Ben Koetter wrote:
Maybe - and to pick up an idea Barry had mentioned to me a long time ago about mailing list management directly from a mail client - we would gain the most if we implemented an LMTP client as archiver (better: archive transport).
This would introduce the chance to choose among many LMTP servers and their specific, optimized storage formats (Dovecot -> mdbox, Cyrus IMAP -> ?), including those servers' various IMAP SEARCH capabilities for searches in archives.
I really like the idea of an LMTP client archiver (transport). Does anybody want to take a crack at this?
-Barry
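
As a starting point for anyone picking this up, the LMTP client archiver idea could look roughly like the sketch below, built on the stdlib `smtplib.LMTP` class. The function name `archive_via_lmtp`, the default host and port, and the `client_factory` parameter (there only so a fake client can be swapped in for testing) are assumptions; this is not Mailman's archiver interface.

```python
import smtplib
from email.message import EmailMessage


def archive_via_lmtp(msg, envelope_sender, archive_rcpt,
                     host="localhost", port=24,
                     client_factory=smtplib.LMTP):
    """Hand one message to an LMTP server for storage.

    Returns the dict of refused recipients from sendmail(); an empty
    dict means the server accepted the message for archive_rcpt.
    host/port defaults are placeholders, not a recommended setup.
    """
    with client_factory(host, port) as client:
        # The LMTP server (Dovecot, Cyrus, ...) decides the storage format.
        return client.sendmail(envelope_sender, [archive_rcpt],
                               msg.as_bytes())


def build_post():
    """Build a small test post (all header values are placeholders)."""
    msg = EmailMessage()
    msg["From"] = "anne@example.com"
    msg["To"] = "test@lists.example.com"
    msg["Subject"] = "Archive me"
    msg.set_content("A short test post.\n")
    return msg
```

Something like `archive_via_lmtp(build_post(), "test-bounces@lists.example.com", "test-archive@example.com")` would then deliver into whatever mailbox the server maps that recipient to; both addresses are hypothetical.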