[Spambayes] filtering in the face of disk quotas or full disks

Skip Montanaro skip at pobox.com
Sat Mar 22 09:52:33 EST 2003


    bill> this issue, in combination with some of the manual processes
    bill> posted to the list to maintain db size and relevancy has made me
    bill> wonder if spambayes shouldn't incorporate the ability to FIFO
    bill> token/training info.

This is also not what I'm worried about.  While we need to provide means to
manage the size of the database, that is essentially an offline activity.
I'm worried simply about the situation where a mail message arrives and
there's no disk space left to process it properly.

You really can't control the way the database file size grows.  Since the
database implements a hash table, once the key density gets too high it
expands dramatically and shuffles things all around.  In between these
striking leaps in size, the database grows little, if at all, for each new
key added.
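
To illustrate the growth pattern, here is a toy model in Python.  It is only
a sketch: the initial bucket count, the 0.75 density threshold and the
doubling policy are assumptions chosen for illustration, not the behavior of
any particular database library.

```python
def bucket_count_history(num_keys, initial_buckets=8, max_load=0.75):
    """Return the bucket count after each insertion into a toy hash table.

    All parameters are illustrative assumptions, not values taken from a
    real database implementation.
    """
    buckets = initial_buckets
    history = []
    for stored in range(1, num_keys + 1):
        # Once the key density exceeds the threshold, double the table.
        # A real table would also rehash (shuffle) every key at this point.
        if stored / buckets > max_load:
            buckets *= 2
        history.append(buckets)
    return history


history = bucket_count_history(100)

# Insertion counts at which the table size jumped.
growth_points = [
    i + 1 for i in range(1, len(history)) if history[i] != history[i - 1]
]
print(growth_points)
```

In this model only 5 of the 100 insertions change the table size at all, and
each of those doubles it.  That is the same shape as the leaps described
above: long stretches with no growth, then a sudden need for a lot more space.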

Let me restate the problem: I just don't want Spambayes to be accused,
rightly or wrongly, of losing mail because a disk quota was exceeded or a
disk partition filled up.  Everything else is merely an inconvenience.  Lost
mail can't be recovered.  What motivated this was an assumption (incorrect,
in my opinion) by a sys admin where I work.  A mail setup using procmail and
SpamAssassin failed when the disk quota was exceeded, and the conclusion
drawn was that it was obviously a SpamAssassin problem.
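
To make the "never lose mail" requirement concrete, here is a minimal sketch
of a fail-safe wrapper.  It is not Spambayes code: filter_message, the
classify callable and the X-Example-Classification header are all
hypothetical names used only for illustration.  The idea is simply that if
scoring fails for any reason, including a full disk or an exceeded quota, the
original message is passed through untouched.

```python
import errno


def filter_message(raw, classify):
    """Return the bytes to deliver; never raise and never drop the message.

    `classify` is a hypothetical callable that takes the raw message bytes
    and returns a verdict such as b"spam" or b"ham".
    """
    try:
        verdict = classify(raw)
    except Exception:
        # Disk full, quota exceeded, unreadable database, and so on.
        # Deliver the message unmodified rather than risk losing it.
        return raw
    # Hypothetical header name, used here only for illustration.
    return b"X-Example-Classification: " + verdict + b"\n" + raw


def failing_classifier(raw):
    # Simulates a classifier that needs disk space and finds none left.
    raise OSError(errno.ENOSPC, "No space left on device")


def working_classifier(raw):
    return b"ham"


message = b"Subject: hello\n\nbody text\n"
unscored = filter_message(message, failing_classifier)
scored = filter_message(message, working_classifier)
```

With the failing classifier the message comes back byte-for-byte identical,
so the worst case is an unscored message, which is merely an inconvenience.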

Skip


