[Spambayes] filtering in the face of disk quotas or full disks
bill at parducci.net
Sat Mar 22 10:27:45 EST 2003
Skip Montanaro wrote:
> This is also not what I'm worried about. While we need to provide means to
> manage the size of the database, that is essentially an offline activity.
> I'm worried simply about the situation where a mail message arrives and
> there's no disk space left to process it properly.
ok, but to date, this is a *manual* 'offline activity' that takes any number of homegrown solutions to resolve. while this is operationally acceptable to advanced users such as those that mind this list, i believe that it is impractical for the vast majority of those who could benefit from this solution (but are unable/unwilling to keep multiple copies of mail in numerous files, etc.).
> You really can't control the way the database file size grows. Since it's
> implementing a hash, once the key density gets too high, it expands the
> database dramatically and shuffles things all around. In between these
> striking leaps in size, the database grows little, if at all, for each new
> key added.
perhaps not with the current hash architecture, but if you have the ability to maintain the size of the input pool (possibly via a secondary data store that handles raw tokens), then it seems illogical that the size of the db cannot be managed within reason.
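
a rough sketch of what capping the input pool could look like, in python. everything here is hypothetical (BoundedTokenPool, max_tokens, learn, counts are made-up names, not spambayes code), and it evicts the least recently seen tokens, which is just one possible policy. note that it bounds the number of entries, not the bytes on disk; how the backing store turns entries into file size is a separate question.

```python
from collections import OrderedDict


class BoundedTokenPool:
    """Hypothetical token store capped at max_tokens entries.

    Not Spambayes code: a sketch of keeping the input pool at a fixed
    size by evicting the least recently seen tokens.
    """

    def __init__(self, max_tokens):
        if max_tokens < 1:
            raise ValueError("max_tokens must be at least 1")
        self.max_tokens = max_tokens
        # token -> [spam_count, ham_count], oldest entry first
        self._counts = OrderedDict()

    def learn(self, tokens, is_spam):
        """Record one message's tokens, then trim the pool to the cap."""
        for token in tokens:
            counts = self._counts.pop(token, None)
            if counts is None:
                counts = [0, 0]
            counts[0 if is_spam else 1] += 1
            # Re-inserting moves the token to the "most recent" end.
            self._counts[token] = counts
        while len(self._counts) > self.max_tokens:
            # Drop the least recently seen token.
            self._counts.popitem(last=False)

    def counts(self, token):
        """Return (spam_count, ham_count); (0, 0) if the token is unknown."""
        return tuple(self._counts.get(token, (0, 0)))

    def __len__(self):
        return len(self._counts)
```

the scoring database would then be rebuilt or updated from this pool, so its key count never goes past max_tokens.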
> Let me restate the problem: I just don't want Spambayes to be accused,
> rightly or wrongly, of losing mail because a disk quota was exceeded or a
> disk partition filled up. Everything else is merely an inconvenience. Lost
> mail can't be recovered. What motivated this was an (incorrect, in my
> opinion) assumption by a sys admin where I work that because there was a
> failure in a mail setup using procmail and SpamAssassin when the disk quota
> was exceeded that it was obviously a SpamAssassin problem.
good luck preventing misplaced accusations! :o)
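
that said, on the narrow problem of a message arriving when there is no space left, a minimal sketch of failing open in python: if classification hits an out-of-space error, hand back the original message untouched instead of dropping it. filter_message and classify are hypothetical names, and this assumes the classifier surfaces a full disk or exceeded quota as an OSError with errno ENOSPC or EDQUOT, which may not hold for every database backend.

```python
import errno

# errno values that mean "no space left"; EDQUOT is missing on some platforms.
_NO_SPACE = {errno.ENOSPC}
if hasattr(errno, "EDQUOT"):
    _NO_SPACE.add(errno.EDQUOT)


def filter_message(raw_message, classify):
    """Return (message, was_classified) without ever dropping the message.

    classify is a hypothetical callable that takes the raw message and
    returns a tagged copy. It may raise OSError if it needs to write to
    disk and cannot.
    """
    try:
        return classify(raw_message), True
    except OSError as exc:
        if exc.errno in _NO_SPACE:
            # Out of space: deliver the message unclassified.
            return raw_message, False
        # Any other error is not ours to hide.
        raise
```

the caller still has to deliver the returned message somewhere with space, so this only keeps the filter itself from being the step that loses mail.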