[Spambayes] SpamBayes as a gateway solution

Bobby Wilkins hwilkins at harrahs.com
Wed Aug 27 16:13:10 EDT 2003


My bayes DB is 17MB, and we have over 1,000 users on an Exchange server, so per-user bayes databases alone could take upwards of 17GB of disk.  Since we currently limit users to 100MB of storage each, that would be an increase of ~20% in our Exchange server disk.  I'd also wager it would mean a more-than-20% increase in disk I/O from the database activity...
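
The estimate above can be sketched as a quick calculation. This is a rough illustration only: the function and variable names are made up for the example, and it assumes every user's database grows to about 17MB, which is a guess rather than a measurement.

```python
# Back-of-the-envelope disk estimate for per-user bayes databases.
# Figures come from the message above; the assumption that every
# user's database reaches 17MB is a guess, not a measurement.
USERS = 1000       # "over 1,000 users"
DB_MB = 17         # size of one bayes database, in MB
QUOTA_MB = 100     # per-user mailbox limit, in MB


def bayes_disk_estimate(users=USERS, db_mb=DB_MB, quota_mb=QUOTA_MB):
    """Return (total bayes MB, total quota MB, fractional increase)."""
    bayes_total = users * db_mb
    quota_total = users * quota_mb
    return bayes_total, quota_total, bayes_total / quota_total


bayes_mb, quota_mb, increase = bayes_disk_estimate()
print(f"bayes databases: {bayes_mb / 1000:.0f}GB")
print(f"mailbox quota:   {quota_mb / 1000:.0f}GB")
print(f"increase:        {increase:.0%}")
```

With these figures the databases come to 17GB against 100GB of quota, i.e. 17%, which rounds to the ~20% quoted. Calling `bayes_disk_estimate(db_mb=2.5)` shows how much the picture changes if a typical database is closer to the 2.5MB mentioned in the quoted message below.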

To be commercially viable, a server-based bayes implementation that kept individual databases would:
 - have to be able to store its data efficiently (grouping 
   common message information while keeping individual
   scores sounds difficult off-the-cuff)
 - potentially use a compressed database engine (don't know
   if there is one of those around, but I've seen a few
   proprietary ones)
 - NOT hammer the server in either CPU or disk I/O
 - NOT grow infinitely (automatically trim old data;
   deciding what "old data" is, however, sounds hard)
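
As an illustration of the last requirement, here is a minimal sketch of one possible trimming rule: drop any token that has not been seen for a given number of days. Everything here is hypothetical. The `prune_old_tokens` function, the `(spam_count, ham_count, last_seen)` record layout, and the 90-day cutoff are assumptions made for the example and are not SpamBayes' actual storage format or behaviour.

```python
import time

SECONDS_PER_DAY = 86400


def prune_old_tokens(db, max_age_days, now=None):
    """Drop tokens not seen within max_age_days; return number removed.

    db maps token -> (spam_count, ham_count, last_seen_epoch_seconds).
    This layout is hypothetical, not SpamBayes' actual storage format.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * SECONDS_PER_DAY
    # Collect stale tokens first so the dict is not modified while iterating.
    stale = [tok for tok, (_, _, last_seen) in db.items() if last_seen < cutoff]
    for tok in stale:
        del db[tok]
    return len(stale)


# Example data with a fixed "now" so the result is repeatable.
now = 1_000_000_000
db = {
    "viagra": (40, 0, now - 2 * SECONDS_PER_DAY),
    "meeting": (0, 55, now - 10 * SECONDS_PER_DAY),
    "fnord": (1, 0, now - 200 * SECONDS_PER_DAY),
}
removed = prune_old_tokens(db, max_age_days=90, now=now)
print(removed, sorted(db))
```

"Last seen" is only one definition of old data. A real trimmer would also have to keep the overall spam and ham message counts consistent with whatever it removes, which is part of why the problem sounds hard.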

-----Original Message-----
From: Peter Beckman [mailto:beckman at purplecow.com]
Sent: Wednesday, August 27, 2003 12:32 PM
To: ROTTENBERG,HAL (HP-USA,ex1)
Cc: 'Itamar Rosenn'; spambayes at python.org
Subject: RE: [Spambayes] SpamBayes as a gateway solution


On Wed, 27 Aug 2003, ROTTENBERG,HAL (HP-USA,ex1) wrote:

 What about a modified SpamBayes, where each incoming email address has/creates
 its own database?  
[snip]

 Then again, my DB is 2.5MB.  
[snip]


