speed of spambayes?

Paul Rubin http
Mon Dec 1 02:04:51 CET 2003


jjl at pobox.com (John J. Lee) writes:
> > Spamassassin right now but it takes around 1.5 seconds to process a
> > message on a 2 ghz Athlon.  I believe part of that time is spent doing
> > network lookups to check the source addresses against various spam
> > blacklists.  I want to crunch through several gigabytes of spam
> > folders to see if any legitimate messages got trapped, so need a fast
> 
> Well, that's only a couple of days even if it's mostly CPU :-)

No, it's much more than a couple of days.  My SpamAssassin-based
classifier seems to process my mail files at about 20 MB per hour
(maybe less), so 50 hours per GB (maybe more).  I have about 5 GB of
spam that I want to process, so that's about 250 hours, or at least
1.5 weeks of nonstop despamming.
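
Spelled out as a quick sketch, the estimate looks roughly like this.
The constant and function names are made up for illustration, and it
assumes decimal gigabytes (1 GB = 1000 MB), which is what the
50-hours-per-GB figure implies:

```python
# Back-of-the-envelope estimate from the figures in this message.
MB_PER_HOUR = 20   # observed throughput of the current classifier (maybe less)
MB_PER_GB = 1000   # assumption: decimal gigabytes
TOTAL_GB = 5       # approximate size of the spam folders


def estimate_hours(total_gb, mb_per_hour, mb_per_gb=MB_PER_GB):
    """Return the hours needed to process total_gb at mb_per_hour."""
    return total_gb * mb_per_gb / mb_per_hour


hours = estimate_hours(TOTAL_GB, MB_PER_HOUR)
days = hours / 24
weeks = days / 7
print(f"{hours:.0f} hours = {days:.1f} days = {weeks:.1f} weeks")
# prints: 250 hours = 10.4 days = 1.5 weeks
```

Since 20 MB per hour is an upper bound on the observed rate, the real
run time would be at least this long.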

> > classifier with a low false negative rate (it's ok if the false
> > positive rate isn't so low, since almost all the messages in these
> > folders are already spam).
> 
> You might want to tune it a bit first, then.

Hmm, good point.  Spam filters are usually set up the other way.
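
To make the tuning idea concrete, here is a minimal sketch.  It
assumes the classifier gives each message a score between 0 and 1
(higher means more spam-like).  The function name, the cutoff values
and the scores are all hypothetical and not taken from any particular
filter:

```python
def split_by_cutoff(scores, spam_cutoff):
    """Split message indices into (discard, review).

    Messages scoring at or above spam_cutoff are treated as spam;
    everything else is set aside for a manual look.
    """
    discard = [i for i, s in enumerate(scores) if s >= spam_cutoff]
    review = [i for i, s in enumerate(scores) if s < spam_cutoff]
    return discard, review


# Made-up scores, for illustration only.
scores = [0.99, 0.97, 0.80, 0.55, 0.10]

print(split_by_cutoff(scores, 0.50))  # ([0, 1, 2, 3], [4])
print(split_by_cutoff(scores, 0.90))  # ([0, 1], [2, 3, 4])
```

Raising the cutoff sends more messages to the review pile, so fewer
legitimate messages are thrown away unseen, at the cost of more spam
to look through by hand.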

Thanks.



