speed of spambayes?
Mon Dec 1 02:04:51 CET 2003
jjl at pobox.com (John J. Lee) writes:
> > Spamassassin right now but it takes around 1.5 seconds to process a
> > message on a 2 ghz Athlon. I believe part of that time is spent doing
> > network lookups to check the source addresses against various spam
> > blacklists. I want to crunch through several gigabytes of spam
> > folders to see if any legitimate messages got trapped, so need a fast
> Well, that's only a couple of days even if it's mostly CPU :-)
No, it's much more than a few days. My SpamAssassin-based classifier
seems to process my mail files at about 20 MB per hour (maybe less),
so 50 hours per GB (maybe more). I have about 5 GB of spam that I
want to process, so that's at least 1.5 weeks of nonstop despamming.
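
For anyone who wants to check the arithmetic, here is a rough sketch of
the estimate. The constant and function names are just illustrative, and
it assumes decimal gigabytes (1000 MB per GB), which is what the
50 hours per GB figure implies:

```python
# Back-of-the-envelope estimate from the figures in this message.
MB_PER_HOUR = 20   # observed throughput of the current setup (maybe less)
MB_PER_GB = 1000   # assumption: decimal GB, matching "50 hours per GB"
SPAM_GB = 5        # approximate size of the spam folders


def hours_needed(gigabytes, mb_per_hour=MB_PER_HOUR):
    """Return the hours needed to process `gigabytes` of mail."""
    return gigabytes * MB_PER_GB / mb_per_hour


hours = hours_needed(SPAM_GB)
days = hours / 24
weeks = days / 7
print(f"{hours:.0f} hours = {days:.1f} days = {weeks:.1f} weeks")
# prints "250 hours = 10.4 days = 1.5 weeks"
```

That is a lower bound, since the real throughput may be below 20 MB per
hour.
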
> > classifier with a low false negative rate (it's ok if the false
> > positive rate isn't so low, since almost all the messages in these
> > folders are already spam).
> You might want to tune it a bit first, then.
Hmm, good point; spam filters are usually set up the other way around.