speed of spambayes?

Emile van Sebille emile at fenx.com
Mon Dec 1 03:02:15 CET 2003

Paul Rubin:
> Can someone using spambayes tell me about how fast it runs?

IIRC, Tim Peters did some specific measurements during spambayes

... aah - here it is: (from message id
in http://mail.python.org/pipermail/python-dev/2002-August.txt.gz

[Eric S. Raymond]
> I'm in the process of speed-tuning this now.  I intend for it to be
> blazingly fast, usable for sites that process 100K mails a day, and
> think I know how to do that.  This is not a natural application for
> Python :-).

[Tim Peters]
> I'm not sure about that.  The all-Python version I checked in added
> Python-Dev messages to the database in 2 wall-clock minutes.  The time for
> computing the statistics, and for scoring, is simply trivial (this
> be true of a "normal" Bayesian classifier (NBC), but Graham skips most of
> the work an NBC does, in particular favoring fast classification time over
> fast model-update time).

This was 15 months ago, and I'm not sure how that relates to GBs per
howlongs, but it's something to start with.
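
To make the quoted point about cheap scoring concrete, here is a minimal
sketch of Graham-style scoring as I understand it from his "A Plan for
Spam" essay. It is not spambayes code. The function names, the 0.4
default for unknown tokens, the 0.01/0.99 clamps, the doubled ham counts
and the 15-token cutoff are assumptions taken from that essay, and the
token counts are assumed to have been collected already during training.
Scoring only combines the few most extreme per-token probabilities,
which is why classification is fast once the table is built.

```python
from math import prod  # Python 3.8+


def token_prob(token, spam_counts, ham_counts, nspam, nham):
    """Per-token spam probability, computed once at training time.

    Assumes nspam and nham (numbers of training messages) are nonzero.
    """
    s = spam_counts.get(token, 0)
    h = 2 * ham_counts.get(token, 0)  # ham counted double to limit false positives
    if s + h < 5:
        return 0.4  # rarely seen token: assumed mildly innocent default
    ps = min(1.0, s / nspam)
    ph = min(1.0, h / nham)
    # Clamp so no single token is treated as absolute proof either way.
    return max(0.01, min(0.99, ps / (ps + ph)))


def score(tokens, probs, n_extreme=15):
    """Combine only the n_extreme probabilities farthest from 0.5."""
    ps = [probs.get(t, 0.4) for t in set(tokens)]
    ps.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = ps[:n_extreme]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)
```

Each call to score() is a dictionary lookup per distinct token plus a
sort and two short products, so the per-message cost is small next to
the cost of building the table.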


Emile van Sebille
emile at fenx.com
