[Spambayes] Are there plans for a daemonized or compiled version of Spambayes?

Tim Peters tim.one at comcast.net
Sat Sep 20 18:35:35 EDT 2003


[Martinez, Michael]
> I've been running Spambayes on our agency Linux smtp gateway for
> several months and am very happy with its classification of spam. My
> gateway is a qmail system and it pipes all incoming email through the
> hammiefilter prior to delivery.

Yup, running a distinct classifier for each email is a pretty crazy design
for high-volume use.

> However, a performance problem arises when the gateway gets hit during
> peak hours with a lot of emails. What happens is the system slows down
> tremendously, in part due to the number of python instances that get
> forked in order to scan the emails.
>
> I was wondering: are there any plans to develop a lightweight,
> daemonized version of Spambayes?

The answer to that depends on you too:  what are your plans?  Python is a C
program, and can be daemonized like any other.  Note the project's pspam
directory sets up a classifier backed by a ZODB database, which clients can
attach to by opening a ZEO connection.  That would be a pleasant way to
let multiple clients hook up at will to an always-running classifier.
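
To make the always-running idea concrete, here is a minimal sketch using only
the Python standard library.  It is not the pspam/ZEO setup and not anything
shipped with Spambayes:  score_message, ScoreHandler, start_server and ask
are hypothetical names, and score_message is a toy stand-in for a real
classifier that would be loaded once when the process starts.

```python
import socket
import socketserver
import threading


def score_message(raw: bytes) -> float:
    """Hypothetical stand-in for a real classifier held in memory."""
    return 1.0 if b"viagra" in raw.lower() else 0.0


class ScoreHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read the whole message; the client signals the end by
        # shutting down the write side of its socket.
        raw = self.rfile.read()
        self.wfile.write(b"%.4f\n" % score_message(raw))


def start_server(host="127.0.0.1", port=0):
    """Start the scoring server in a background thread and return it."""
    server = socketserver.ThreadingTCPServer((host, port), ScoreHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(address, raw: bytes) -> float:
    """Small client: send one message, read back one score."""
    with socket.create_connection(address) as sock:
        sock.sendall(raw)
        sock.shutdown(socket.SHUT_WR)
        reply = sock.makefile("rb").readline()
    return float(reply)
```

The point of the design is that the delivery hook only needs a tiny client
like ask(), so the cost of starting an interpreter and opening the database
is paid once by the server rather than once per message.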

> In the same vein, are there plans to port it to C or another compiled
> language?

AFAICT, the most expensive part of running spambayes now is running Berkeley
database lookups, and the Sleepycat bsddb implementation is already written
in C.  So profile before you presume to know what would help.  Based on what
I've measured, my interest in recoding any of the rest in C is nil.
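
As an illustration of what "profile first" can look like, here is a small
sketch using the standard cProfile and pstats modules.  The tokenize, lookup
and classify functions are hypothetical toys (a dict stands in for the
database), not Spambayes code; only the profiling calls are the point.

```python
import cProfile
import io
import pstats


def tokenize(text):
    return text.lower().split()


def lookup(db, tokens):
    # A dict stands in for the real database; unknown tokens get 0.5.
    return [db.get(tok, 0.5) for tok in tokens]


def classify(db, text):
    probs = lookup(db, tokenize(text))
    return sum(probs) / len(probs)


def profile_report(func, *args, top=20):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()
```

Running profile_report(classify, db, text) on a realistic workload and
reading the cumulative-time column shows where the time actually goes
before anyone commits to rewriting anything.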

> How difficult would this be?

It would be extremely tedious.  You don't escape the need for a database,
for I/O, or for a variety of complex string-processing operations.  The
parts of the Python implementation that supply those to Python programmers
are already coded in C, but much easier to use from Python than from C.



