Kenny Pitt kennypitt at hotmail.com
Wed Oct 6 23:15:19 CEST 2004

Wayne Pedersen wrote:
> I am interested in including the SpamBayes engine in one of my
> applications. 
> I'm not a Python expert and rather new to the language.
> I see many source files, but I am interested in the ones which
> actually identify the mail as spam or ham.  The wiki didn't seem to
> offer much other than end user support.  
> Where can I find some documentation on the core engine of SpamBayes?

There really isn't any documentation on the source code other than the
source code itself.  Most of the base classifier stuff is pretty thoroughly
commented, but the level of commenting varies in other areas.

It's a bit complicated to say exactly which source files you would need.
All the source for the actual classifier engine is located in the
"spambayes" subdirectory of the source, but a number of other source files
are mixed in there as well.

In order to use the engine, you also need some sort of driver program that
handles things like opening the correct training database files, obtaining
the message to be classified, and training on misclassified messages.  You will
find examples of this scattered throughout the source tree, primarily in the
"scripts" and "Outlook2000" subdirectories.  A good place to start might be
"scripts/sb_filter.py", which is probably the simplest version of the
filter.  It just reads messages from stdin or a file, classifies or trains
them, and writes the results to stdout.
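
To give a rough idea of what such a driver is responsible for, here is a
minimal, self-contained sketch.  It does not use the SpamBayes API at all:
ToyClassifier, open_database, save_database, message_text and run are
hypothetical names made up for illustration, the scoring is a crude
per-token average rather than the real SpamBayes math, and the JSON
database format is an assumption.  It only shows the three driver jobs
mentioned above: opening the training database, getting the message, and
training when you tell it the correct answer.

```python
import email
import json
import os
import re

TOKEN_RE = re.compile(r"[a-z0-9$'!-]+")


def tokenize(text):
    """Toy tokenizer: lowercase word-like tokens, deduplicated."""
    return set(TOKEN_RE.findall(text.lower()))


class ToyClassifier:
    """Hypothetical stand-in for a real engine; NOT the SpamBayes classifier."""

    def __init__(self):
        self.nspam = 0
        self.nham = 0
        self.counts = {}  # token -> [spam_count, ham_count]

    def learn(self, tokens, is_spam):
        """Record one training message and the tokens it contained."""
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        for tok in tokens:
            rec = self.counts.setdefault(tok, [0, 0])
            rec[0 if is_spam else 1] += 1

    def score(self, tokens):
        """Return a rough spam score in [0, 1]; 0.5 means no evidence."""
        probs = []
        for tok in tokens:
            spam, ham = self.counts.get(tok, (0, 0))
            if spam + ham == 0:
                continue  # token never seen in training
            spam_ratio = spam / max(self.nspam, 1)
            ham_ratio = ham / max(self.nham, 1)
            probs.append(spam_ratio / (spam_ratio + ham_ratio))
        return sum(probs) / len(probs) if probs else 0.5


def open_database(path):
    """Driver job 1: load the training database, or start with an empty one."""
    clf = ToyClassifier()
    if os.path.exists(path):
        with open(path) as f:
            data = json.load(f)
        clf.nspam = data["nspam"]
        clf.nham = data["nham"]
        clf.counts = data["counts"]
    return clf


def save_database(clf, path):
    """Write the training data back to disk after training."""
    with open(path, "w") as f:
        json.dump({"nspam": clf.nspam, "nham": clf.nham,
                   "counts": clf.counts}, f)


def message_text(raw):
    """Driver job 2: pull the Subject and a plain (non-multipart) body."""
    msg = email.message_from_string(raw)
    subject = msg.get("Subject", "")
    body = "" if msg.is_multipart() else msg.get_payload()
    return subject + "\n" + body


def run(db_path, raw_message, train_as=None):
    """Score one message; if train_as is 'spam' or 'ham', train on it first."""
    clf = open_database(db_path)
    tokens = tokenize(message_text(raw_message))
    if train_as is not None:
        # Driver job 3: training on a message whose correct class is known.
        clf.learn(tokens, is_spam=(train_as == "spam"))
        save_database(clf, db_path)
    return clf.score(tokens)
```

A real driver would call something like run(db_path, sys.stdin.read()) and
write the result to stdout.  In this toy version a score above 0.5 leans
spam and a score below 0.5 leans ham.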

Kenny Pitt

