Stuart D. Gathman
stuart at bmsi.com
Mon Jul 14 22:48:07 CEST 2003
I have created RPMs and a Python wrapper for the DSPAM project. The DSPAM
project was created by NetworkDweebs and provides a Bayesian spam-filtering
Mail Delivery Agent. More importantly, it exports the core engine as a C
library, and that library is what the Python wrapper exposes. I also took
the liberty of adding a 'tokenize' entry point to the core library, so that
Python code can use its own database structure. Tokenizing is the only step
that needs C speed.
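
To illustrate the idea of Python code keeping its own database structure once
tokenizing is delegated, here is a minimal sketch. Everything in it is an
assumption for illustration: the tokenize() function below is a pure-Python
stand-in (the real entry point's name, arguments, and return type are not
described in this message), and TokenStore is a hypothetical in-memory
structure, not part of the wrapper.

```python
import re
from collections import Counter


def tokenize(message_text):
    # Hypothetical stand-in for the wrapper's C 'tokenize' entry point.
    # The real call's signature is not documented in this message.
    return re.findall(r"[a-z0-9$'!-]+", message_text.lower())


class TokenStore:
    """Toy in-memory token database: per-token spam and innocent counts."""

    def __init__(self):
        self.spam = Counter()
        self.innocent = Counter()

    def train(self, message_text, is_spam):
        counts = self.spam if is_spam else self.innocent
        # Count each distinct token once per message.
        counts.update(set(tokenize(message_text)))


store = TokenStore()
store.train("Buy cheap pills now", is_spam=True)
store.train("Meeting notes for Monday", is_spam=False)
store.train("Cheap flights now", is_spam=True)
```

With the tokenizer isolated behind one function, the storage side could be
swapped for any structure the Python code prefers without touching the C step.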
There are no docs yet for the Python wrapper, but good short examples in
Python are included in the source. The Python versions of some of the
DSPAM utilities, like dspam_corpus.py, are faster than their C counterparts
(26 vs. 33 secs)! (This is because the Python code does not open and close
the database for each message.) The web page has an example of using
dspam in a Python milter.
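
The timing difference mentioned above comes down to how often the database is
opened. The sketch below shows the two patterns side by side. It is not the
code from dspam_corpus.py: it uses the standard-library dbm.dumb module as a
stand-in database, and the function names are hypothetical.

```python
import dbm.dumb
import os
import tempfile


def bump(db, token):
    # Increment the stored count for one token (values are ASCII digits).
    key = token.encode()
    current = int(db.get(key, b"0").decode())
    db[key] = str(current + 1).encode()


def train_reopening(path, messages):
    # Opens and closes the database once per message.
    for tokens in messages:
        with dbm.dumb.open(path, "c") as db:
            for token in tokens:
                bump(db, token)


def train_open_once(path, messages):
    # Opens the database once for the whole corpus.
    with dbm.dumb.open(path, "c") as db:
        for tokens in messages:
            for token in tokens:
                bump(db, token)


def read_counts(path):
    with dbm.dumb.open(path, "r") as db:
        return {key.decode(): int(db[key].decode()) for key in db.keys()}


corpus = [["cheap", "now"], ["cheap", "pills"]]
workdir = tempfile.mkdtemp()
path_a = os.path.join(workdir, "reopening")
path_b = os.path.join(workdir, "open_once")
train_reopening(path_a, corpus)
train_open_once(path_b, corpus)
```

Both functions leave the same counts on disk; the second simply avoids the
per-message open and close overhead.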