skip@pobox.com wrote:
So I come, hat in hand, looking for some brave Mailman developer who is willing to test out my modified version of gate_news. You can grab the latest version from Launchpad:
bzr pull lp:~smontanaro/mailman/SpamBayes
There is an associated doc repo with a few instructions for setting up the SpamBayes stuff:
bzr pull lp:~smontanaro/mailman-administrivia/SpamBayes
A sample spambayes.ini file lives in the cron directory alongside gate_news. It's basically what I would use on mail.python.org if I had the necessary savvy to do this myself.
If you have any questions I'd be happy to answer them. I can help you get SpamBayes installed if you've never done that before. (It's quite straightforward if you're familiar with the normal Python setup.py thing or use setuptools.) I can also provide ham and spam training sets from mail.python.org so you can construct a useful database for SpamBayes to score messages against. (You could run with an empty training database but that would just cause all messages to score as "unsure" and be held as possible spam.)
Skip,
I have installed SpamBayes and am running your modified gate_news. The test list is <http://www.msapiro.net/mailman/listinfo/python> and it is gating comp.lang.python from news.bu.edu.
Currently I have
#BAYESCUSTOMIZE=/usr/local/mailman/cron/spambayes.ini
in mailman's crontab. I.e., the line is commented out, so SpamBayes is not actually being invoked.
I could use the training sets and some advice on how to proceed. Presumably the files
lookup_ip_cache:/usr/local/spambayes-corpus/dnscache.pck
crack_image_cache:/usr/local/spambayes-corpus/imagecache.pck
persistent_storage_file:/etc/spambayes/wordprobs.cdb
referenced in your spambayes.ini get created when the training sets are processed, but I'm unclear on that part of the process.
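As a rough way to see which of those files already exist before uncommenting BAYESCUSTOMIZE, something like the following could be run against the ini file. It is only a sketch: the function name referenced_files is made up for illustration, and because I am not sure which sections of spambayes.ini these options belong to, it simply searches every section. It uses only the Python standard library and does not call SpamBayes itself.

```python
import configparser
import os

# Option names copied from the spambayes.ini lines quoted above.
OPTIONS = ("lookup_ip_cache", "crack_image_cache", "persistent_storage_file")


def referenced_files(ini_text):
    """Return {option_name: (path, exists)} for each option found.

    Hypothetical helper: it looks in every section rather than assuming
    which section SpamBayes expects each option to live in.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    found = {}
    for section in parser.sections():
        for name in OPTIONS:
            if parser.has_option(section, name):
                path = parser.get(section, name).strip()
                found[name] = (path, os.path.exists(path))
    return found
```

Reading /usr/local/mailman/cron/spambayes.ini into a string and passing it to referenced_files() would show which paths are still missing. Whether SpamBayes creates them during training is exactly the open question, so this only reports the current state.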
-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan