[Spambayes] Server side instructions for qmail

Mon Sep 22 08:44:31 EDT 2003

I noticed that the FAQ contains "Postfix notes" on a server side
-server-side-spambayes-solution>). I'd like to submit the following
"Qmail notes":



Spambayes is installed on our agency's SMTP / MX gateway. This machine
runs Red Hat Linux 7.1, qmail 1.03, qmail-scanner 1.16, and hbedv's
"Antivir". Incoming mail is accepted by tcpserver and handed off to
qmail-scanner. Qmail-scanner runs the antivirus software ("antivir") and
hands the message to qmail. Qmail accepts local delivery on all
domain-bound email. This email is delivered to ~alias/.qmail-default.
(This is a standard configuration for qmail.)


~alias/.qmail-default pipes each email through Spambayes. The
.qmail-default is set up as follows:


| /usr/local/spambayes/hammiefilter.py -d /usr/local/spambayes/.hammiedb
| qmail-remote MSServer.csrees.usda.gov "$SENDER" $DEFAULT@csrees.usda.gov


The permissions for the /usr/local/spambayes directory are set with the
following command:

chown -R qmailq.qmail /usr/local/spambayes


As shown above, there are two pipes. The first pipes the message through
Spambayes. The second hands it to qmail's remote delivery mechanism,
which delivers the email to our Exchange Server.
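
For anyone curious what the first pipe does to a message, here is a
minimal Python sketch of a tagging filter. It is not the hammiefilter.py
code. The function names are hypothetical, the score comes from a
caller-supplied function rather than a trained database, the header name
X-Spambayes-Classification is what Spambayes writes as far as I know,
and the 0.20 / 0.90 cutoffs are assumed defaults that you should check
against your own configuration.

```python
import email

# Assumed cutoffs; verify against your own Spambayes configuration.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90

# Assumed header name; the Outlook rules only need the
# "Spambayes-Classification: ..." part of it to match.
HEADER = "X-Spambayes-Classification"


def classify(score):
    """Map a spam probability in [0, 1] to ham, unsure, or spam."""
    if score < HAM_CUTOFF:
        return "ham"
    if score < SPAM_CUTOFF:
        return "unsure"
    return "spam"


def tag_message(raw, score):
    """Return the message text with a classification header added."""
    msg = email.message_from_string(raw)
    del msg[HEADER]  # drop any tag already present (no error if absent)
    msg[HEADER] = classify(score)
    return msg.as_string()


def run_filter(infile, outfile, score_fn):
    """Read one message, tag it, and write it back out.

    score_fn is a hypothetical stand-in for the real classifier: it
    takes the raw message text and returns a spam probability.
    """
    raw = infile.read()
    outfile.write(tag_message(raw, score_fn(raw)))
```

In a .qmail pipe, a filter like this would be called as
run_filter(sys.stdin, sys.stdout, score_fn), so that the tagged message
on standard output can be handed to the next command.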


Delivered emails are filtered on a per-user basis in Outlook by setting
the Rules to detect the Spambayes tag in the message header. If the tag
reads "Spambayes-Classification: spam" then the email is either deleted
or placed in the user's Spam folder. If it reads
"Spambayes-Classification: unsure" then it's placed in the user's Unsure
folder. If it reads "Spambayes-Classification: ham" then nothing special
is done - it is delivered to the user's Inbox as normal. 
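
The Outlook rules themselves are set up in the Rules dialog, not in
code, but the decision they make can be sketched in a few lines of
Python. This is only an illustration of the logic described above. The
function name choose_folder is hypothetical, and the match is a plain
substring test on the headers, which is how I assume a "specific words
in the message header" rule behaves.

```python
from email import message_from_string


def choose_folder(raw):
    """Pick a folder name based on the Spambayes tag in the headers."""
    msg = message_from_string(raw)
    # Rebuild the header block only, so text in the body cannot match.
    headers = "\n".join(f"{name}: {value}" for name, value in msg.items())
    if "Spambayes-Classification: spam" in headers:
        return "Spam"
    if "Spambayes-Classification: unsure" in headers:
        return "Unsure"
    # Tagged ham, or not tagged at all: deliver as normal.
    return "Inbox"
```

Because the test is a substring match, it works whether or not the
header name carries an "X-" prefix or the value is followed by a score.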


The user is given the choice of whether to set up his rules or not. 


Training of Spambayes is done in the following manner. Our users are
given my email address and are told that, if they like, they may send
emails to me that they consider spam, or that end up being
"mis-classified" by the system. I created two directories:


/usr/local/spambayes/training/spamdir
/usr/local/spambayes/training/hamdir


The emails sent to me by the users are retrieved from the qmail archive
and placed into the appropriate directory. When I'm ready to do a
training run (which I do once or twice a month), I go through the
following steps:


1.	I use a simple script to insert a blank From: line at the top of
each email.
2.	I use a simple script to remove the qmail-scanner header from
the bottom of each email.
3.	I remove uuencoded attachments.
4.	cat /usr/local/spambayes/training/spamdir/* >>
5.	cat /usr/local/spambayes/training/hamdir/* >>
6.	/usr/local/spambayes/mboxtrain -d /usr/local/spambayes/.hammiedb
-g /usr/local/spambayes/training/ham -s


(Step #6 can be run without shutting down qmail.)
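
As a rough illustration of steps 1, 4 and 5, the sketch below gathers
the individual message files from one directory into a single mbox file
using Python's standard mailbox module, which writes the "From "
separator line for each message. It is not the script I use. The
function name build_mbox and the mbox path passed to it are
hypothetical, and it does not strip the qmail-scanner header or
uuencoded attachments.

```python
import mailbox
from pathlib import Path


def build_mbox(source_dir, mbox_path):
    """Append every file in source_dir to the mbox at mbox_path.

    Returns the number of messages added. The mbox file is created if
    it does not exist; existing messages in it are kept, much like
    "cat ... >>" would do.
    """
    box = mailbox.mbox(mbox_path)
    count = 0
    try:
        for path in sorted(Path(source_dir).iterdir()):
            if path.is_file():
                # add() supplies a "From " separator line when the
                # message does not already start with one.
                box.add(path.read_bytes())
                count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

A call would look like build_mbox("/usr/local/spambayes/training/spamdir",
"spam.mbox"), where "spam.mbox" is a made-up output name for the example.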


Most of the time, it is clear whether an email sent to me is spam or
not. Occasionally an email is borderline, or one person considers it
spam but others don't. These are usually things like newsletter
subscriptions or religious forums. In this case, I follow my own rule:
if at least one person in the agency needs or wants to receive this
type of email, and it is non-offensive, work-related, or of interest to
a lot of people in the agency, then I either train it as ham or, if
it's already being tagged ham, leave it alone. One example is email
that discusses religious topics. A lot of people in this agency are
subscribed to religious discussion groups, so in my mind it's good
practice to make sure these messages are not tagged as spam.


The above system works well on several levels. It's manageable because
there's a central location for training and tagging spam (the SMTP
server). It's also manageable because our IT PC Support staff does not
have to install Spambayes on each PC or train all of our users on its
use. If a user does not like the way our system tags the emails, he does
not have to set up his Outlook rules. But we've had a good response
from the users who are using their Rules. They're willing to put up with
one or two mis-classified emails in order to keep 95% of their junk
email out of their Inbox.


Michael Martinez

Linux System Administrator


United States Department of Agriculture

