[ mailman-Patches-534577 ] Add SpamAssassin filter to mail pipeline

SourceForge.net noreply at sourceforge.net
Mon May 5 21:31:03 EDT 2003


Patches item #534577, was opened at 2002-03-25 16:17
Message generated for change (Comment added) made by jhenstridge
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=300103&aid=534577&group_id=103

Category: list administration
Group: Mailman 2.1
Status: Open
Resolution: None
Priority: 1
Submitted By: James Henstridge (jhenstridge)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add SpamAssassin filter to mail pipeline

Initial Comment:
This filter adds support for discarding or holding spam
sent to the mailing list.  It contacts a spamd daemon
(from SpamAssassin -- http://spamassassin.taint.org) to
score the message.

If the score is above a certain threshold (default 10),
the message is discarded and an entry is written to the
vette log.

If the score is above another, lower threshold (default
5), the message is held for moderation.
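
The two-threshold decision described above could be sketched
roughly as follows.  The function name classify and its string
return values are hypothetical, not taken from the attached
SpamAssassin.py; only the defaults (10 and 5) come from the
description, and "above" is assumed to mean strictly greater.

```python
DISCARD_SCORE = 10  # default discard threshold from the description
HOLD_SCORE = 5      # default hold threshold from the description


def classify(score, discard_score=DISCARD_SCORE, hold_score=HOLD_SCORE):
    """Return 'discard', 'hold' or 'accept' for a score (hypothetical helper)."""
    if score > discard_score:
        return "discard"   # the real filter also writes to the vette log
    if score > hold_score:
        return "hold"      # held for moderation
    return "accept"
```

Setting discard_score very high (for example 1000) means a
message can at most be held, never discarded.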

The SpamAssassin.py file should be installed in
Mailman/Handlers/.  The LIST_PIPELINE variable in
Mailman/Handlers/HandlerAPI.py should be modified to
include a 'SpamAssassin' item (I put it just after the
existing 'SpamDetect' item).

To change the defaults, the following can be added to
the mm_cfg.py file:
  SPAMASSASSIN_HOST = 'host:port'  # how to contact SA
  SPAMASSASSIN_DISCARD_SCORE = 10
  SPAMASSASSIN_HOLD_SCORE = 5

If you don't want to discard messages, then
DISCARD_SCORE can be set to something very high (1000
should do it).
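
As an illustration only, settings like these can be read with a
fallback to defaults using getattr.  The load_settings helper
and the SimpleNamespace stand-in for mm_cfg are assumptions
made for this sketch; the attached filter may read its
configuration differently.

```python
import types


def load_settings(cfg):
    """Read the three settings from a config object, falling back to defaults."""
    return {
        "host": getattr(cfg, "SPAMASSASSIN_HOST", "localhost:783"),
        "discard_score": getattr(cfg, "SPAMASSASSIN_DISCARD_SCORE", 10),
        "hold_score": getattr(cfg, "SPAMASSASSIN_HOLD_SCORE", 5),
    }


# Stand-in for mm_cfg that only overrides the discard threshold.
mm_cfg = types.SimpleNamespace(SPAMASSASSIN_DISCARD_SCORE=1000)
settings = load_settings(mm_cfg)
```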

It looks like the MM2.1 filter APIs have changed a bit, so
this filter will need some modifications to work with
that version.  When I get round to upgrading, I might
look into updating it.


----------------------------------------------------------------------

>Comment By: James Henstridge (jhenstridge)
Date: 2003-05-06 11:31

Message:
Logged In: YES 
user_id=146903

I have just attached updated versions of the patches (dated
06/05/2003).  These versions include a number of bug fixes
that I have been testing locally for a while.  I also added
a workaround for the True/False usage, similar to the one
described in the comment below (the True and False constants
were only added in Python 2.2.1 and 2.3a).

This version also puts the SpamAssassin score in the
"reason" for held messages, which means you can easily see
the scores of messages in the new Mailman 2.1 moderation
overview page.

I have also put together some documentation on the Mailman
setup I use:

http://www.daa.com.au/~james/articles/mailman-spamassassin/

This includes information on how to set up an unprivileged
spamd that maintains separate Bayes databases for each
mailing list.

----------------------------------------------------------------------

Comment By: Pug Bainter (phelim_gervase)
Date: 2003-05-06 00:55

Message:
Logged In: YES 
user_id=484284

After installing patch 668685 for the HTDig integration into
Mailman 2.1.1, I started getting the following:

May 02 16:50:34 2003 (23484) Uncaught runner exception: global name 'False' is not defined
May 02 16:50:34 2003 (23484) Traceback (most recent call last):
  File "/var/mailman2/Mailman/Queue/Runner.py", line 105, in _oneloop
    self._onefile(msg, msgdata)
  File "/var/mailman2/Mailman/Queue/Runner.py", line 155, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/var/mailman2/Mailman/Queue/IncomingRunner.py", line 130, in _dispose
    more = self._dopipeline(mlist, msg, msgdata, pipeline)
  File "/var/mailman2/Mailman/Queue/IncomingRunner.py", line 153, in _dopipeline
    sys.modules[modname].process(mlist, msg, msgdata)
  File "/var/mailman2/Mailman/Handlers/SpamAssassin.py", line 75, in process
    score, symbols = check_message(mlist, str(msg))
  File "/var/mailman2/Mailman/Handlers/SpamAssassin.py", line 57, in check_message
    connection = spamd.SpamdConnection(SPAMD_HOST)
  File "/var/mailman2/Mailman/Handlers/spamd.py", line 79, in __init__
    self.request_headers = mimetools.Message(StringIO.StringIO(), seekable=False)
NameError: global name 'False' is not defined


I corrected this by defining "False = 0" in spamd.py. I
don't know what the "real" solution should be though.


----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2003-03-17 13:46

Message:
Logged In: YES 
user_id=146903

Attached is an updated version of the filter for adding
SpamAssassin support to Mailman.  This version is targeted
at Mailman 2.1.x.

The code for talking to spamd has been split out into a
separate file, so that it can be updated independently of
the Mailman specific code.  It has also been updated to work
with SpamAssassin 2.50 (and should be a lot more robust to
future additions to the spamd protocol).

The filter has also been changed to use the list name as the
username passed to spamd, which means that separate
auto-whitelists and bayes databases can be maintained for
each list.
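
To illustrate the idea, here is a rough sketch of a spamd
request that carries the list name in the User header.  It is
based on the publicly documented spamd protocol rather than on
the attached spamd.py; build_spamd_request is a hypothetical
name, and the SPAMC/1.2 version string and the SYMBOLS command
are assumptions about what the filter sends.

```python
def build_spamd_request(message_text, list_name, command="SYMBOLS"):
    """Build a spamd request as bytes, passing the list name as the user."""
    body = message_text.encode("utf-8")
    headers = [
        "%s SPAMC/1.2" % command,           # protocol version is an assumption
        "Content-length: %d" % len(body),   # size of the message in bytes
        "User: %s" % list_name,             # selects per-list whitelist/Bayes data
    ]
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + body


request = build_spamd_request("Subject: hi\r\n\r\nhello\r\n", "mailman-coders")
```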

Installation is trivial.  Simply copy spamd.py and
SpamAssassin.py to the Mailman/Handlers directory and add
the following line to Mailman/mm_cfg.py:
  GLOBAL_PIPELINE.insert(1, 'SpamAssassin')
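
The effect of that line can be seen on a shortened stand-in for
the pipeline.  The list below is abbreviated and only meant as
an illustration; the real GLOBAL_PIPELINE in a given Mailman
release is longer and may be ordered differently.

```python
# Abbreviated stand-in for Mailman's GLOBAL_PIPELINE (an assumption).
GLOBAL_PIPELINE = ["SpamDetect", "Approve", "Replybot", "Moderate", "Hold"]

# Index 1 places the new handler directly after the first entry.
GLOBAL_PIPELINE.insert(1, "SpamAssassin")
```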


----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2002-08-23 14:32

Message:
Logged In: YES 
user_id=81797

That last one had a missing quote.  Try this patch:

*** SpamAssassin.py.orig        Fri Aug 23 00:28:59 2002
--- SpamAssassin.py     Fri Aug 23 00:31:00 2002
***************
*** 30,45 ****
  from Mailman.Logging.Syslog import syslog
  from Hold import hold_for_approval
  
! SPAMD_PORT = 0
! try:
!     SPAMD_HOST = mm_cfg.SPAMASSASSIN_HOST
!     i = string.find(SPAMD_HOST, ':')
!     if i >= 0:
!         SPAMD_HOST, SPAMD_PORT = SPAMD_HOST[:i], host[i+1:]
!         try: SPAMD_PORT = int(SPAMD_PORT)
!         except: SPAMD_PORT = None
! except:
!     SPAMD_HOST = 'localhost'
  if not SPAMD_PORT: SPAMD_PORT = 783
  
  try:    DISCARD_SCORE = mm_cfg.SPAMASSASSIN_DISCARD_SCORE
--- 30,44 ----
  from Mailman.Logging.Syslog import syslog
  from Hold import hold_for_approval
  
! SPAMD_HOST = 'localhost'
! SPAMD_PORT = None
! if hasattr(mm_cfg, 'SPAMASSASSIN_HOST'):
!        SPAMD_HOST = mm_cfg.SPAMASSASSIN_HOST
!        try:
!                 SPAMD_HOST, SPAMD_PORT = string.split(SPAMD_HOST, ':', 1)
!                 SPAMD_PORT = int(SPAMD_PORT)
!        except ValueError:
!                 SPAMD_PORT = None
  if not SPAMD_PORT: SPAMD_PORT = 783
  
  try:    DISCARD_SCORE = mm_cfg.SPAMASSASSIN_DISCARD_SCORE

Sean

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2002-08-23 14:19

Message:
Logged In: YES 
user_id=81797

How about changing that chunk of code to:

   SPAMD_HOST = 'localhost'
   SPAMD_PORT = None
   if hasattr(mm_cfg, 'SPAMASSASSIN_HOST):
       SPAMD_HOST = mm_cfg.SPAMASSASSIN_HOST
       try:
           SPAMD_HOST, SPAMD_PORT = string.split(SPAMD_HOST,
':', 1)
           SPAMD_PORT = int(SPAMD_PORT)
       except ValueError: 
           SPAMD_PORT = None
   if not SPAMD_PORT: SPAMD_PORT = 783

This gets rid of the "bare except"s, and I think it's a
little clearer than the previous code.  The ValueError will
be tripped if the string doesn't have a : in it, or if the
int coercion fails.  Though perhaps in that instance you'd
want to log an error or something...

Sean
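
For comparison, a sketch of the same parsing in present-day
Python that also logs a bad port, as suggested above.  The
parse_host_port name and the logger name are hypothetical; the
defaults (localhost, port 783) are the ones used in the patch.

```python
import logging

log = logging.getLogger("spamassassin-example")  # hypothetical logger name
DEFAULT_PORT = 783


def parse_host_port(value, default_host="localhost", default_port=DEFAULT_PORT):
    """Split 'host' or 'host:port' into (host, port), logging a bad port."""
    if not value:
        return default_host, default_port
    host, sep, port_text = value.partition(":")
    if not sep:
        return host, default_port      # no colon: keep the default port
    try:
        return host, int(port_text)
    except ValueError:
        log.error("invalid spamd port %r, using %d", port_text, default_port)
        return host, default_port
```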

----------------------------------------------------------------------

Comment By: dann frazier (dannf)
Date: 2002-08-18 02:11

Message:
Logged In: YES 
user_id=146718

hey James,
  found a typo.  also wanted to point out:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=139942&repeatmerged=yes

--- SpamAssassin.py.orig        Sat Aug 17 12:05:41 2002
+++ SpamAssassin.py     Sat Aug 17 12:06:13 2002
@@ -35,7 +35,7 @@
     SPAMD_HOST = mm_cfg.SPAMASSASSIN_HOST
     i = string.find(SPAMD_HOST, ':')
     if i >= 0:
-        SPAMD_HOST, SPAMD_PORT = SPAMD_HOST[:i], host[i+1:]
+        SPAMD_HOST, SPAMD_PORT = SPAMD_HOST[:i], SPAMD_HOST[i+1:]
         try: SPAMD_PORT = int(SPAMD_PORT)
         except: SPAMD_PORT = None
 except:


----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-07-25 17:00

Message:
Logged In: YES 
user_id=146903

remembering to check the "upload file" checkbox this time ...

----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-07-25 16:59

Message:
Logged In: YES 
user_id=146903

Yet another new version that fixes a small typo.  With
previous versions, you couldn't approve messages that had
once been identified as spam (they would be identified as
spam again when the queue was processed, instead of being
passed through).
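
The actual fix is in the attachment and not shown in this
thread.  As a rough sketch of the behaviour, assuming an
approved message is marked with an 'approved' key in msgdata
(as I understand other Mailman 2.x handlers to expect), a
handler could skip the check like this.  The check_message
parameter and fake_check are hypothetical.

```python
def process(mlist, msg, msgdata, check_message):
    """Hypothetical handler skeleton: approved messages skip the spam check."""
    if msgdata.get("approved"):
        return None  # already approved by a moderator, let it continue
    return check_message(mlist, msg)


calls = []


def fake_check(mlist, msg):
    """Stand-in for the real spamd check; records each call."""
    calls.append(msg)
    return 7.5
```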

----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-07-10 08:19

Message:
Logged In: YES 
user_id=146903

The Mailman installation on mail.gnome.org also uses this
filter.  I don't think there are any stability problems with
the filter.

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2002-07-10 05:16

Message:
Logged In: YES 
user_id=81797

FYI, I ran the previous version since installation and it
seemed to work fine.  I didn't run into any problems, with
probably 500 messages handled.  I've updated to the new
version and it seems ok so far, but I've only sent about 10
messages through.

Sean

----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-07-03 12:02

Message:
Logged In: YES 
user_id=146903

Yet another version.  There were some bugs in handling of
certain error conditions when talking to spamd.  These would
result in exceptions and the messages staying in the
delivery queue :(

With the new version, the message will be passed through
unchecked under these conditions, and a message will be
added to the error log.
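
A minimal sketch of this "fail open" behaviour, written for
present-day Python.  The names safe_score and broken_check and
the logger name are hypothetical, and the set of exceptions
caught is an assumption; the attached filter may handle more
or different error conditions.

```python
import logging

log = logging.getLogger("spamassassin-example")  # hypothetical logger name


def safe_score(check, message_text):
    """Return the score from check(), or None if spamd could not be used."""
    try:
        return check(message_text)
    except (OSError, ValueError) as exc:
        # Fail open: log the problem and let the message through unchecked.
        log.error("spamd check failed, passing message unchecked: %s", exc)
        return None


def broken_check(text):
    """Stand-in for a check against a spamd that is not running."""
    raise ConnectionRefusedError("spamd is not running")
```

A caller would treat None as "no score available" and leave
the message in the pipeline.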

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2002-06-13 05:48

Message:
Logged In: YES 
user_id=81797

FYI: I've been running the 2002-05-14 version of this patch
with spamassassin 2.20 for the last day on our main mailman
box and it seems to be working great.

----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-05-14 14:04

Message:
Logged In: YES 
user_id=146903

This version is essentially the same as the previous
version, but adds compatibility with Python versions later
than 1.5.2, which don't like you passing two arguments to
socket.connect().
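
The portable form passes the address as a single tuple.  A
small sketch, with connect_to_spamd as a hypothetical helper
name and the timeout value chosen arbitrarily:

```python
import socket


def connect_to_spamd(host, port, timeout=5.0):
    """Open a TCP connection using the single (host, port) tuple form."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    sock.connect((host, port))  # one tuple argument, not connect(host, port)
    return sock
```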

----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-04-27 14:17

Message:
Logged In: YES 
user_id=146903

Just attached my updated version of the patch.  This version
requires SpamAssassin 2.20 (for the extra commands that the
spamd daemon understands).  It now displays a list of which
rules were triggered for held messages, and can give
messages from list members a bonus (defaults to 2), so that
they are less likely to get held as spam.
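
The thread does not say exactly how the bonus is applied.  One
plausible reading, sketched here as an assumption, is that it
is subtracted from the spamd score before the thresholds are
compared.  The effective_score name is hypothetical.

```python
MEMBER_BONUS = 2  # default bonus mentioned above


def effective_score(score, is_member, bonus=MEMBER_BONUS):
    """Lower the score for list members so they are less likely to be held."""
    if is_member:
        return score - bonus
    return score
```

With a hold threshold of 5, a member's message scoring 6.0
would then be compared as 4.0 and not held.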

----------------------------------------------------------------------

Comment By: James Henstridge (jhenstridge)
Date: 2002-03-26 09:21

Message:
Logged In: YES 
user_id=146903

There is a fairly easy optimisation for this filter that I
missed when writing it.  It calls str() on the message
object twice.  It would be quicker to call str() on the
message once.
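
A sketch of the optimisation: flatten the message once and
reuse the resulting text.  CountingMessage is a stand-in
written for this example (not a Mailman class) so that the
number of str() calls can be observed, and score_text is a
hypothetical scoring callable.

```python
class CountingMessage:
    """Stand-in for a message object that counts how often it is flattened."""

    def __init__(self, text):
        self.text = text
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return self.text


def check(msg, score_text):
    """Score a message, calling str() on it only once."""
    text = str(msg)                      # single, potentially expensive, call
    return score_text(text), len(text)   # reuse the cached text
```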

----------------------------------------------------------------------
