[Mailman-Developers] spammers harvesting email'ids [was] UI for Mailman 3.0 update

Wed Jun 9 21:45:14 CEST 2010

On Jun 08, 2010, at 01:55 PM, Adam McGreggor wrote:

>The current "problem", is the order in which MM2 handles its
>non-members filters; and I guess what I'd welcome is an ability to
>finely control the order in which given rules are processed; I think
>that would help immensely.

Here's how Mailman 3 works.

First, where MM2 had 'handlers' which conflated rule checking with message
processing, MM3 separates these.  This means that the processing handlers are
called during a separate phase of message delivery, and not until the message
has been approved for delivery.

Rule checking itself happens by way of configurable 'chains'.  There are a
number of built-in chains, but you can always add new ones, and you can
configure individual mailing lists or the entire MM3 system to use your
custom-defined chains.  Each chain consists of a series of 'links', where each
link is essentially a triplet of (rule, action, argument).

A link refers to its rule by name, and rules are looked up globally by name,
so custom rules must have unique names.  Otherwise the names can be more or
less arbitrary strings (similarly with chain names).  Link actions are one of
the following:

* jump - stop processing this chain and start processing from the beginning of
  the named chain; takes a chain name as argument
* stop - stop processing through this chain
* defer - make no decision (i.e. continue processing through the current
  chain)
* run - the argument will be a callable, so call it with the standard argument
  triple of (mailing list, message, message metadata dictionary)
* detour - this is like 'jump' except that processing returns to the next link
  in the original chain when the detour chain is finished; takes a chain name
  as argument

(chain processing loop is in mailman/core/chains.py)
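
To make the five actions concrete, here is a minimal sketch of such a loop.
It is not the code in mailman/core/chains.py; the Link tuple, the process()
signature, the string action names, and the 'always'/'never' rules are all
made up for illustration.  It also assumes that 'stop' ends processing
entirely, and that a chain which runs out of links returns to whichever chain
detoured into it.

```python
from collections import namedtuple

# Hypothetical link triplet: (rule name, action, argument).
Link = namedtuple('Link', 'rule action argument')


def process(chains, rules, start, mlist, msg, msgdata):
    """Run a message through the chain named `start`.

    Returns the names of the chains entered, in order.
    """
    entered = [start]
    links = iter(chains[start])
    stack = []                      # link iterators to resume after a detour
    while True:
        link = next(links, None)
        if link is None:
            # Chain exhausted: resume a detoured-from chain, or finish.
            if not stack:
                return entered
            links = stack.pop()
            continue
        if not rules[link.rule](mlist, msg, msgdata):
            continue                # rule missed, try the next link
        if link.action == 'jump':
            entered.append(link.argument)
            links = iter(chains[link.argument])
        elif link.action == 'detour':
            stack.append(links)     # remember where to come back to
            entered.append(link.argument)
            links = iter(chains[link.argument])
        elif link.action == 'stop':
            return entered          # assumption: stop ends all processing
        elif link.action == 'run':
            link.argument(mlist, msg, msgdata)
        elif link.action == 'defer':
            pass                    # no decision, fall through to next link
        else:
            raise ValueError('unknown action: %r' % (link.action,))


# Hypothetical rules and chains, just to exercise the loop.
rules = {
    'always': lambda mlist, msg, msgdata: True,
    'never': lambda mlist, msg, msgdata: False,
}
log = []
chains = {
    'start': [
        Link('never', 'jump', 'other'),      # misses, so it is skipped
        Link('always', 'detour', 'side'),    # visit 'side', then come back
        Link('always', 'run', lambda mlist, msg, msgdata: log.append('ran')),
        Link('always', 'jump', 'other'),     # leave 'start' for good
        Link('always', 'run', lambda mlist, msg, msgdata: log.append('never')),
    ],
    'side': [Link('always', 'defer', None)],
    'other': [Link('always', 'stop', None)],
}
entered = process(chains, rules, 'start', 'test-list', 'a message', {})
```

Here the message enters 'start', detours through 'side', returns to 'start'
for the 'run' link, and then jumps to 'other', so the last link in 'start' is
never reached.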

From here on I'll talk about what happens by default...

The incoming queue runner is now very simple.  It asks the mailing list for
its 'start chain' and then processes the message through that chain.  By
default, this is the 'built-in' chain.

(built-in chain is defined in mailman/chains/builtin.py)

The built-in chain starts by running a few immediate actions:

* is the message pre-approved?  if so, jump to the 'accept' chain
* is the mailing list in emergency hold?  if so, jump to the 'hold' chain
* are we in a mailing list loop? if so, jump to the 'discard' chain

After this, some of the general checking rules get processed, but they all
defer action.  These are rules like the administrivia check, no-subject check,
member moderation rule, and so on.  Each of these rules marks the message
metadata with a 'hit' or 'miss' tag.  After these run, the 'any' rule runs and
it just looks to see if there were any rule hits.  If so, we jump to the
'hold' chain.
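
As a rough sketch of that hit/miss bookkeeping (the metadata keys, the helper
names, and the two toy rules below are assumptions for illustration, not MM3's
actual rules, and messages are plain dicts to keep it short):

```python
def check(name, rule, mlist, msg, msgdata):
    """Run one deferring rule and tag the metadata with a hit or a miss."""
    matched = rule(mlist, msg, msgdata)
    msgdata.setdefault('hits' if matched else 'misses', []).append(name)
    return matched


def any_rule(mlist, msg, msgdata):
    """The 'any' rule: did any earlier rule record a hit?"""
    return len(msgdata.get('hits', [])) > 0


def no_subject(mlist, msg, msgdata):
    # Toy rule: hit when the subject is missing or blank.
    return not msg.get('subject', '').strip()


def administrivia(mlist, msg, msgdata):
    # Toy rule: hit when the body starts with a list command.
    words = msg.get('body', '').lower().split()
    return len(words) > 0 and words[0] in ('subscribe', 'unsubscribe')


def run_general_checks(mlist, msg):
    """Run the deferring rules, then let 'any' decide whether to hold."""
    msgdata = {}
    general = [('administrivia', administrivia), ('no-subject', no_subject)]
    for name, rule in general:
        check(name, rule, mlist, msg, msgdata)   # action is always 'defer'
    next_chain = 'hold' if any_rule(mlist, msg, msgdata) else None
    return next_chain, msgdata
```

None of the general rules decides anything on its own; only the 'any' rule at
the end turns the accumulated hits into a jump to 'hold'.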

If a message makes it through this gauntlet, it then detours through a
dynamically created 'header-match' chain.  This chain is created from the
configuration file, so it works globally.  This means you can define your
global header matches and decide which will be accepted, held, rejected, or
discarded, say to handle known spammers.  While not currently implemented,
a similar technique will be used to do per-list web-configured header
matching.  It should be fairly straightforward to implement your request
using the above raw materials, and we should definitely do that.
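
To give a flavor of how a chain like that could be built dynamically, here is
a sketch.  The HEADER_MATCHES list stands in for whatever the configuration
file provides; its format, the function names, and the example patterns are
all hypothetical and not MM3's real configuration syntax.

```python
import re
from email.message import Message

# Hypothetical configuration: (header, regular expression, target chain).
HEADER_MATCHES = [
    ('x-spam-flag', r'^yes$', 'discard'),
    ('from', r'@spammer\.example\b', 'reject'),
]


def make_header_rule(header, pattern):
    """Build a rule that hits when any value of `header` matches `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)

    def rule(mlist, msg, msgdata):
        return any(regex.search(value) for value in msg.get_all(header, []))
    return rule


def build_header_match_chain(matches):
    """Turn the configured matches into (rule, action, argument) links."""
    return [(make_header_rule(header, pattern), 'jump', chain)
            for header, pattern, chain in matches]


def first_match(chain, mlist, msg, msgdata):
    """Return the chain to jump to, or None if no header rule hits."""
    for rule, action, argument in chain:
        if rule(mlist, msg, msgdata):
            return argument
    return None


header_match = build_header_match_chain(HEADER_MATCHES)
spam = Message()
spam['From'] = 'Bad <bad@spammer.example>'
```

When nothing matches, first_match() returns None, which corresponds to the
detour finishing and processing resuming in the chain that started it.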

Just to finish the story, the final action in the built-in chain is to accept
the message unconditionally.  I.e. it's made it through all the known checks,
so it should be good to go.
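
Since the original question was about ordering, it may help to see the default
flow written out as data.  This is only a sketch of the order described in
this message: the rule names are illustrative, 'always' is a made-up rule that
always matches, and outcome() is a toy evaluator rather than the real
processing loop.

```python
# The default order described above, as (rule, action, argument) links.
BUILTIN = [
    ('approved', 'jump', 'accept'),
    ('emergency', 'jump', 'hold'),
    ('loop', 'jump', 'discard'),
    ('administrivia', 'defer', None),
    ('no-subject', 'defer', None),
    ('member-moderation', 'defer', None),
    ('any', 'jump', 'hold'),
    ('always', 'detour', 'header-match'),
    ('always', 'jump', 'accept'),
]


def outcome(matching, header_match_result=None):
    """Return the chain a message ends up in.

    `matching` is the set of rule names that would hit for this message;
    `header_match_result` is the chain chosen by the header-match detour,
    or None if no header matched.
    """
    hits = set()
    for rule, action, argument in BUILTIN:
        if rule == 'any':
            matched = bool(hits)
        elif rule == 'always':
            matched = True
        else:
            matched = rule in matching
        if not matched:
            continue
        if action == 'defer':
            hits.add(rule)              # remember the hit, decide later
        elif action == 'detour':
            if header_match_result is not None:
                return header_match_result
        elif action == 'jump':
            return argument
```

For example, outcome({'approved', 'no-subject'}) gives 'accept' because the
pre-approval link comes first, while outcome({'no-subject'}) gives 'hold'.
Reordering the links changes those answers, which is exactly the kind of
control being asked for.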

As you've seen above, there are other default chains, such as discard,
reject, hold, and accept.  Most of these are fairly simple, e.g. the discard
chain just logs the Message-ID, fires an event, and then does nothing, which
basically throws the message away.  The accept chain sets up a couple of
headers, logs the Message-ID, fires an event, and drops the message in the
'accept' queue - which is where the processing queue runner does all the other
message preparation tasks you're familiar with from MM2.  The hold chain is of
course the most complex one; for more details, UTSL (use the source, Luke).

I've probably given you way too much detail, and this should definitely go
into a system architecture document, but hopefully it gives you an idea of the
power and flexibility of MM3.

-Barry