GSOC idea: The central scrutinizer ;)
I have a partially completed spec for a module that will examine messages for various issues, but my Python-fu is likely not sufficient to realize it and I'm busy writing anyway. This is probably a GSOC-size and GSOC-scope project, so if anybody is game, below is a poorly written and largely incomplete description of what I have in mind.
1. As I've said for years, anti-abuse controls should be layered and should start at the network perimeter (with things like the Spamhaus DROP list in border routers). Additional layers may be in firewalls, in the MTA, in the MLM (such as Mailman). No one layer can catch everything because it doesn't have contextual knowledge, e.g., the firewall doesn't know who the members of a mailing list are or even that a particular SMTP connection it's allowing contains traffic for a mailing list.
Unfortunately, email accounts have been hijacked by the billions (and that's just counting Yahoo). The new owners of those accounts likely enjoy the same SMTP reachability that the old owners did and can thus drop messages onto mailing lists. (I'm going to omit the long explanation about why context-sensitive measures in the MTA are not only inadequate to stop this, but are also highly undesirable.) To put it another way: we don't have many defenses against our friends. But we need them.
2. We all have our own views on proper netiquette for mail/mailing lists, and of course, mine are the only correct ones. ;) But regardless of those, it'd be useful to have a mechanism to scrutinize messages for things like "800 lines of quoted digest with one top-posted line above it". Of course some of this is easy to see and hard to code, but I think a modest attempt in this direction would be helpful. (Besides, if the action taken on detection of these is to merely hold the message, then little harm is done by false positives.)
3. Items 1 and 2 are related. The same kinds of criteria that are useful in detecting and putting a hold on spam are useful in detecting undesirable content in messages, like URL shorteners and third-party tracking links. So it makes sense to have a highly configurable module that comes with a minimal/loose default configuration that can be salted to taste.
So what I have in mind is a module that scrutinizes messages based on a set of (enabled/disabled) criteria, each one of which is configurable.
I know that's not very clear. Let me make up an example and see if that helps.
Let's suppose we call this module the Central Scrutinizer (CS) because oblique tributes to Frank Zappa are always in order. The CS might have a list of dozens of checks like this:
- URL shorteners (1)
- tracking links (2)
- full digest quoting (3)
Check (1) would consult a list of known URL-shortening domains and pattern-match any URLs in the message against it. Check (2) would attempt to detect tracking links/"web bugs". Check (3) would check whether a message includes an entire quoted digest.
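To make checks (1) and (3) a bit more concrete, here is a rough Python sketch. Nothing in it comes from an existing spec or from Mailman itself: the function names, the three-entry domain list, and the thresholds (200 quoted lines, 90% of non-blank lines) are all assumptions picked for illustration, and counting ">"-prefixed lines is only one crude way to approximate "an entire quoted digest".

```python
import re
from urllib.parse import urlsplit

# Illustrative only: a real deployment would load a maintained list.
SHORTENER_DOMAINS = {"bit.ly", "tinyurl.com", "t.co"}

# Loose URL matcher; good enough for a sketch, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def find_shortener_urls(body, domains=SHORTENER_DOMAINS):
    """Return URLs in body whose host is a listed domain or a subdomain of one."""
    hits = []
    for url in URL_RE.findall(body):
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            # Malformed URL (e.g. a broken IPv6 literal); skip it.
            continue
        if any(host == d or host.endswith("." + d) for d in domains):
            hits.append(url)
    return hits


def looks_like_full_digest_quote(body, min_quoted_lines=200,
                                 min_quoted_fraction=0.9):
    """Heuristic: many '>'-quoted lines that also dominate the message."""
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines:
        return False
    quoted = sum(1 for ln in lines if ln.lstrip().startswith(">"))
    return (quoted >= min_quoted_lines
            and quoted / len(lines) >= min_quoted_fraction)
```

Both functions take the plain-text body as a string and leave the decision about what to do with a hit to the caller. The quoting heuristic will certainly misfire on some legitimate messages, which is one more argument for "hold" rather than "discard".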
Each one of these would be associated with an action -- and I strongly think that "hold" would be best, because these tests are going to make lots of mistakes for a while. The results would be presented to list owners (in email notifications and in the browser interface) with something like:
Central scrutinizer report:
- URL shorteners - fail, example.net detected
- tracking links - pass, none detected
- full digest quoting - pass, none detected
with appropriate presentation so that it's easy to read and so that when a test fails, it reports *why* it failed.
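A report like the one above could be produced by having every check return a small result record that carries the reason along with the verdict. The sketch below is hypothetical throughout: CheckResult, scrutinize, format_report, and the stand-in check are invented names, not part of any existing Mailman interface, and the stand-in check just looks for one hard-coded domain to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str      # human-readable check name, as shown in the report
    passed: bool   # True if nothing objectionable was found
    detail: str    # the "why", e.g. which domain triggered the failure


def check_url_shorteners(body):
    # Stand-in check for illustration: flags a single hard-coded domain.
    if "example.net" in body:
        return CheckResult("URL shorteners", False, "example.net detected")
    return CheckResult("URL shorteners", True, "none detected")


def scrutinize(body, checks):
    """Run every check; suggest 'hold' if any failed, else 'accept'."""
    results = [check(body) for check in checks]
    action = "hold" if any(not r.passed for r in results) else "accept"
    return action, results


def format_report(results):
    """Render results in the plain-text layout used in the example report."""
    lines = ["Central scrutinizer report:"]
    for r in results:
        status = "pass" if r.passed else "fail"
        lines.append(f"- {r.name} - {status}, {r.detail}")
    return "\n".join(lines)
```

Calling scrutinize(body, [check_url_shorteners]) and passing the results to format_report gives text that can go into both the owner notification and the web interface. Running all checks rather than stopping at the first failure means the list owner sees the whole picture in one report.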
Let me note that the MTA is the wrong place to do this stuff for a whole bunch of reasons. Among other things, lots and lots of people want different policies applied to mailing list traffic (which will result in outbound mail traffic) than they do to local traffic (which won't). This is a serious issue for anybody concerned with their mail system's reputation, which is based on what it emits, not what it accepts.
Of course not everyone will agree with every check, and not every check is appropriate on every mailing list. (For example, a one-way announce-only list is unlikely to need most of these.) But if this is modular, and if individual checks can be switched on/off readily, then it can ship with everything off and folks can enable whatever subset they find palatable.
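The "ship with everything off" idea might look something like this in Python. The check names and the shape of the per-list override dictionary are assumptions made for the sake of the example; how Mailman would actually store such settings is a separate question.

```python
# Shipped defaults: every check is present but disabled.
DEFAULT_CHECKS = {
    "url_shorteners": False,
    "tracking_links": False,
    "full_digest_quoting": False,
}


def enabled_checks(list_overrides=None):
    """Merge a list's overrides onto the defaults; return the enabled names."""
    settings = dict(DEFAULT_CHECKS)
    for name, value in (list_overrides or {}).items():
        if name not in settings:
            # Reject typos loudly rather than silently ignoring them.
            raise KeyError(f"unknown check: {name}")
        settings[name] = bool(value)
    return [name for name, on in settings.items() if on]
```

With no overrides, enabled_checks() returns an empty list, so an announce-only list pays nothing. A discussion list could pass {"url_shorteners": True} and get just that one check.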
Let me also note that the motivation for this comes not just from things I've bumped into while running lists, but things I've seen on other lists. (And I'm on rather a lot of them.) I've been thinking about this for years, but have become convinced in the last six months or so that the problem has now reached a point where a solution is worth constructing.
---rsk
Rich Kulawiec