spam gatewayed from Usenet to mail bypasses our spam filters
With the help of the python.org postmasters, I figured something out over the weekend about spam on the python-list@python.org mailing list. I was confused that some seemingly obvious spams made it through to the mailing list subscribers. Then I noticed that these spams didn't appear to have seen the spam filter at all: the headers the filter adds were missing, and I could never find their Message-IDs in the spam filter's logs.
Mailman operates the bidirectional gateway between the Usenet newsgroup comp.lang.python and the python-list@python.org mailing list. When it sees a Usenet message on the one side, it distributes the posting directly to the list's subscribers, and vice versa. Because Usenet postings don't arrive by email, the spam filtering we have in place (which occurs before Mailman sees a mail message) is never applied to them. Mailman happily passes such unfiltered postings on to the list.
I still haven't figured out quite how to solve the problem. In theory we could use some other tool to perform the gateway operation. Instead of passing Usenet postings directly to Mailman it would mail them to python-list@python.org where they would get the spam filter treatment before Mailman sees them. I'm still thinking about the full ramifications of that. It might be easier to get Mailman's news-to-mail gateway to mail incoming Usenet messages to the list address instead of directly distributing them to the subscribers though. I don't believe Mailman does that out of the box (but I would love to be wrong here). Has anyone tried implementing that? If so, got a patch or a recipe for how to configure Mailman to operate this way?
Thanks,
-- Skip Montanaro - skip@pobox.com - http://smontanaro.dyndns.org/
skip@pobox.com writes:
> I still haven't figured out quite how to solve the problem. In theory we could use some other tool to perform the gateway operation. Instead of passing Usenet postings directly to Mailman it would mail them to python-list@python.org where they would get the spam filter treatment before Mailman sees them. I'm still thinking about the full ramifications of that.
Well, you want to watch out for greylisting. But running them through SpamBayes from Mailman should be trivial. You could also add (with a slight performance hit) a Handler that calls out to SpamBayes, SpamAssassin, etc., only when the MTA didn't already do that, or only when the message's fromusenet flag is set.
> It might be easier to get Mailman's news-to-mail gateway to mail incoming Usenet messages to the list address instead of directly distributing them to the subscribers though.
That's probably not a great idea as ToUsenet comes pretty late in the pipeline.
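The Handler idea above (call out to a filter only when the fromusenet flag is set) might look roughly like the following. This is a minimal sketch, assuming the Mailman 2.1 convention that a handler module exposes process(mlist, msg, msgdata). The scorer, the cutoff value, and the local DiscardMessage class are stand-ins invented for illustration; a real handler would raise Mailman's own discard exception and call whatever classifier the site actually uses.

```python
from email.message import Message
from typing import Callable, Dict

SPAM_CUTOFF = 0.9  # hypothetical threshold; tune for the real classifier


class DiscardMessage(Exception):
    """Local stand-in for Mailman's discard exception (assumption)."""


def make_process(score: Callable[[Message], float]):
    """Build a handler-style process() function around a scoring callable."""

    def process(mlist, msg: Message, msgdata: Dict) -> None:
        # Mail that arrived through the MTA was already filtered there,
        # so only postings queued by the news gateway get the extra check.
        if not msgdata.get("fromusenet"):
            return
        if score(msg) >= SPAM_CUTOFF:
            raise DiscardMessage("Usenet posting scored as spam")

    return process
```

Keying on the flag keeps the performance hit limited to gated articles, since ordinary list mail returns immediately.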
on 11/18/08 1:28 AM, Stephen J. Turnbull said:
> Well, you want to watch out for greylisting. But running them through SpamBayes from Mailman should be trivial. You could also add (with a slight performance hit) a Handler that calls out to SpamBayes, SpamAssassin, etc., only when the MTA didn't already do that, or only when the message's fromusenet flag is set.
We've got more to the anti-spam processing for our mail server infrastructure than just SpamBayes. We've got several components whose function I only know about slightly in passing, and they all have an impact on the incoming mail traffic. We wouldn't be doing them if they didn't have an impact.
I am a fan of SpamAssassin in general, but it is not part of that toolkit. I'm not sure how closely some of those other components are tied to Postfix; they may be of limited use with any other program. And of course the host-level firewalling we're doing to reject connections from the worst abusers operates at a completely different level.
There's a lot to be said for the simplicity of just having all this traffic handled via e-mail, just like all the other bi-directional traffic we're doing.
We will have to be careful about loops, however. We wouldn't want to gateway the same thousand-articles-plus per day many times over.
-- Brad Knowles <brad@shub-internet.org> LinkedIn Profile: <http://tinyurl.com/y8kpxu>
Stephen J. Turnbull wrote:
> skip@pobox.com writes:
>> It might be easier to get Mailman's news-to-mail gateway to mail incoming Usenet messages to the list address instead of directly distributing them to the subscribers though.
> That's probably not a great idea as ToUsenet comes pretty late in the pipeline.
ToUsenet is only involved in gating mail from the list to the newsgroup. Newsgroup to mailman is handled by cron/gate_news which drops messages directly in Mailman's incoming queue.
It would be simple to have it "mail" the message instead, but the issues I see off the top are:
- Does the MTA spam filtering process treat mail differently if it "originates" from the local machine?
- Does the list accept news posted by non-list-members? If so, you'd need to flag the message in some way as being from Usenet. This is done in standard gate_news by setting a fromusenet flag in the message's metadata when the message is queued, but if you are "mailing" the message, you can't do this.
As indicated in another reply, another approach would be to have gate_news run the message through the spam checks directly before queueing it.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mark Sapiro wrote:
> Stephen J. Turnbull wrote:
>> skip@pobox.com writes:
>>> It might be easier to get Mailman's news-to-mail gateway to mail incoming Usenet messages to the list address instead of directly distributing them to the subscribers though.
>> That's probably not a great idea as ToUsenet comes pretty late in the pipeline.
> ToUsenet is only involved in gating mail from the list to the newsgroup. Newsgroup to mailman is handled by cron/gate_news which drops messages directly in Mailman's incoming queue.
> It would be simple to have it "mail" the message instead, but the issues I see off the top are:
> - Does the MTA spam filtering process treat mail differently if it "originates" from the local machine?
> - Does the list accept news posted by non-list-members? If so, you'd need to flag the message in some way as being from Usenet. This is done in standard gate_news by setting a fromusenet flag in the message's metadata when the message is queued, but if you are "mailing" the message, you can't do this.
There is another issue. Without the fromusenet flag in the metadata, if you are also gating from the list to Usenet, the message will be posted back to the newsgroup. Ultimately, you'd need to add a custom handler to the pipeline to detect mail from gate_news and add the fromusenet flag.
I think the approach below would be better.
> As indicated in another reply, another approach would be to have gate_news run the message through the spam checks directly before queueing it.
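For the "mail it to the list" route, the custom handler described above could be sketched as follows. This is only an illustration under stated assumptions: the marker header name and the shared secret are hypothetical (nothing like them exists in stock Mailman), and the gateway that mails the article would have to add the header itself. The process(mlist, msg, msgdata) shape follows the Mailman 2.1 handler convention.

```python
import hmac
from email.message import Message
from typing import Dict

# Hypothetical marker header and shared secret (assumptions for this sketch).
GATE_HEADER = "X-Gated-From-News"
GATE_SECRET = "change-me"


def process(mlist, msg: Message, msgdata: Dict) -> None:
    """Restore the fromusenet flag on mail injected by the news gateway."""
    token = msg.get(GATE_HEADER, "")
    # Constant-time comparison, so the secret is not leaked through timing.
    if hmac.compare_digest(token.encode(), GATE_SECRET.encode()):
        # Same metadata key gate_news sets when it queues an article itself.
        msgdata["fromusenet"] = True
    # Strip the marker either way so it never reaches subscribers.
    del msg[GATE_HEADER]
```

Such a handler would have to run early in the pipeline, before the membership checks and well before ToUsenet, for the flag to have any effect.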
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mark Sapiro wrote:
> As indicated in another reply, another approach would be to have gate_news run the message through the spam checks directly before queueing it.
Another way could be to have a moderated group, with the moderation address set to the Mailman list post address, though this has its own problems.
The issue with running the spam checks directly from gate_news is that the spam processor may be on a different machine from the host running Mailman.
Andrew.
cron/gate_news doesn't look all that obtuse. Perhaps I can simply import the SpamBayes machinery and run it before delivering the message to the list.
Skip
Mark Sapiro writes:
>>> That's probably not a great idea as ToUsenet comes pretty late in the pipeline.
>> ToUsenet is only involved in gating mail from the list to the newsgroup. Newsgroup to mailman is handled by cron/gate_news which drops messages directly in Mailman's incoming queue.
> There is another issue. Without the fromusenet flag in the metadata,
Sorry about the bizarre phrasing. This is what I had in mind when I mentioned ToUsenet.
> I think the approach below would be better.
>> As indicated in another reply, another approach would be to have gate_news run the message through the spam checks directly before queueing it.
As Brad points out, though, those checks are already done by the MTA, and in fact may not even be implementable outside the MTA (if they're implemented as milters, for example).
In general, there really is a tradeoff here. Although Skip seems happy enough, as I guess SpamBayes catches most (all?) of the spam his system catches.
Stephen> In general, there really is a tradeoff here. Although Skip
Stephen> seems happy enough, as I guess SpamBayes catches most (all?) of
Stephen> the spam his system catches.
SpamBayes is one of the things that mail.python.org does for incoming mail. Based on my own personal filters I suspect it would stop much of what is currently leaking through to the list from Usenet. Adding a SpamBayes check to gate_news to score the incoming Usenet postings would be pretty trivial.
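Roughly, the scoring step in the gateway script could look like this. It is a sketch only: the score callable stands in for whatever the real classifier provides, and the cutoffs and the header name are illustrative assumptions, not python.org's settings.

```python
from email import message_from_string
from email.message import Message
from typing import Callable, Optional

# Illustrative values (assumptions), not the production configuration.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90
VERDICT_HEADER = "X-Gate-Spam-Verdict"  # hypothetical header name


def classify(prob: float) -> str:
    """Map a spam probability in [0, 1] to ham / unsure / spam."""
    if prob < HAM_CUTOFF:
        return "ham"
    if prob < SPAM_CUTOFF:
        return "unsure"
    return "spam"


def screen_article(raw: str,
                   score: Callable[[Message], float]) -> Optional[Message]:
    """Score one Usenet article; return it tagged, or None if it is spam."""
    msg = message_from_string(raw)
    prob = score(msg)
    verdict = classify(prob)
    if verdict == "spam":
        return None  # the gateway script would skip queueing this article
    # Record the verdict so gated postings carry a filter header too.
    msg[VERDICT_HEADER] = "%s; %.2f" % (verdict, prob)
    return msg
```

Tagging the survivors means a gated posting no longer looks as if it never saw a filter, which was the symptom that started this thread.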
Skip
on 11/17/08 11:20 PM, skip@pobox.com said:
> It might be easier to get Mailman's news-to-mail gateway to mail incoming Usenet messages to the list address instead of directly distributing them to the subscribers though. I don't believe Mailman does that out of the box (but I would love to be wrong here). Has anyone tried implementing that? If so, got a patch or a recipe for how to configure Mailman to operate this way?
Ironically, we run an external news-to-mail gateway at ntp.org, based on the long-existing standard tools in the INN toolbox. We do this because Mailman generates its own message-ids when it gateways the articles from news to mail, and the news reading public for ntp.org complained violently. The mail-to-news gateway from INN re-uses the same message-id as was originally contained within the news posting, so when people refer to a given message-id, it's always the same regardless of whether they are using mail or news.
This is related to FAQ 4.59 at <http://wiki.list.org/pages/viewpage.action?pageId=4030712>.
If we implemented a proper server-side read/post interface for Mailman, we could fix a lot of other USENET gateway problems, too.
For one thing, we would no longer need high watermarks for article numbers, we would instead track whether or not we've seen a given message (and therefore whether or not it needs to be gatewayed) based on whether we've seen that message-id within our lifetime window, and any articles older than the lifetime window would get ignored.
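The message-id bookkeeping described above could be sketched along these lines. This is a minimal in-memory illustration, not existing Mailman code: the class and method names are made up, and a real gateway would need to persist the table between runs.

```python
import time
from typing import Dict, Optional


class SeenCache:
    """Remember message-ids for a lifetime window (hypothetical helper)."""

    def __init__(self, lifetime_seconds: float) -> None:
        self.lifetime = lifetime_seconds
        self._seen: Dict[str, float] = {}  # message-id -> time first seen

    def should_gateway(self, message_id: str, posted_at: float,
                       now: Optional[float] = None) -> bool:
        """True only the first time an in-window article is offered."""
        now = time.time() if now is None else now
        cutoff = now - self.lifetime
        # Forget ids first seen before the window opened.
        self._seen = {mid: ts for mid, ts in self._seen.items()
                      if ts >= cutoff}
        if posted_at < cutoff:
            return False  # older than the lifetime window: ignore
        if message_id in self._seen:
            return False  # already gatewayed once
        self._seen[message_id] = now
        return True
```

Because an id is kept at least as long as its article stays inside the window, the same article cannot be gatewayed twice, which also helps with the loop concern raised earlier in the thread.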
A server-side implementation would also allow us to directly feed outgoing articles to news routing servers, potentially bypassing a large news reader server infrastructure and being both more robust and more efficient. And we could take feeds from multiple upstream news routing servers, too.
But as I said recently on this mailing list, I'm not a programmer and I don't have the necessary skills to write a proper server-side interface for USENET news to be incorporated into Mailman.
So, I let this issue drop a few years ago when I first brought it up, although it does seem to come back up every so often.
-- Brad Knowles <brad@shub-internet.org> LinkedIn Profile: <http://tinyurl.com/y8kpxu>
On 11/17/2008 11:20 PM, skip@pobox.com wrote:
> I still haven't figured out quite how to solve the problem. In theory we could use some other tool to perform the gateway operation. Instead of passing Usenet postings directly to Mailman it would mail them to python-list@python.org where they would get the spam filter treatment before Mailman sees them. I'm still thinking about the full ramifications of that. It might be easier to get Mailman's news-to-mail gateway to mail incoming Usenet messages to the list address instead of directly distributing them to the subscribers though. I don't believe Mailman does that out of the box (but I would love to be wrong here). Has anyone tried implementing that? If so, got a patch or a recipe for how to configure Mailman to operate this way?
A few follow-up questions:
- Will someone please provide the Path: header for a number (3 - 8) of the spam messages? I suspect that the messages are originating from googlegroups.com which is /notorious/ for spam.
- Does gate_news have the ability to filter (gateway or not) messages based on the contents of the Path: header in the Usenet message?
- Is it possible to have gate_news call some of the other freely available Usenet spam filters that already exist? Why re-invent the wheel if we can hook in to already existing filters on the Usenet side?
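To make the Path: question concrete, a check of that kind might look like the sketch below. The function names and host names are hypothetical, and nothing here is existing gate_news behaviour. It relies only on the fact that a Path: header is a "!"-separated list of the hosts an article passed through.

```python
from email import message_from_string
from email.message import Message
from typing import Iterable, List


def path_hosts(msg: Message) -> List[str]:
    """Split the Path: header into lower-cased entries."""
    path = msg.get("Path", "")
    return [entry.strip().lower()
            for entry in path.split("!") if entry.strip()]


def blocked_by_path(msg: Message, blocked_hosts: Iterable[str]) -> bool:
    """True if any Path: entry exactly matches a blocked host name."""
    blocked = {host.lower() for host in blocked_hosts}
    return any(entry in blocked for entry in path_hosts(msg))
```

An article without a Path: header simply yields an empty list and is never blocked by this check.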
Grant. . . .
on 11/20/08 12:54 AM, Grant Taylor said:
> - Does gate_news have the ability to filter (gateway or not) messages based on the contents of the Path: header in the Usenet message?
Not inherently, no. But that feature could be added.
> - Is it possible to have gate_news call some of the other freely available Usenet spam filters that already exist? Why re-invent the wheel if we can hook in to already existing filters on the Usenet side.
That's also a good idea, although not currently available in the existing code.
Call them before SpamBayes, I think.
-- Brad Knowles <brad@shub-internet.org> LinkedIn Profile: <http://tinyurl.com/y8kpxu>
On 11/20/2008 01:28 AM, Brad Knowles wrote:
> Not inherently, no. But that feature could be added.
*nod*
> That's also a good idea, although not currently available in the existing code.
*nod*
> Call them before SpamBayes, I think.
Agreed.
I don't know for sure, but I'm betting that SpamBayes wants to filter the message as if it were email, rather than news. This means that it will inherently be later in the chain than other news filtering utilities.
Seeing as how there has been a fair amount of discussion about enhancing the Usenet support in Mailman, I ask this question:
Is it appropriate to change Mailman such that it better supports Usenet, or would we be better served by developing a gateway that will accept messages on STDIN?
Doing things via STDIN would allow us to separate Mailman from the complexities of Usenet (or anything else for that matter) and concentrate on mailing lists, what Mailman was designed to do.
Using the STDIN approach would dictate the need for a small standalone utility / gateway that could somehow get Usenet messages, be it a small NNTP server that receives a feed, some incarnation of suck, or something even more exotic retrieving messages and handing them out via STDOUT to Mailman's STDIN.
I believe this protocol-independent interface between STDOUT and STDIN will make maintenance of things much simpler long term for both Mailman and whatever is used as the Usenet side of the gateway.
(Similar would be done to receive messages from Mailman and post them to Usenet.)
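As a rough illustration of the STDIN idea, the receiving end could be as small as the sketch below. It assumes one article per invocation on standard input. The deliver callable is a placeholder for whatever would hand the message to Mailman, and the function names are made up for this example.

```python
from email import message_from_bytes
from email.message import Message
from typing import BinaryIO, Callable


def read_message(stream: BinaryIO) -> Message:
    """Parse one complete article from a binary stream."""
    return message_from_bytes(stream.read())


def run_gateway(stream: BinaryIO,
                deliver: Callable[[Message], None]) -> bool:
    """Read one article and hand it on; refuse input without a Message-ID."""
    msg = read_message(stream)
    if not msg.get("Message-ID"):
        return False  # nothing usable arrived on the stream
    deliver(msg)
    return True


# As a script this would be wired up as, for example:
#     ok = run_gateway(sys.stdin.buffer, hand_to_mailman)
# where hand_to_mailman is whatever injection function the site provides.
```

Anything that can write an article to STDOUT (an NNTP feed receiver, suck, or something more exotic) could then sit in front of it without Mailman knowing the difference.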
Grant. . . .
participants (6)
- Andrew Hodgson
- Brad Knowles
- Grant Taylor
- Mark Sapiro
- skip@pobox.com
- Stephen J. Turnbull