suggested improvement for Mailman's bounce processing
Consider the following situation:
1. Some users at our site are subscribed to external
Mailman-managed mailing lists that perform automatic bounce
processing.
2. Because the list owners are either unwilling or unable to
protect the lists from spam, the lists receive a fair amount
of spam.
3. Our MX servers are configured to reject outright incoming
messages that are obviously spam.
Most of you can see the impending train wreck already, but for completeness' sake, here's the problem:
1. Mailman distributes obvious spam to the list.
2. We detect the spam and reject it during the SMTP dialog with a
550 reply code (5.7.0 extended status code).
3. Mailman receives a DSN message because we bounced the message.
4. Mailman assumes that the bounce is due to the recipient's
address being invalid, and disables the subscription.
5. Much wailing and gnashing of teeth ensues.
From looking at Bouncers/DSN.py, although Mailman takes the time to pick apart each message/delivery-status subpart and extract the "action" field, it blissfully ignores the "status" field, which must be present and must contain the extended status code which generated the DSN.
The subject sub-code of the extended status code classifies the status as follows:
X.0.XXX Other or Undefined Status
X.1.XXX Addressing Status
X.2.XXX Mailbox Status
X.3.XXX Mail System Status
X.4.XXX Network and Routing Status
X.5.XXX Mail Delivery Protocol Status
X.6.XXX Message Content or Media Status
X.7.XXX Security or Policy Status
Of these, some subject sub-codes (e.g.: X.2.XXX) pertain directly to the validity of the destination address. But some do *not* pertain to the destination address: for example, X.6.XXX clearly means that the *content* of the message (not the source or destination address) caused the DSN.
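As an illustration of how little is involved in reading the subject sub-code, here is a minimal sketch in Python. The function names (parse_status, subject_name) are hypothetical and are not part of Mailman; the category names are the ones listed above.

```python
import re

# Subject sub-code names, as listed in the message above.
SUBJECTS = {
    0: "Other or Undefined Status",
    1: "Addressing Status",
    2: "Mailbox Status",
    3: "Mail System Status",
    4: "Network and Routing Status",
    5: "Mail Delivery Protocol Status",
    6: "Message Content or Media Status",
    7: "Security or Policy Status",
}

# class "." subject "." detail, where class is 2, 4, or 5.
_STATUS_RE = re.compile(r"^\s*([245])\.(\d{1,3})\.(\d{1,3})\b")


def parse_status(value):
    """Return (class, subject, detail) as ints, or None if unparseable."""
    match = _STATUS_RE.match(value or "")
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def subject_name(value):
    """Return the subject category name for a status string, or None."""
    parsed = parse_status(value)
    if parsed is None:
        return None
    return SUBJECTS.get(parsed[1], "Unrecognized subject")
```

Returning None for anything that does not look like a status code leaves the caller free to fall back to the action field alone.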
Regardless, Mailman ignores all of the status field information: if the action is "failed", Mailman counts it as a bounce, and that's that.
IMHO, this is an error. I propose modifying Bouncers/DSN.py as follows:
1. Mailman tries to extract the status field from
message/delivery-status subpart.
2. If Mailman cannot extract the status field, it operates solely
on the action field.
3. If Mailman can extract the status field, and the subject
sub-code is X.6.XXX or X.7.XXX, Mailman assumes that the DSN
was generated by the fluke content of a specific message, and
ignores the DSN.
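The proposed steps can be sketched roughly as follows. This is not the code in Bouncers/DSN.py; it is a standalone illustration that uses Python's standard email package, and the function name classify_dsn and the verdict strings are assumptions made for the example.

```python
import email

CONTENT_OR_POLICY = {"6", "7"}   # subject sub-codes X.6.XXX and X.7.XXX


def classify_dsn(raw_message):
    """Return a list of (action, status, verdict), one per recipient block."""
    msg = email.message_from_string(raw_message)
    results = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        if not part.is_multipart():
            continue              # malformed part: no header blocks to read
        for block in part.get_payload():
            # The per-message block has no Action field; skip it.
            action = (block.get("Action") or "").strip().lower()
            if not action:
                continue
            status = (block.get("Status") or "").split()
            code = status[0] if status else None
            fields = code.split(".") if code else []
            if action != "failed":
                verdict = "not-a-failure"
            elif len(fields) == 3 and fields[1] in CONTENT_OR_POLICY:
                verdict = "ignore"    # step 3: content or policy rejection
            else:
                verdict = "count"     # step 2: fall back to the action field
            results.append((action, code, verdict))
    return results
```

A missing or malformed status field falls through to "count", so the behaviour in that case is the same as today's.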
I admit that this algorithm isn't perfect. But I think it's better than what Mailman does currently, which is to ignore the status field entirely.
Thoughts? Arguments?
James
At 5:36 PM -0400 2006-07-28, James Ralston wrote:
I admit that this algorithm isn't perfect. But I think it's better than what Mailman does currently, which is to ignore the status field entirely.
Unfortunately, there are a whole host of seriously broken MTAs out there, and seriously broken configurations of otherwise good MTAs, and many sites return totally bogus status codes.
If everyone read and understood the RFCs half as well as you have done, then there wouldn't be any problem. But that's not what happens. In many cases, site admins will blindly copy stuff from somewhere else that was horribly broken to begin with and won't understand what's wrong with it before they do the cut-n-paste operation.
That said, I would not be opposed to seeing more data on this subject, and possibly giving site admins or list admins an option they can enable that would allow Mailman to pay attention to the status codes.
Once that's out there, we could let various people try it out and see how it works in the field, and I would be a very happy guy if I were to be proven wrong in this case.
-- Brad Knowles, <brad@stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
Founding Individual Sponsor of LOPSA. See <http://www.lopsa.org/>.
--On 28 July 2006 21:31:29 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
At 5:36 PM -0400 2006-07-28, James Ralston wrote:
I admit that this algorithm isn't perfect. But I think it's better than what Mailman does currently, which is to ignore the status field entirely.
Unfortunately, there are a whole host of seriously broken MTAs out there, and seriously broken configurations of otherwise good MTAs, and many sites return totally bogus status codes.
I don't see how that could create a problem. The worst thing that could happen is that someone remains subscribed to a list when they should not.
Alternately, they could be unsubscribed because their MTA is returning the wrong error codes - but then that would give their postmaster a good reason to fix the error codes. In this case, they'd be unsubscribed as things stand anyway.
-- Ian Eiloart IT Services, University of Sussex
This may be slightly OT, but I just encountered a major problem with the bounce processing on MM 2.16. Had a large list used for monthly announcements. I realized that since messages only went out once per month, I needed to tighten up the default bounce scoring system, so I set it to remove the user after 1 bounce (score of 1.0). The following day, I saw excessive CPU load on the box, so I checked into it. It was unsubscribing everyone from the list, saying due to a bounce in 2005, they were being unsubscribed!
The strange thing is that no posting had been done to this particular list during the time of the change of the bounce parameters, and the unsubscriptions.
Bob
Bob Puff wrote:
This may be slightly OT, but I just encountered a major problem with the bounce processing on MM 2.16. Had a large list used for monthly announcements. I realized that since messages only went out once per month, I needed to tighten up the default bounce scoring system, so I set it to remove the user after 1 bounce (score of 1.0). The following day, I saw excessive CPU load on the box, so I checked into it. It was unsubscribing everyone from the list, saying due to a bounce in 2005, they were being unsubscribed!
I guess this is really a bug. It is cron/disabled doing the unsubscribing.
Apparently, back in 2005, everyone bounced once and got a score of 1.0, probably due to some MTA issue. This bounce info is now stale, but stale info is not discarded until a subsequent bounce is received. Either no further bounce was received, so nothing was reset, or a more recent bounce replaced the old info with new info, still with score = 1.0. So everyone on the list except recent members who never bounced has a score of 1.0.
Then you set the threshold to 1.0 and triggered the code in cron/disabled which is there for just this case.
The comment in the code is
# Find all the members who are currently bouncing and see if
# they've reached the disable threshold but haven't yet been
# disabled. This is a sweep through the membership catching
# situations where they've bounced a bunch, then the list admin
# lowered the threshold, but we haven't (yet) seen more bounces
# from the member. Note: we won't worry about stale information
# or anything else since the normal bounce processing code will
# handle that.
In your case, the last sentence is the gotcha. It looks like we need to fix that.
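One possible shape for the fix is to have the sweep check staleness itself instead of leaving it to the normal bounce processing. The sketch below is illustrative only: BounceInfo, members_to_disable, and the stale_after parameter are hypothetical names, not Mailman's actual data structures or settings.

```python
import time
from dataclasses import dataclass


@dataclass
class BounceInfo:
    score: float
    last_bounce: float   # seconds since the epoch


def members_to_disable(bounce_info, threshold, stale_after, now=None):
    """Return addresses at or over the threshold whose info is still fresh.

    bounce_info maps address -> BounceInfo. Stale entries are skipped
    rather than acted on, which is the check the sweep currently lacks.
    """
    if now is None:
        now = time.time()
    selected = []
    for address, info in sorted(bounce_info.items()):
        if now - info.last_bounce > stale_after:
            continue          # stale: too old to justify disabling anyone
        if info.score >= threshold:
            selected.append(address)
    return selected
```

With a check like this, a single bounce from 2005 would no longer disable a member just because the threshold was lowered to 1.0.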
-- Mark Sapiro <msapiro@value.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
On 2006-07-28 at 21:31-05 Brad Knowles <brad@stop.mail-abuse.org> wrote:
Unfortunately, there are a whole host of seriously broken MTAs out there, and seriously broken configurations of otherwise good MTAs, and many sites return totally bogus status codes. In many cases, site admins will blindly copy stuff from somewhere else that was horribly broken to begin with and won't understand what's wrong with it before they do the cut-n-paste operation.
Perhaps, but we cannot solve this problem, and there's a fine line between working around stupidity and coddling it.
That said, I would not be opposed to seeing more data on this subject, and possibly giving site admins or list admins an option they can enable that would allow Mailman to pay attention to the status codes. Once that's out there, we could let various people try it out and see how it works in the field, and I would be a very happy guy if I were to be proven wrong in this case.
What further data do you wish to see? I think I've documented the problem well enough. There's no way we can know how many horribly broken sites are out there.
On 2006-07-31 at 10:56+01 Ian Eiloart <iane@sussex.ac.uk> wrote:
I don't see how that could create a problem. The worst thing that could happen is that someone remains subscribed to a list when they should not. Alternately, they could be unsubscribed because their MTA is returning the wrong error codes - but then that would give their postmaster a good reason to fix the error codes. In this case, they'd be unsubscribed as things stand anyway.
Right: the only risk is that bounces coming from a subscriber at a broken site might be ignored, because they look like they're being generated based on the content of certain messages.
IMHO, this risk is negligible. If the operators of the broken site in question get annoyed that Mailman keeps trying to send messages to a non-existent address, they should fix their broken site.
As a compromise, I suggest adding this feature as a bounce processing tunable; for example, "content bounce handling":
Setting:
How should Mailman handle bounces that appear to be related to
content?
Description:
Sometimes a message to a subscriber bounces due to the content
of the message, not because the subscriber's address is
invalid. This option controls how Mailman handles bounces
that appear to be related to the content of messages.
Picking "count the bounce" will cause Mailman to count any
bounce against the bounce threshold, regardless of the reason
why the message bounced. The advantage of this option is that
it is least likely to "miss" bounces. The disadvantage of
this option is that it penalizes subscribers at sites that
correctly indicate why a message bounced.
Picking "forward the bounce to the list owner" will cause
Mailman to forward bounces that seem to be related to the
content of specific messages to the list owner. The advantage
of this option is that the owner will be able to review the
bounce and take appropriate action; the disadvantage is that
the list owner might be overwhelmed by bounces.
Picking "ignore the bounce" will cause Mailman to ignore
bounces that appear to be related to the content of specific
messages. The advantage of this option is that subscribers at
sites that correctly indicate why a message bounced won't be
penalized. The disadvantage is that if a misconfigured site
erroneously indicates that *all* messages are due to content,
then Mailman will never detect bouncing subscribers at that
site.
Choices:
[X] Count the bounce against the threshold.
[ ] Forward the bounce to the list owner.
[ ] Ignore the bounce.
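A minimal sketch of how such a three-way setting might be applied follows. The enum and function names are hypothetical, and deciding whether a bounce is content-related is assumed to happen elsewhere.

```python
from enum import Enum


class ContentBounceHandling(Enum):
    COUNT = "count"       # count the bounce against the threshold (default)
    FORWARD = "forward"   # forward the bounce to the list owner
    IGNORE = "ignore"     # ignore the bounce


def handle_bounce(is_content_related, policy=ContentBounceHandling.COUNT):
    """Return the action to take: 'count', 'forward', or 'ignore'."""
    if not is_content_related:
        return "count"    # ordinary bounces are always counted
    return policy.value
```

Keeping COUNT as the default preserves the existing behaviour for lists that never touch the option.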
Comments?
James
On 2006-08-07 at 15:08-04 James Ralston <qralston+ml.mailman-developers@andrew.cmu.edu> wrote:
As a compromise, I suggest adding this feature as a bounce processing tunable; for example, "content bounce handling":
Upon reflection, this problem is yet another instance of Mailman's fundamental problem with bounce processing: Mailman only tracks bounces received per unit of time, not bounces received per message sent.
As an illustration, consider how much easier life would be if bounce processing included an option like this:
Disable a subscription if [ ] percent or more of the last [ ]
messages bounce.
E.g.:
Disable a subscription if [20] percent or more of the last [10]
messages bounce.
Unfortunately, supporting a feature like this would require maintaining [n] separate databases for every mailing list (where [n] is the number of subscribers to the list), and updating all [n] databases every time a message is sent to the list.
I'm not sure if doing so would be feasible...
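To make the storage cost concrete, here is a rough sketch of the per-subscriber state such a rule would need. BounceWindow is a hypothetical class, and the sketch assumes the outcome of each message (bounced or not) is known when it is recorded. In practice bounces arrive later, so a real implementation would have to record the send first and update the entry when a bounce comes in.

```python
from collections import deque


class BounceWindow:
    """Track outcomes for the last `size` messages sent to one subscriber."""

    def __init__(self, size, percent):
        self.size = size
        self.percent = percent
        self.outcomes = deque(maxlen=size)   # True = bounced, False = delivered

    def record(self, bounced):
        self.outcomes.append(bool(bounced))

    def should_disable(self):
        # Wait for a full window so one early bounce cannot trip the rule.
        if len(self.outcomes) < self.size:
            return False
        return 100 * sum(self.outcomes) >= self.percent * self.size
```

The state per subscriber is bounded by the window size (ten booleans in the example above), so the cost is mainly the update on every message sent, not the space.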
James
James Ralston wrote:
Choices:
[X] Count the bounce against the threshold.
[ ] Forward the bounce to the list owner.
[ ] Ignore the bounce.
Comments?
I thought there already was a "Forward bounces to admin" setting. If not, there should be (and default should be off). Then, this question should be, "Try to interpret content bounces? Y/N".
It gets messy (and often confusing) combining two selections with one question.
IMHO, I really don't care to try to determine content bounces. I've seen many that give no indication of why the message bounced. The bounce processor I wrote for 2.0.x handles these in what I consider an appropriate way:
- If a user has bounces every day messages are delivered, and this continues for x days, they get canned.
- If a day goes by that messages are delivered, and no bounce occurs, bounce info is reset.
If people bounce a message every day for a couple weeks, I consider their ISP broken enough to warrant unsubscription.
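The two rules can be sketched as a small per-subscriber update run once per day. This is not the 2.0.x bounce processor mentioned above; update_streak and its parameters are hypothetical names for illustration.

```python
def update_streak(streak, delivered_today, bounced_today, limit):
    """Return (new_streak, remove) for one subscriber at the end of a day."""
    if not delivered_today:
        return streak, False        # no traffic: the day tells us nothing
    if not bounced_today:
        return 0, False             # a clean delivery day resets bounce info
    streak += 1
    return streak, streak >= limit
```

Days without any list traffic leave the streak untouched, so a low-volume list is not penalized for being quiet.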
Bob
P.S. Tokio & Barry: please don't forget to have a look at the bounce issue I posted about a week or so ago - that's a nasty one!
On 2006-08-07 at 15:28-04 Bob Puff@NLE <bob@nleaudio.com> wrote:
James Ralston wrote:
Choices:
[X] Count the bounce against the threshold.
[ ] Forward the bounce to the list owner.
[ ] Ignore the bounce.
I thought there already was a "Forward bounces to admin" setting.
No, that's another patch I'm working on. ;)
But that's not the goal here. The goal of the "forward the bounce" option here is "I want to see what type of stuff would be ignored if I picked 'Ignore the bounce'." In other words, it's a mechanism to permit a list owner to make a determination whether the "Ignore the bounce" setting is appropriate for his specific list, which Brad obliquely suggested.
IMHO, I really don't care to try to determine content bounces.
As a list owner, you shouldn't need to care. Mailman should just Do The Right Thing. My argument is that ignoring content-related bounces is the Right Thing.
Actually, I can phrase it more strongly: as it currently stands, Mailman's lack of attention to content-related bounces *requires* me to violate RFC2822. If I refuse to do so, our users' subscriptions at remote Mailman sites will be disabled.
James
At 4:26 PM -0400 2006-08-07, James Ralston wrote:
As a list owner, you shouldn't need to care. Mailman should just Do The Right Thing. My argument is that ignoring content-related bounces is the Right Thing.
The problem is determining, in a programmatic and systematic way, what really is a content-related bounce and what might mistakenly appear to be a content-related bounce, and the converse.
Then look at what happens when you make the guess the wrong way, what potential additional "cost" there may be to the system for a false positive versus a false negative, and add some weightings to the situation so as to try to minimize the overall drawbacks to such a technique.
The SpamAssassin people do this kind of analysis on a massive amount of spam that they have collected over the years, when re-running their complete collection of rule weightings to try to find an optimum setting.
Problem is, it takes them something like a month to make a single complete run through all the rules with all the input spam, to come up with a given proposed set of weightings -- and this is on a large set of distributed servers, in a manner somewhat akin to SETI@Home. At that point, they're ready to release a new version, because more rules and techniques have been introduced since the last version they released and the weights have also been updated, and they start the whole process all over again.
Now, we're not talking about something quite that intensive, but it could still be a pretty big affair to make sure that we're striking the proper balance of risking false positives versus false negatives.
As it stands today, it's just some people talking about abstract theory. No one has collected any appreciable amount of bounce information to tell us what the real-world picture is at their site.
If you want to move this discussion beyond the theory stage, I'd suggest that you start collecting some data.
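Collecting that kind of data could start with something as simple as tallying the Action and Status fields seen in saved bounces. The sketch below uses Python's standard email package; tally_statuses is a hypothetical helper, and how the raw bounce messages are gathered is left to the site.

```python
import email
from collections import Counter


def tally_statuses(raw_messages):
    """Count (action, status) pairs across an iterable of raw bounces."""
    counts = Counter()
    for raw in raw_messages:
        msg = email.message_from_string(raw)
        for part in msg.walk():
            if part.get_content_type() != "message/delivery-status":
                continue
            if not part.is_multipart():
                continue          # malformed part: no header blocks to read
            for block in part.get_payload():
                action = (block.get("Action") or "").strip().lower()
                if not action:
                    continue      # per-message block, no recipient data
                status = (block.get("Status") or "").split()
                counts[(action, status[0] if status else "(none)")] += 1
    return counts
```

Comparing the tally against what a human concludes from reading the same bounces would show how often the status codes are misleading.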
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 7 August 2006 20:44:07 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
At 4:26 PM -0400 2006-08-07, James Ralston wrote:
As a list owner, you shouldn't need to care. Mailman should just Do The Right Thing. My argument is that ignoring content-related bounces is the Right Thing.
The problem is determining, in a programmatic and systematic way, what really is a content-related bounce and what might mistakenly appear to be a content-related bounce, and the converse.
No, that isn't the problem. The RFC says how to do this, and we should trust the RFC. If people have broken servers then actually there's nothing that can go wrong which isn't already going wrong.
If you want to move this discussion beyond the theory stage, I'd suggest that you start collecting some data.
I can't see that data is required. There are two categories of error, and the consequences are neutral in both cases:
A message is labelled as a content bounce when it's really a recipient bounce. The consequence is that the recipient stays subscribed. This isn't a real problem. The worst that happens is a bit of extra traffic, or that the admin reverts to the old behaviour.
The message is labelled as a recipient bounce when it's really a content bounce. This is the status quo. People may already be incorrectly unsubscribed. This is a real problem when it occurs. It can happen because a server refuses messages with illegal (RFC non-compliant) headers, as well as when the content is offensive.
-- Ian Eiloart IT Services, University of Sussex
At 10:55 AM +0100 2006-08-08, Ian Eiloart wrote:
The problem is determining, in a programmatic and systematic way, what really is a content-related bounce and what might mistakenly appear to be a content-related bounce, and the converse.
No, that isn't the problem. The RFC says how to do this, and we should trust the RFC. If people have broken servers then actually there's nothing that can go wrong which isn't already going wrong.
And if Yahoo jumps off a bridge because they think the RFC tells them to do that, what should we do? And what if AOL, pobox.com, hotmail.com, and all the other big providers make the same mistake?
When it comes to parsing the actual reasons behind a message bouncing, the RFC is not sufficient. Indeed, I'm not convinced that it's even necessary. And you'd have to be specific which RFC you're talking about, because some of them are mutually incompatible.
Trust me, this is a more complex subject than you think it is.
And just blindly applying what you think is the right solution is likely to cause a lot more pain for you and for everyone else, and not necessarily for any real good purpose when everything is said and done.
I can't see that data is required.
Then go ahead and make the change, and then tell us how it works out for you.
There are two categories of error, and the consequences are neutral in both cases:
- A message is labelled as a content bounce when it's really a recipient bounce.
Or some other kind of bounce.
The consequence is that the recipient stays subscribed. This isn't a real problem. The worst that happens is a bit of extra traffic, or that the admin reverts to the old behaviour.
This can be a very real problem for admins that are running larger sites, and already handling large amounts of traffic. If the admin is forced to disable some new feature in order to restore his site to a reasonably well working state, then he's not likely to make that upgrade.
- The message is labelled as a recipient bounce when it's really a content bounce.
Or some other kind of bounce.
This is the status quo. People may already be incorrectly unsubscribed. This is a real problem when it occurs. It can happen because a server refuses messages with illegal (RFC non-compliant) headers, as well as when the content is offensive.
Or when the server looks up your IP address and finds it on a black list, or thinks it finds it on a blacklist which no longer exists, or any number of other problems.
We can't fix the entire Internet, and when people have misconfigured their servers to generate inappropriate types of bounces, there's not much we can do to help them.
In my experience, any kind of guess that we might be able to make programmatically is usually wrong for a statistically significant subset of the population.
Moreover, the potential damage from false positives or false negatives is usually worse for the collective whole than simply not trying to guess one way or the other, and to just give people enough latitude that it shouldn't matter.
But you've got the opportunity here to generate real-world data on how this process works, and to put the whole issue to rest. Please let us know how it works out for you.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 8 August 2006 05:10:41 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
Or some other kind of bounce.
This is the status quo. People may already be incorrectly unsubscribed. This is a real problem when it occurs. It can happen because a server refuses messages with illegal (RFC non-compliant) headers, as well as when the content is offensive.
Or when the server looks up your IP address and finds it on a black list, or thinks it finds it on a blacklist which no longer exists, or any number of other problems.
It doesn't matter. The point is that they're NOT saying the recipient doesn't exist. Currently, we treat this situation as if they do, and we shouldn't!
-- Ian Eiloart IT Services, University of Sussex
--On 8 August 2006 05:10:41 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
When it comes to parsing the actual reasons behind a message bouncing, the RFC is not sufficient. Indeed, I'm not convinced that it's even necessary. And you'd have to be specific which RFC you're talking about, because some of them are mutually incompatible
The original proposal referred to error codes defined in rfc1893. The parsing of such error codes is relatively trivial.
<http://www.ietf.org/rfc/rfc1893.txt>
Which RFCs are incompatible?
Here's a more specific proposal:
If there is no rfc1893 error code, then treat the message as a recipient bounce (count it against the recipient address).
If there is an rfc 1893 code, and the first digit is '2' or '4' then do nothing.
If there is an rfc 1893 code and the first digit is '5', then do nothing, except as noted below, where count+ means count against the recipient, count- means count in favour, admin means optionally notify the administrator (who may wish to notify the remote admin, or may even happen to be the remote admin).
X.1.0 Other address status
X.1.1 Bad destination mailbox address                  count+
X.1.2 Bad destination system address                   count+
X.1.3 Bad destination mailbox address syntax           count+
X.1.4 Destination mailbox address ambiguous            count+
X.1.5 Destination mailbox address valid                count-
X.1.6 Mailbox has moved                                count+
X.1.7 Bad sender's mailbox address syntax
X.1.8 Bad sender's system address
X.2.0 Other or undefined mailbox status
X.2.1 Mailbox disabled, not accepting messages         count+
X.2.2 Mailbox full
X.2.3 Message length exceeds administrative limit.
X.2.4 Mailing list expansion problem                   admin
X.3.0 Other or undefined mail system status
X.3.1 Mail system full                                 admin
X.3.2 System not accepting network messages
X.3.3 System not capable of selected features
X.3.4 Message too big for system
X.4.0 Other or undefined network or routing status
X.4.1 No answer from host
X.4.2 Bad connection
X.4.3 Routing server failure
X.4.4 Unable to route
X.4.5 Network congestion
X.4.6 Routing loop detected
X.4.7 Delivery time expired
X.5.0 Other or undefined protocol status
X.5.1 Invalid command
X.5.2 Syntax error
X.5.3 Too many recipients
X.5.4 Invalid command arguments
X.5.5 Wrong protocol version
X.6.0 Other or undefined media error
X.6.1 Media not supported
X.6.2 Conversion required and prohibited
X.6.3 Conversion required but not supported
X.6.4 Conversion with loss performed
X.6.5 Conversion failed
Actually, for the moment, I'd be happy if everything except count+ were ignored (but logged). In future, it may be possible to implement other desirable behaviours for other error codes. In particular, the local sysadmin may want to know about many of these things when the recipient is local.
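The table above lends itself to a simple lookup. The following sketch is one way it could be expressed; action_for and the action strings are hypothetical names, and treating an unparseable status the same as a missing one is an assumption, not something the proposal specifies.

```python
# subject.detail -> action, taken from the table above.
# Codes not listed here result in no action.
ACTIONS = {
    "1.1": "count+", "1.2": "count+", "1.3": "count+", "1.4": "count+",
    "1.5": "count-", "1.6": "count+",
    "2.1": "count+",
    "2.4": "admin",
    "3.1": "admin",
}


def action_for(status):
    """Map a Status field value to 'count+', 'count-', 'admin', or 'nothing'."""
    fields = (status or "").split()
    if not fields:
        return "count+"             # no code at all: treat as recipient bounce
    parts = fields[0].split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return "count+"             # assumption: unparseable means no code
    if parts[0] != "5":
        return "nothing"            # 2.X.X and 4.X.X are never counted
    return ACTIONS.get(parts[1] + "." + parts[2], "nothing")
```

Because every unlisted code maps to "nothing", new codes from later RFCs are ignored by default and never counted against a recipient.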
-- Ian Eiloart IT Services, University of Sussex
I've deliberately not quoted the original message, but Brad is 100% on the money in his posts. I see a bunch of bounces that one of my highly-customized Mailman lists gets. At one point, I tried keeping up with just the parsing of the bounce messages, but soon gave up. There are too many strange ones, and ones that give you no clue for the reason of the bounce... some give you no clue WHO bounced! I've even seen some that come back from different domains.
Like Brad said, the KISS rule really is a good idea.
Bob
--On 8 August 2006 11:14:22 -0400 Bob Puff <bob@nleaudio.com> wrote:
I've deliberately not quoted the original message, but Brad is 100% on the money in his posts. I see a bunch of bounces that one of my highly-customized Mailman lists gets. At one point, I tried keeping up with just the parsing of the bounce messages, but soon gave up. There are too many strange ones, and ones that give you no clue for the reason of the bounce... some give you no clue WHO bounced! I've even seen some that come back from different domains.
Like Brad said, the KISS rule really is a good idea.
Bob
But, the idea is NOT to try to parse bounce *messages*, it's to parse bounce *codes*.
-- Ian Eiloart IT Services, University of Sussex
At 4:44 PM +0100 2006-08-08, Ian Eiloart wrote:
But, the idea is NOT to try to parse bounce *messages*, it's to parse bounce *codes*.
Here's the deal.
You think it's going to be trivially easy to add this new feature, and to parse the codes correctly, with the correct outcome, all you have to do is follow the RFC and everything will be hunky-dory.
I think that the problem is a lot more complex than that, with many sites giving totally inappropriate response codes for the real underlying reason, and trying to parse them is likely to cause more problems than it solves. Moreover, I think this is going to add unneeded complexity to the system for what I believe will be, at best, relatively minimal benefit. Worse, in order to adhere to the spirit of this idea and make the concept actually work, we'll have to get into trying to parse the actual wording of the error messages, and then we'll have to get into internationalization issues of all those words we're trying to parse, because I'm pretty sure that words like "virus" are not the same in Polish, Chinese, Farsi, and whatever other various languages we have to support.
So, here's the solution. You go implement the code to do what you're talking about, and see how it works on your site. Make sure to collect all the bounce messages in question, and the action that was taken by the system. This way, we humans can compare the performance of your new code. Once you're done tweaking the system to work as well as you can manage, come back to us and show us your code and your input data, and prove to us how well it works.
But without a patch and a strong indication that this is a significant improvement for relatively little added complexity, I don't think you're going to get any further traction in this issue.
That's it. I've said my piece. Unless you have something new to add to the discussion, I'd suggest you do us all a favour and let this drop.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 8 August 2006 12:16:35 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
I think that the problem is a lot more complex than that, with many sites giving totally inappropriate response codes for the real underlying reason, and trying to parse them is likely to cause more problems than it solves.
So, give us one example of a problem that could arise.
-- Ian Eiloart IT Services, University of Sussex
At 10:49 AM +0100 2006-08-09, Ian Eiloart wrote:
So, give us one example of a problem that could arise.
I don't have to. If you want people to believe you, then you need to prove that you can create a significant enhancement to Mailman in this area, without creating a significant increase in the complexity of the system, or at least with a reasonable balance of complexity versus the modified features you're proposing.
I don't have anything to prove here. You do.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 8 August 2006 12:16:35 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
Worse, in order to adhere to the spirit of this idea and make the concept actually work, we'll have to get into trying to parse the actual wording of the error messages, and then we'll have to get into internationalization issues of all those words we're trying to parse, because I'm pretty sure that words like "virus" are not the same in Polish, Chinese, Farsi, and whatever other various languages we have to support.
Rubbish. The codes are numbers.
-- Ian Eiloart IT Services, University of Sussex
At 10:49 AM +0100 2006-08-09, Ian Eiloart wrote:
Rubbish. The codes are numbers.
Right. But, as I said, the codes aren't going to be sufficient. They're going to be misleading in many cases, causing us to make inappropriate conclusions based on faulty information. Therefore, if you want to uphold the spirit of what you're asking for, you're going to have to look deeper and try to start parsing the actual text of the bounce message to try to better understand what the real reason was. And that way lies madness.
Otherwise, RFC-1893 would have been sufficient to answer all possible questions about this feature, and all MTA authors and all mail systems administrators would have been able to perfectly follow those guidelines. We wouldn't have needed RFC 3463, or the updates from RFCs 3886, 4468, etc....
The fact that there was some perceived ambiguity led to confusion and inappropriate implementation, and incompatibility. Which led to newer RFCs being written on this subject in order to try to clarify the situation and hopefully lead to greater compatibility. Unfortunately, there's still lots of old code and old installations out there, and they are unable or unwilling to upgrade, so now you've got all this legacy code you're saddled with, along with all this new code as well.
So, if you build your parser to handle exclusively RFC 4468 codes, and someone has written or implemented an MTA using codes from 1893 that they misinterpreted, you're probably going to have a hard time figuring out what they meant and why.
Keep in mind that you're not only fighting MTA authors here, but also the vast majority of clueless MTA administrators that take a recommended configuration from someone else that is likely to be wrong and apply it inappropriately at their site, and thus perpetuate and worsen the problem far beyond the level of damage that MTA authors would ever possibly be capable of -- and MTA authors are capable of screwing up a whole lot of stuff.
So, show me a parser that fully understands all possible correct interpretations of these RFCs, plus all possible incorrect but likely interpretations of these RFCs, and we might have something useful to talk about.
-- Brad Knowles, <brad@stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
Founding Individual Sponsor of LOPSA. See <http://www.lopsa.org/>.
--On 9 August 2006 12:18:13 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
Otherwise, RFC-1893 would have been sufficient to answer all possible questions about this feature, and all MTA authors and all mail systems administrators would have been able to perfectly follow those guidelines. We wouldn't have needed RFC 3463, or the updates from RFCs 3886, 4468, etc....
Neither RFC makes any significant changes here.
"Appendix B - Changes from RFC1893
Changed Authors contact information.
Updated required standards boilerplate.
Edited the text to make it spell-checker and grammar checker compliant.
Modified the text describing the persistent transient failure to more closely reflect current practice and understanding.
Eliminated the restriction on the X.4.7 codes limiting them to persistent transient errors."
RFC 4468 adds two new codes.
So, show me a parser that fully understands all possible correct interpretations of these RFCs, plus all possible incorrect but likely interpretations of these RFCs, and we might have something useful to talk about.
It's not necessary to understand all interpretations. There are a few codes that mean the remote address isn't available. When we see any other code, we should not count the bounce against the specific address, because the error isn't related to that address.
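A minimal sketch of the rule described above, assuming that the "few codes" are permanent failures whose subject sub-code is 1 (addressing) or 2 (mailbox). The function name, the choice of sub-codes, and the fallback for unparseable values are all assumptions for illustration, not code from Bouncers/DSN.py:

```python
import re

# Assumption: only X.1.XXX (addressing) and X.2.XXX (mailbox) statuses
# say something about the recipient address itself.
ADDRESS_SUBJECTS = {1, 2}

# class "." subject "." detail, e.g. "5.1.1"; trailing comments are ignored.
_STATUS_RE = re.compile(r"^([245])\.(\d{1,3})\.(\d{1,3})\b")


def counts_against_address(status):
    """Return True if a DSN Status value should count against the recipient.

    Values that cannot be parsed return True, which keeps the existing
    count-everything behaviour for MTAs that do not follow the RFCs.
    """
    match = _STATUS_RE.match(status.strip())
    if match is None:
        return True
    status_class = int(match.group(1))
    subject = int(match.group(2))
    if status_class != 5:
        return False  # only permanent failures are considered here
    return subject in ADDRESS_SUBJECTS
```

With these assumptions, "5.1.1" counts against the subscriber, while "5.7.0" (the spam rejection from the original report) and "5.6.0" do not.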
-- Ian Eiloart IT Services, University of Sussex
At 11:21 AM +0100 2006-08-10, Ian Eiloart wrote:
It's not necessary to understand all interpretations. There are a few codes that mean the remote address isn't available. When we see any other code, we should not count the bounce against the specific address, because the error isn't related to that address.
As I said, show me the code, and then show me all the actual bounces and how they were interpreted by the code. I'd like to see how closely reality actually hews to the RFCs.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 7 August 2006 15:28:46 -0400 "Bob Puff@NLE" <bob@nleaudio.com> wrote:
James Ralston wrote:
Choices:
  [X] Count the bounce against the threshold.
  [ ] Forward the bounce to the list owner.
  [ ] Ignore the bounce.
Comments?
I thought there already was a "Forward bounces to admin" setting. If not, there should be (and the default should be off). Then, this question should be, "Try to interpret content bounces? Y/N".
It gets messy (and often confusing) combining two selections with one question.
IMHO, I really don't care to try to determine content bounces. I've seen many that give no indication of why the message bounced. The bounce processor I wrote for 2.0.x handles these in what I consider an appropriate way:
- If a user has bounces every day messages are delivered, and this continues for x days, they get canned.
- If a day goes by that messages are delivered, and no bounce occurs, bounce info is reset.
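The two rules above can be sketched as a small streak counter. This is only an illustration of the described behaviour, not the 2.0.x processor itself; the function name and the `max_days` parameter (the "x days" in the text) are hypothetical:

```python
def update_streak(streak, bounced, max_days):
    """Update a subscriber's bounce streak for one delivery day.

    Call this once per day on which the list actually delivered messages.
    `streak` is the number of consecutive delivery days with a bounce.
    Returns (new_streak, remove), where `remove` is True once the
    subscriber has bounced on `max_days` consecutive delivery days.
    """
    if not bounced:
        # A delivery day without a bounce resets the bounce info.
        return 0, False
    streak += 1
    return streak, streak >= max_days


# Usage: three bouncing days, one clean day, then three more bouncing days.
streak = 0
history = []
for bounced in [True, True, True, False, True, True, True]:
    streak, remove = update_streak(streak, bounced, max_days=3)
    history.append(remove)
```

Days on which the list sends nothing are simply never passed to the function, so they neither extend nor reset the streak.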
I've been unsubscribed from Yahoo lists because my server rejects illegally formatted messages. That's good enough reason for me to support distinguishing between recipient errors and content errors.
If people bounce a message every day for a couple weeks, I consider their ISP broken enough to warrant unsubscription.
In my case, it wasn't my ISP or my server that was at fault. It was the message *sender's* mail client that was constructing faulty headers.
Bob
P.S. Tokio & Barry: please don't forget to have a look at the bounce issue I posted about a week or so ago - that's a nasty one!
-- Ian Eiloart IT Services, University of Sussex
At 10:46 AM +0100 2006-08-08, Ian Eiloart quoted "Bob Puff@NLE" <bob@nleaudio.com>:
If people bounce a message every day for a couple weeks, I consider their ISP broken enough to warrant unsubscription.
In my case, it wasn't my ISP or my server that was at fault. It was the message *sender's* mail client that was constructing faulty headers.
Right, but that's Yahoo. That's not Mailman. Mailman is unlikely to be doing this sort of thing. If anything, it would most likely be scrubbing the messages in order to remove illegal formatting.
I can understand the overall desire in this specific case, but I'm having a hard time painting Mailman with that same brush, which would then reasonably lead to a requirement to make significant changes to the Mailman bounce handling scheme in order to try and guess as to what was the real reason behind a particular type of bounce.
I'm not saying that this isn't something we should at least look at seriously; I'm just saying I don't quite buy this particular motivation, at least not as it applies to Mailman.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 8 August 2006 05:00:17 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
At 10:46 AM +0100 2006-08-08, Ian Eiloart quoted "Bob Puff@NLE" <bob@nleaudio.com>:
If people bounce a message every day for a couple weeks, I consider their ISP broken enough to warrant unsubscription.
In my case, it wasn't my ISP or my server that was at fault. It was the message *sender's* mail client that was constructing faulty headers.
Right, but that's Yahoo. That's not Mailman. Mailman is unlikely to be doing this sort of thing. If anything, it would most likely be scrubbing the messages in order to remove illegal formatting.
Really, Mailman is fixing up message header syntax?
I can understand the overall desire in this specific case, but I'm having a hard time painting Mailman with that same brush, which would then reasonably lead to a requirement to make significant changes to the Mailman bounce handling scheme in order to try and guess as to what was the real reason behind a particular type of bounce.
I'm not saying that this isn't something we should at least look at seriously; I'm just saying I don't quite buy this particular motivation, at least not as it applies to Mailman.
-- Ian Eiloart IT Services, University of Sussex
At 3:08 PM -0400 2006-08-07, James Ralston wrote:
Perhaps, but we cannot solve this problem, and there's a fine line between working around stupidity and coddling it.
Right, but if we can't fix the problem of the multitude of broken MTAs out there, and the fact that most of them probably don't assign the appropriate extended response codes in accordance with the RFCs, then the likelihood is that we are going to be led to make the wrong guesses based on the response we get.
I think the question is how damaging are those wrong guesses, and as compared to not making any attempt to guess one way or the other and just treat all bounces as the same?
Without any further detailed information, my gut feeling is that we're better off not trying to guess what the real reason was for a given bounce, but to just treat them all the same and to give enough latitude that people don't get unsubscribed with just a single bounce (or whatever).
What further data do you wish to see? I think I've documented the problem well enough. There's no way we can know how many horribly broken sites are out there.
Save a copy of each and every bounce you get over an extended period of time (this may require modifications to the source code), and then try to categorize them by the easy-to-parse numeric response code versus the harder-to-parse description, and actually find out how the cookie crumbles.
Describing the one particular type of sub-problem that you've run into doesn't really help us in this situation, not when you're talking about changing the behaviour of an entire subsystem in order to accommodate your one specific issue.
Instead, you need to go on a quest to obtain large amounts of data that demonstrate how easy (or hard) it is to determine the real reason why some message bounced and then figure out how you can take that information and modify the source code to suit.
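As a starting point for that kind of survey, here is a rough sketch that tallies the Status field of every failed recipient across a set of saved bounces. It assumes the bounces are stored as raw RFC 822 messages and relies on the standard `email` package, which parses each message/delivery-status part into a list of header blocks. The function names are hypothetical, and bounces that are not DSNs at all are simply skipped:

```python
import email
from collections import Counter


def failed_statuses(raw_bytes):
    """Yield the Status code of each failed per-recipient block in one DSN."""
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        blocks = part.get_payload()
        if not isinstance(blocks, list):
            continue  # malformed report; nothing to extract
        for block in blocks:
            action = str(block.get("Action", "")).strip().lower()
            if action != "failed":
                continue  # per-message block, or delayed/relayed/etc.
            fields = str(block.get("Status", "")).split()
            # Keep only the numeric code; drop any trailing comment.
            yield fields[0] if fields else "(missing)"


def tally_statuses(raw_messages):
    """Count how often each Status code appears across saved bounces."""
    counts = Counter()
    for raw in raw_messages:
        counts.update(failed_statuses(raw))
    return counts
```

Comparing the resulting counts against the human-readable text of the same bounces would show how often the numeric code and the stated reason actually agree.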
Right: the only risk is that bounces coming from a subscriber at a broken site might be ignored, because they look like they're being generated based on the content of certain messages.
I'm not convinced that's the only risk, and I'm not convinced that the potential consequences are that minor. But if you can provide sufficient evidence to show that you are correct, at least for the users on your site, I'm willing to be convinced.
IMHO, this risk is negligible. If the operators of the broken site in question get annoyed that Mailman keeps trying to send messages to a non-existent address, they should fix their broken site.
Well, if the windmill turns out to be Microsoft, you might want to seriously think about whether or not you really want to continue trying to tilt at that thing.
You might want to look into how big this problem could potentially be, before you decide to just casually blow off any possible consequences.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 7 August 2006 20:35:06 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
At 3:08 PM -0400 2006-08-07, James Ralston wrote:
Perhaps, but we cannot solve this problem, and there's a fine line between working around stupidity and coddling it.
Right, but if we can't fix the problem of the multitude of broken MTAs out there, and the fact that most of them probably don't assign the appropriate extended response codes in accordance with the RFCs, then the likelihood is that we are going to be led to make the wrong guesses based on the response we get.
We already do that. This is the problem that we're trying to solve, not a new problem introduced by the proposal!
-- Ian Eiloart IT Services, University of Sussex
At 10:56 AM +0100 2006-08-08, Ian Eiloart wrote:
Right, but if we can't fix the problem of the multitude of broken MTAs out there, and the fact that most of them probably don't assign the appropriate extended response codes in accordance with the RFCs, then the likelihood is that we are going to be led to make the wrong guesses based on the response we get.
We already do that. This is the problem that we're trying to solve, not a new problem introduced by the proposal!
No, that's precisely the problem -- the proposal does cause new problems that have to be dealt with.
Because of all the broken MTAs out there, I believe that the probability is high that we will be unable to guess correctly what type of bounce we have for a statistically significant subsection of the population, and that the potential consequences of either a false negative or a false positive in this case are higher than taking the K.I.S.S. approach and not making any attempt to guess what type of bounce we're dealing with.
So, feel free to go ahead and make this change and to put this entire issue to rest, at least for the data you've collected from your site.
-- Brad Knowles, <brad@stop.mail-abuse.org>
--On 8 August 2006 05:13:37 -0500 Brad Knowles <brad@stop.mail-abuse.org> wrote:
At 10:56 AM +0100 2006-08-08, Ian Eiloart wrote:
Right, but if we can't fix the problem of the multitude of broken MTAs out there, and the fact that most of them probably don't assign the appropriate extended response codes in accordance with the RFCs, then the likelihood is that we are going to be led to make the wrong guesses based on the response we get.
We already do that. This is the problem that we're trying to solve, not a new problem introduced by the proposal!
No, that's precisely the problem -- the proposal does cause new problems that have to be dealt with.
Well, that's not true if the new default behaviour is the current broken behaviour. Would you accept that?
Because of all the broken MTAs out there, I believe that the probability is high that we will be unable to guess correctly what type of bounce we have for a statistically significant subsection of the population, and that the potential consequences of either a false negative or a false positive in this case are higher than taking the K.I.S.S. approach and not making any attempt to guess what type of bounce we're dealing with.
So, feel free to go ahead and make this change and to put this entire issue to rest, at least for the data you've collected from your site.
-- Ian Eiloart IT Services, University of Sussex
--On 7 August 2006 15:08:56 -0400 James Ralston <qralston+ml.mailman-developers@andrew.cmu.edu> wrote:
Choices:
  [X] Count the bounce against the threshold.
  [ ] Forward the bounce to the list owner.
  [ ] Ignore the bounce.
Comments?
The default should NOT be to count the bounce against the threshold. Let's not assume that all remote servers are broken. I'd guess that the default should be to ignore the bounce.
Another option might be to bounce to the original sender. In fact, that could be a *per user* option, though that would require more work.
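To make the three choices concrete, here is a hedged sketch of how a per-list setting might be applied once a bounce has been judged content-related. The enum, the function, and the idea of a numeric bounce score are assumptions for illustration; whichever value is chosen as the default is exactly the point under debate above:

```python
from enum import Enum


class ContentBounceAction(Enum):
    COUNT = "count"      # count against the threshold (current behaviour)
    FORWARD = "forward"  # forward the bounce to the list owner
    IGNORE = "ignore"    # drop the bounce silently


def apply_policy(score, is_content_bounce, action):
    """Return (new_score, forward_to_owner) for one bounce.

    Bounces that are not content-related are always counted; the
    configured action only affects content-related bounces.
    """
    if not is_content_bounce or action is ContentBounceAction.COUNT:
        return score + 1, False
    if action is ContentBounceAction.FORWARD:
        return score, True
    return score, False
```

Keeping `COUNT` as the default would preserve today's behaviour for sites that do nothing, while `IGNORE` would match the default argued for here.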
-- Ian Eiloart IT Services, University of Sussex
participants (6)
- Bob Puff
- Bob Puff@NLE
- Brad Knowles
- Ian Eiloart
- James Ralston
- Mark Sapiro