Mailman 3 Re: [Mailman-Users] Bounce Options - Mailman-Developers


Re: [Mailman-Users] Bounce Options


barry＠zope.com

Nov. 29, 2001

9:04 p.m.

[Changing followups to mailman-developers as this discussion really belongs there. -BAW]

...

...
...
...
...
"DW" == Dan Wilder <dan@ssc.com> writes:

DW> I guess I'm wondering if anybody recalls the intent of this
DW> code.

The only person who ever had a chance of understanding the intent is John Viega, but he's been removed from Mailman hacking for so long, I doubt even he remembers.

I've looked at this stuff too, and there are a number of things that seem just plain broken to me. I'd like to rewrite it all, but I'm not sure there will be time before 2.1.

-Barry


Dale Newfield

November 2001

9:58 p.m.

New subject: [Mailman-Users] Bounce Options

...

I poked around in this myself a bit ago, w/o much benefit. It's a bit jumbled in there, and I thought it was an indication that either I just didn't understand it or it was broken. After a while I became convinced of the latter, and I got discouraged. :-) Just so it's on your radar when you do take a look at it, I'll mention that we only sometimes get the notification of bounce removals.

-Dale

Bob Puff＠NLE

3:02 a.m.


Yeah, I was just chatting with Barry about this. There are some serious problems with the bounce handling. A user can be removed with only two bounces, if they occur 5 days apart (if you have your bounce handling set to 5 days of continuous bouncing).

I have done some work in fixing up bounce -detection- (at least on my sites, Mailman was catching about 80% of the bounces... now it gets 99.9%). But after it detects it... that part needs help!

Bob

Dan Wilder

11:55 p.m.


On Thu, Nov 29, 2001 at 04:04:23PM -0500, Barry A. Warsaw wrote:

...

[Changing followups to mailman-developers as this discussion really belongs there. -BAW]

...
...
...
...
...
"DW" == Dan Wilder <dan@ssc.com> writes:
DW> I guess I'm wondering if anybody recalls the intent of this
DW> code.
The only person who ever had a chance of understanding the intent is John Viega, but he's been removed from Mailman hacking for so long, I doubt even he remembers.

Hmm.

So what's a reasonable intent for bounce handling?

Here's a sketch. No doubt I misunderstand important points. Perhaps others would be kind enough to comment.

Presuming the list is configured for automatic bounce handling at all, it would seem reasonable to claim that there are circumstances under which bounce handling might unsubscribe or disable mail to a subscriber.

The sporadic bounce probably shouldn't cause this sort of action. So, there should be some forgiveness mechanism in place.

Several bounces over a short period of time might reasonably be forgiven, or treated as a single bounce. Many situations that will cause a bounce involve some misconfiguration which the conscientious sysadmin will shamefacedly correct as soon as it is brought to his or her attention. A heavily trafficked list might not want to unsubscribe even members who cause several bounces, providing these fall within a short period of time.

Several bounces over a longer period of time might be cause for suspension, even if posts are accepted between.
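One way to read "several bounces over a short period treated as a single bounce" is to group bounces into episodes. The following is a minimal Python sketch under that reading; the function name and the 3-day window are assumptions for illustration and are not taken from Bouncer.py.

```python
from datetime import datetime, timedelta


def count_bounce_episodes(bounce_times, window=timedelta(days=3)):
    """Count bounces, treating any burst that fits inside `window` as one.

    A new episode starts only when a bounce arrives more than `window`
    after the first bounce of the current episode.
    """
    episodes = 0
    episode_start = None
    for when in sorted(bounce_times):
        if episode_start is None or when - episode_start > window:
            episodes += 1
            episode_start = when
    return episodes


bounces = [
    datetime(2001, 11, 1, 9), datetime(2001, 11, 1, 17),  # same outage
    datetime(2001, 11, 2, 8),                              # still the same outage
    datetime(2001, 11, 20, 12),                            # a separate problem
]
print(count_bounce_episodes(bounces))  # prints 2
```

A suspension rule could then be applied to the episode count rather than to the raw bounce count, so a misconfiguration that is fixed within the window costs the subscriber only one strike.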

The existing bounce handling makes some distinction I don't understand between "fatal" bounces and "nonfatal" bounces.
Is this "no such user" versus "host busy", for example?

--

Dan Wilder <dan@ssc.com> Technical Manager & Editor SSC, Inc. P.O. Box 55549 Phone: 206-782-8808 Seattle, WA 98155-0549 URL http://embedded.linuxjournal.com/

Nigel Metheringham

11:06 a.m.


On Thu, 2001-11-29 at 23:55, Dan Wilder wrote:

...

> So what's a reasonable intent for bounce handling?
>
> Here's a sketch. No doubt I misunderstand important points. Perhaps others would be kind enough to comment.
>
> Presuming the list is configured for automatic bounce handling at all, it would seem reasonable to claim that there are circumstances under which bounce handling might unsubscribe or disable mail to a subscriber.

Definitely. In general I (as list admin) want almost zero involvement here.

...

> The sporadic bounce probably shouldn't cause this sort of action. So, there should be some forgiveness mechanism in place.

Yes - I occasionally bounce stuff from bugtraq because we have a filter on that bounces executable content (i.e. outlook virus hacks), but it also bounces some exploit code :-) So I sometimes bounce 10% of the messages in a day.

...

> Several bounces over a short period of time might reasonably be forgiven, or treated as a single bounce. Many situations that will cause a bounce involve some misconfiguration which the conscientious sysadmin will shamefacedly correct as soon as it is brought to his or her attention. A heavily trafficked list might not want to unsubscribe even members who cause several bounces, providing these fall within a short period of time.
>
> Several bounces over a longer period of time might be cause for suspension, even if posts are accepted between.
>
> The existing bounce handling makes some distinction I don't understand between "fatal" bounces and "nonfatal" bounces.
> Is this "no such user" versus "host busy", for example?

SMTP has immediate and retryable errors
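For reference, SMTP reply codes in the 4xx range signal a transient failure that the sender may retry, and codes in the 5xx range signal a permanent one. Whether the "nonfatal"/"fatal" split in the existing code maps onto this is an assumption; the helper below is only a sketch of the SMTP-level distinction, and its name is made up for illustration.

```python
def classify_smtp_reply(code):
    """Map a numeric SMTP reply code to 'transient', 'permanent', or 'other'."""
    if 400 <= code <= 499:
        return "transient"   # e.g. 450 mailbox busy, 452 insufficient storage
    if 500 <= code <= 599:
        return "permanent"   # e.g. 550 mailbox unavailable / no such user
    return "other"           # 2xx and 3xx replies are not failures
```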

For an announce list - I have a few (similar to mailman-announce) which have traffic in single digits per month (if that high), the rules may need to be different - 2 bounces from a user in 5 days is going to be close to impossible to achieve for example :-)

Unfortunately it's hard to identify correctly delivered messages - so you cannot easily use a "if a message is delivered correctly, reset the death counter" approach - one UK ISP routinely bounces mail to their users that has not been picked up after 30 days (and their bounce messages are currently unparsable).

Nigel.

Dan Wilder

5:01 p.m.


On Fri, Nov 30, 2001 at 11:06:36AM +0000, Nigel Metheringham wrote:

...

On the other hand, removal after four bounces within two months might work OK for you. Provided there's some way to forgive those who also accepted delivery on, say, three consecutive posts.

...

Yes. That could be what's behind some of the difficult-to-understand and maybe broken stuff in Bouncer.py.

No news would seem to be good news. Provided there's been a post or two during the "no-news" time.

An adequate forgiveness mechanism should probably take list traffic into account, but I'll confess I'm having a hard time getting my head around just what would make sense. Everything I think of gets complex too fast. A sign of either an intractable problem or a wrong solution. I can't believe this problem is intractable.

...

Ugh!

--

Dan Wilder <dan@ssc.com> Technical Manager & Editor SSC, Inc. P.O. Box 55549 Phone: 206-782-8808 Seattle, WA 98155-0549 URL http://embedded.linuxjournal.com/

barry＠zope.com

7:36 p.m.


Here are some of my recent "shower" thoughts about bounce handling (i.e. what does Mailman do /after/ it detects a bounce?).

- We can't do any positive delivery death-resets because in general we're never informed about successful deliveries. Anything that relies on such notices will be too unreliable.
- We (can) know exactly two things:
  1. how many messages we've sent per period of time
  2. how many bounces we've received in that same period of time from a specific user
- It's probably infeasible to link specific deliveries with specific bounces (we could possibly do it w/VERP, but it'll make things too complicated).
- For simplicity, let's treat non-fatal bounces (some temporary outage) the same as fatal bounces (user goes away).
- We want to keep the knobs that a list admin can twiddle to a minimum, and make them completely obvious.
- Provide a multi-phase disposition to bouncing addresses. I.e. at the first phase we disable them, then we send some disable notifications, then we remove them.
- We need to differentiate b/w disabled-by-bounce and disabled-by-choice.

So, here's my proposal:

Each list has a "bounce start date", which can be the list's creation date if it has one (MM2.1), or some arbitrary time post where we start counting.

We already count the number of messages sent through the list via the post-id. We may need to reset this to zero if we're using an arbitrary t0.

These two pieces of information give us the average number of messages/day being sent through the list.

When a user starts bouncing, we record the start time. We continue to count the number of bounces from this address for some configurable amount of time.

List admin knob 1: For how many consecutive days should an address
be bouncing before we take action?   Proposed default: 14

After that, we look at how many bounces this person had, and the average delivery rate of the list. We can thus calculate roughly the percentage of deliveries that this user bounced.

List admin knob 2: Percentage of total deliveries to the list that
must bounce for an address, in the above time period, for that
address to be automatically disposed of.  Proposed default: 50%

Thus if we sent out an average of 500 messages to the list, and over the last 14 days we saw 250 bounces from bogus@dom.ain, we would dispose of this address. (Side note: we have to keep track of regular vs. digest deliveries separately).
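The arithmetic behind knobs 1 and 2 can be sketched as follows. This is a rough Python illustration, not Mailman code: the function and parameter names are hypothetical, and it assumes the "500 messages" in the example means the number of deliveries expected during the 14-day tracking window.

```python
def should_dispose(posts_since_start, days_since_start, bounces_in_window,
                   window_days=14, threshold_percent=50):
    """Return (decision, percent) for one address after the tracking window.

    posts_since_start / days_since_start is the list's average posting rate
    since its "bounce start date"; scaling that to window_days estimates how
    many deliveries the address should have received in the window.
    """
    expected_deliveries = posts_since_start * window_days / days_since_start
    if expected_deliveries == 0:
        return False, 0.0          # nothing was sent, so nothing to judge
    percent = 100.0 * bounces_in_window / expected_deliveries
    return percent >= threshold_percent, percent


# 3000 posts in 84 days is 500 per 14-day window; 250 bounces is 50%.
print(should_dispose(3000, 84, 250))  # prints (True, 50.0)
```

Because bounces cannot be tied to individual posts, the percentage is only an estimate; regular and digest deliveries would each need their own counts.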

List admin knob 3: Action to take when disposing of a bouncing
address.  "Disable w/ occasional reminders", "Disable w/ one last
notice", "Disable w/o notice", "Remove now w/ one last notice",
"Remove now w/o notice".  Proposed default: "Disable w/ occasional
reminders".

All but the first should be obvious. "Disable w/ occasional reminders" means we disable the address from regular delivery, but start sending them occasional reminders about their disabled delivery. The reminder will contain instructions for re-enabling, as well as a note like "You will receive 3 more reminders over the next 21 days unless you take action".

List admin knob 4: How many days should there be between reminders
to disabled addresses?  Proposed default: 7

List admin knob 5: Total number of reminders to send before second
order disposition occurs?  Proposed default: 4

A person whose membership has been disabled due to bounces must explicitly re-enable delivery via their options page, or via the confirmation cookie contained in the reminder messages.

List admin knob 6: Second order disposition for bouncing members
who do not re-enable their accounts: "Disable w/ one last notice",
"Disable w/o notice", "Remove now w/ one last notice", "Remove now
w/o notice".  Proposed default: Remove now w/ one last notice.

The last knob allows the list administrator to cull all disabled bouncing addresses in one fell swoop. This will be a volatile option that performs an immediate action. Note that only addresses disabled via bouncing will be removed.
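The timing implied by knobs 4 and 5 can be worked out with a few lines of Python. This sketch assumes the first reminder goes out on the day the address is disabled and that second-order disposition happens one interval after the final reminder; neither detail is stated in the proposal, and the function name is hypothetical.

```python
from datetime import date, timedelta


def reminder_schedule(disabled_on, days_between=7, total_reminders=4):
    """Return (reminder dates, date of second-order disposition)."""
    reminders = [disabled_on + timedelta(days=days_between * i)
                 for i in range(total_reminders)]
    last = reminders[-1] if reminders else disabled_on
    dispose_on = last + timedelta(days=days_between)
    return reminders, dispose_on


reminders, dispose_on = reminder_schedule(date(2001, 12, 1))
for when in reminders:
    print("remind on", when)
print("second-order disposition on", dispose_on)
```

With the proposed defaults, the first reminder is followed by three more over the next 21 days, which matches the wording suggested for the reminder text.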

If a member has been bouncing because of temporary problems, they will probably never reach the threshold for automatic phase 1 disposition, given a high enough percentage or a long enough tracking period. Even if they are, and we send out reminders, they should be able to just go to their options page and re-enable (the options page clearly tells them that their account is disabled).

Mailman currently has the problem that we have no idea whether an address is disabled-by-choice or disabled-by-bounce. We need to fix that, but what to do about all the addresses that are currently disabled? We could just treat them all as disabled-by-bounce and send an occasional reminder saying that if they really want to keep their disabled membership, they should go to their options page and re-click on the no-mail option. Anybody who doesn't do this within the second phase gets removed as per above.

Thoughts? -Barry

Bob Puff＠NLE

10:35 p.m.


Barry:

Good thoughts. Let me add something:

On all my lists, if a person is bouncing, I want them removed, not just set to nomail. Otherwise as you mentioned, the nomail list gets bigger and bigger. AFAIK, the only people that should be on the nomail list are those who have signed up as such.

So if you want to do the "I'm about to nuke you, one last chance" thing, I would suggest _another_ bit be used in their settings to identify this.

Also update list_members so that you can sort by these fields. (Shameless plug - my modified list_members already lets you sort by nomail, and digest type. Did this patch get merged into 2.08?)

I'm not so sure if doing an estimated average of messages sent is the right thing. How about something like this:

Each user has three entries: time/date stamp for first bounce, time/date stamp for last bounce, and a bounce counter (16 bits should be fine!)

Upon a bounce, do this:

1. Time/date stamp the last_bounce field with the current time/date.
2. Check the first_bounce field to see if it is null. If so, put the current time/date stamp there too, and set the bounce_counter to 0.
3. Increment the bounce counter. (actually not used in this text.)

Do nothing else at this point.

Now, we know when a regular message goes out, and we also know when a digest goes out. Keep a record of the last x (21?) days' worth of postings (see below).

Now! Once a day, set up a cron script to go through each user entry. Do the following:

1. Examine the first_bounce time/date stamp. If null, skip to the next user.
2. Check out post log to see how many days since first_bounce have had messages. Is it less than X days?
   YES: does last_bounce = the last entry date in the posting log (or today)?
      YES: we're still bouncing, but haven't hit our cutoff yet. Skip to the next user.
      NO: NULL out the first_bounce, and go to the next user. We apparently stopped bouncing, so we need to reset.
   NO: we've hit our age limit. Let's see if he's still bouncing: Does last_bounce = the last entry date in the posting log (or today)?
      YES: NUKE EM!!!
      NO: Null out the first_bounce, and go to the next user. Apparently this guy really lucked out, and stopped bouncing the day before he would have been nuked (remember, we are doing this every day).

The posting log would only need to have a single entry of a date for each day that a message was sent thru the list, i.e., 11/28/01 11/29/01 11/30/01 etc...

This could be updated at the same time the daily_digest is dispatched, or it could be part of the main qrunner system. The key is having an entry here for each day a message is sent out. It is such a small file that you could suck it into memory, and "rotate" it so that it only keeps a history of x+1 entries (say 21).

This allows us to catch and properly detect the bounces for the inactive lists as well.

Notice that I didn't even use the bounce_counter. You could still test for it, but I think continuous bouncing over x days is a better method to use.

Using the above algorithm does have this flaw: if a user bounces one message per day, it will still remove them. But sporadic problems usually don't surface like that.
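The per-bounce update and the once-a-day pass described above could look roughly like this in Python. It is a sketch only: the class, function names and return strings are invented for illustration, dates stand in for "time/date stamps", and the posting log is assumed to be a plain list of dates on which at least one message went out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class BounceInfo:
    first_bounce: Optional[date] = None
    last_bounce: Optional[date] = None
    bounce_count: int = 0


def record_bounce(info, today):
    """Update the three per-user fields when a bounce arrives."""
    info.last_bounce = today
    if info.first_bounce is None:
        info.first_bounce = today
        info.bounce_count = 0
    info.bounce_count += 1


def daily_check(info, post_log, today, max_days=14):
    """Return 'skip', 'wait', 'reset', or 'remove' for one member."""
    if info.first_bounce is None:
        return "skip"
    # Days since the first bounce on which the list actually sent mail.
    posting_days = [d for d in post_log if d >= info.first_bounce]
    latest = max(post_log) if post_log else today
    still_bouncing = info.last_bounce in (latest, today)
    if not still_bouncing:
        info.first_bounce = None      # stopped bouncing, start over
        return "reset"
    return "wait" if len(posting_days) < max_days else "remove"
```

Note that bounce_count is maintained but never consulted, as in the description, so a member who bounces once on the first day and once on the cutoff day is treated the same as one who bounced every message in between.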

Bob

barry＠zope.com

December 2001

7:18 a.m.


...

...
...
...
...
"B" == Bob <bob@nleaudio.com> writes:

B> On all my lists, if a person is bouncing, I want them removed,
B> not just set to nomail.

I envision this as possible via the 1st phase disposition being set to "Remove now" (with or without one last notice).

B> Otherwise as you mentioned, the nomail list gets bigger and
B> bigger.  AFAIK, the only people that should be on the nomail
B> list are those who have signed up as such.

Yes, I want to separate the concepts of disabled-by-choice and disabled-by-bounce. Right now, you've got no idea.

B> So if you want to do the "I'm about to nuke you, one last
B> chance" thing, I would suggest _another_ bit be used in their
B> settings to identify this.

It will likely be implemented as a separate flag, mostly for backwards compatibility. Actually, it's possible that both disabled-by-* states will be represented by new flags so we can try to do /something/ with the current messy state of affairs.

B> Also update list_members so that you can sort by these fields.
B> (Shameless plug - my modified list_members already lets you
B> sort by nomail, and digest type.  Did this patch get merged
B> into 2.08?)

Nope, but it wouldn't anyway. I've consigned 2.0.x to critical (mostly security) fixes only. Otherwise, there's no hope 2.1 will get done. ;)

B> I'm not so sure if doing an estimated average of messages sent
B> is the right thing.

Me neither. It /seems/ reasonable...

B> How about something like this:

B> Each user has three entries: time/date stamp for first bounce,
B> time/date stamp for last bounce, and a bounce counter (16 bits
B> should be fine!)

B> Upon a bounce, do this:

B> 1. Time/date stamp the last_bounce field with the current
B> time/date.  2. Check the first_bounce field to see if it is
B> null.  If so, put the current time/date stamp there too, and
B> set the bounce_counter to 0.  3. Increment the bounce counter.
B> (actually not used in this text.)

B> Do nothing else at this point.

B> Now, we know when a regular message goes out, and we also know
B> when a digest goes out.  Keep a record of the last x (21?)
B> days' worth of postings (see below).

B> Now! Once a day, set up a cron script to go through each user
B> entry.  Do the following:

B> 1. Examine the first_bounce time/date stamp.  If null, skip to
B> the next user.  2. Check out post log to see how many days
B> since first_bounce have had messages.  Is it less than X days?
B>    YES: does last_bounce = the last entry date in the posting
B> log (or today)?  YES: we're still bouncing, but haven't hit our
B> cutoff yet.  Skip to the next user.  NO: NULL out the
B> first_bounce, and go to the next user.  We apparently stopped
B> bouncing, so we need to reset.  NO: we've hit our age limit.
B> Let's see if he's still bouncing: Does last_bounce = the last
B> entry date in the posting log (or today)?  YES: NUKE EM!!!  NO:
B> Null out the first_bounce, and go to the next user.  Apparently
B> this guy really lucked out, and stopped bouncing the day before
B> he would have been nuked (remember, we are doing this every
B> day).

If I understand your approach correctly, the problem is that bounces may be clumped and may not be correlated in time with the postings that cause them. You might send out 10 messages today, but you may not see the bounces for them for a couple of days. And then you might see all 10, one today and nine tomorrow, etc. And there's no feasible way to connect the bounces with the messages they were in reference to. By averaging things out, I'm trying to get a general sense of whether the user is receiving things or not.

The interesting idea you have is the post counter buckets. By keeping a sliding average of only the last 30 days or 60 days our bouncer might adjust better to longer term changes in mailing traffic, while still smoothing out the peaks and valleys.
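A sliding average over per-day post counts needs very little machinery. The following Python sketch uses a bounded deque so that the oldest day drops off automatically; the class name and the 30-day default are assumptions for illustration.

```python
from collections import deque


class PostWindow:
    """Per-day post counts for the most recent `days` days."""

    def __init__(self, days=30):
        self.counts = deque(maxlen=days)   # oldest day falls off automatically

    def close_day(self, posts_today):
        """Record how many messages went out today (zero is a valid value)."""
        self.counts.append(posts_today)

    def average_per_day(self):
        return sum(self.counts) / len(self.counts) if self.counts else 0.0
```

Calling close_day() once per day from a cron job keeps the average tied to recent traffic, so a list that goes quiet or suddenly gets busy shifts the expected delivery count within one window length.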

B> Using the above algorithm does have this flaw: if a user
B> bounces one message per day, it will still remove them.  But
B> sporadic problems usually don't surface like that.

I'm more concerned with the user who fills up his disk and doesn't notice it for a week because they're on vacation. I'd like Mailman to be robust against this, and I think the average non-deliveries over a couple of weeks, with consignment to probation probably catches most of this use case.

Thanks, -Barry

Chuq Von Rospach

11:06 p.m.


On 11/30/01 11:18 PM, "Barry A. Warsaw" <barry@zope.com> wrote:

...

It is, but it's lots of work.

How about something simpler:

Timestamp the FIRST bounce on each day. Then, track bounces for N days (N being configurable per list). If you're allowing a user to bounce for 3 days, then if you have a bounce for day N, and day N-3, then you have a match and the user is disabled. At some time longer than N days, bounce timestamps are deleted, since they're no longer valid.

You can set a default value for the 'typical' list as the default. I tend to think that any moderately busy list (say, more than 2-3 messages a day) would be handled by N=3 to N=5, depending on whether you want to avoid the possibility of long-weekend-daemon breakage or not.

Then add in the "soft bounce = .5 bounce", and soft bounces would extend N. And digests, do they need to be handled differently? I tend to think so. For digests, to be careful, I'd probably start them at soft-bounce, and if their bounces ARE soft, go to .25 bounce...

Trying to count messages and then figure out some percentage of them as having bounced, etc. -- I think from my experience it's overkill. You get almost all of that by keeping track of "we have records of bouncing over this period of time....". You might occasionally nail a person whose system is bouncing every 8th message or some godawful thing, but IMHO, that person ought to thank you for helping them find that out...

For announce, moderated and or things like e-newsletters, with very low volumes, an admin would have to decide what N should be, and likely, set it longer. You might want to keep track of them for 14 days, and if you've bounced at least twice in that time (once 14 days ago, once today), that constitutes a legitimate bounce. For something like an e-newsletter, N might even be 45 or 60 days: imagine a once-a-month beast where you want at least two bounces to prove it's dead.

Hmm. That brings up a glitch in my design. If you have < 1 posting a day, that first posting has to be within some "window" around that N days, you can't depend on it being exactly N days away. So you really need some kind of way of saying "Bounces N days apart, with at least M bounces in that time, and the first bounce within Q days" -- 60 days, 3 or more recorded bounces, first bounce within 5 days of "N". Make sense?

It's still a much simpler accounting system than we currently have, or one that requires tracking what the list does, and trying to guess what bounces should do from that. Let the admin define the policy, which boils down to "how long", "minimum # of bounces in then" and "granularity" (for lack of a better word)
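The three policy values ("how long", "minimum # of bounces in then" and "granularity") can be sketched as a single check over the set of days on which a bounce was seen. This Python version is one possible reading of the rule; the function and parameter names (n_days for N, min_bounces for M, slack_days for Q) are hypothetical.

```python
from datetime import date, timedelta


def should_disable(bounce_days, today, n_days=3, min_bounces=2, slack_days=0):
    """bounce_days: set of dates on which at least one bounce was seen.

    Disable when there is a bounce today, at least `min_bounces` bounce days
    within the last n_days + slack_days, and the oldest of them is at least
    n_days - slack_days old.
    """
    if today not in bounce_days:
        return False
    horizon = today - timedelta(days=n_days + slack_days)
    recent = sorted(d for d in bounce_days if horizon <= d <= today)
    oldest_age = (today - recent[0]).days
    return len(recent) >= min_bounces and oldest_age >= n_days - slack_days
```

Storing days in a set gives the "timestamp the FIRST bounce on each day" behaviour for free, since adding the same date twice has no effect. Entries older than the horizon can be discarded whenever convenient.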

...

This is correct. There are some sites (not as prevalent as they once were) that only send back bounces and admin mail during "off" hours. Since most of them seemed to be European, of course, it meant their bounces showed up in MY prime time, but that's not their problem.... (grin). You can't depend on bounces returning in any kind of "reliable" stream.

...

The above handles that, especially if you throw in the "soft bounce" factoring... You can set "N" to a week, or ten days, if you don't mind having more bounces flowing in and out of the system. But users who need to minimize throughput or are more willing to be strict can cut it shorter.

(one other thought, which some mailers do. When bounces START, switch them to digest mode. It cuts the overhead on all sides, and allows them a chance to fix things before disabling, without all those bounces etc. rolling around... )

Bob Puff＠NLE

12:56 a.m.


Ok, I've thought more about this today, and optimized my code a little.. <g>

Check this out:

min_bounce_days = 5   (max # of days we say it will take for a bounce to come to us)
max_bounce_days = 14  (number of days to allow bouncing)

Once every day (really doesn't matter when), post_counter is incremented ONLY if a message went out that day. This would probably be a 16 bit non-signed number.

There are only three numeric entries (16 bit unsigned as well) needed per user. They are: first_bounce, last_bounce, and bounce_count.

Here's the logic:

Upon receiving a bounce, the user record is called up.

    if user(first_bounce) = null {
        user(first_bounce) = post_counter
        user(bounce_count) = 0
    }
    user(last_bounce) = post_counter;
    user(bounce_count)++;
    print to the log "User x Bounced "user(bounce_count)" times."

Now for the script that gets run once a day:

if there were any posts today, do this:

    post_counter++;
    for all users:
        if user(first_bounce) <> NULL {
            if (post_counter - user(first_bounce)) < min_bounce_days then break out, exit.
            # We don't want to do anything for the first few days.. just keep logging, and exit.

            if (post_counter - user(last_bounce)) > min_bounce_days {
                user(first_bounce) = user(last_bounce) = NULL;
                print to log, "User x stopped bouncing."
            } - exit out.
            # our last bounce was a while ago, so looks like we've been delivering
            # mail ok, so we reset the counters and exit.

            if (user(last_bounce) - user(first_bounce)) > max_bounce_days {
                remove the sucker! or set him to "bouncing nomail"
                also don't forget to set user(first_bounce) = NULL
            }
        }

Pretty simple code, huh? One thing that needs fixing is the wrapping of the 16 bit numbers, and taking care of the proper subtraction when one number wraps.
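The wrap-around subtraction mentioned here is usually handled by masking the difference to the counter's width. A small Python sketch, with a hypothetical helper name, assuming the counter is stored in an unsigned 16-bit field:

```python
MASK16 = 0xFFFF


def counter_diff(newer, older):
    """Distance from `older` to `newer` on a 16-bit counter that wraps at 65536."""
    return (newer - older) & MASK16


print(counter_diff(10, 4))      # prints 6
print(counter_diff(3, 65534))   # prints 5: the counter wrapped between the two readings
```

The result is correct as long as the true distance is less than 65536 message days, which is far beyond any plausible bounce window. In Python itself integers do not wrap, so the mask only matters if the values are persisted in a fixed-width field.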

The way it works is this: When a user fresh bounces, it sets their first_bounce and last_bounce to the current post_counter. Now let's say the user stops bouncing mail. After a while, we'll see that our last_bounce number is getting further from the current post_counter. So if we see that last_bounce gets more than say 5 counts (days) away from current, we can safely assume they haven't been bouncing, and can clear their entries.

If a user is still bouncing, we'll see the last_bounce stay pretty close to post_counter. So now we have to see how long they have been bouncing, which can be done by simply subtracting first_bounce from last_bounce. The code in the preceding paragraph cleans up any old junk, so we can be certain this is current data.

The beauty of this is that even on a list that gets a post every other week, this code will work. Or if you get 100 messages/day, it still works, because it's based on message days. Those who have inactive lists may want to set max_bounce_days to something like 7-8, and will have to realize that this means 7-8 days messages went out.. so if you have 1 message per week, it would take 7-8 weeks for the first user to be flagged.

The other variable, min_bounce_days, lets you tune it for however long you think it would take a mail server to respond with a bounce. I've seen 5 day messages, so that might be a good initial value. The code will wait for that many "message days" before considering a one-time bouncer has stopped bouncing... which is no big deal, because we haven't done anything drastic at this point anyway.
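For comparison, here is the pseudocode above rendered as a minimal Python sketch. The class name and the returned status strings are invented for illustration, "break out, exit" is read as "move on to the next user", and the 16-bit wrap is ignored because Python integers do not overflow.

```python
from dataclasses import dataclass
from typing import Optional

MIN_BOUNCE_DAYS = 5    # message days to wait for a late bounce
MAX_BOUNCE_DAYS = 14   # message days of bouncing to allow


@dataclass
class Member:
    first_bounce: Optional[int] = None   # post_counter value at first bounce
    last_bounce: Optional[int] = None    # post_counter value at latest bounce
    bounce_count: int = 0


def on_bounce(member, post_counter):
    """Called for every detected bounce."""
    if member.first_bounce is None:
        member.first_bounce = post_counter
        member.bounce_count = 0
    member.last_bounce = post_counter
    member.bounce_count += 1


def daily_check(member, post_counter):
    """Run once per message day, after post_counter has been incremented."""
    if member.first_bounce is None:
        return "ok"
    if post_counter - member.first_bounce < MIN_BOUNCE_DAYS:
        return "watching"                 # too early to decide anything
    if post_counter - member.last_bounce > MIN_BOUNCE_DAYS:
        member.first_bounce = member.last_bounce = None
        return "stopped bouncing"         # quiet long enough, reset
    if member.last_bounce - member.first_bounce > MAX_BOUNCE_DAYS:
        member.first_bounce = None
        return "remove"                   # or flag as "bouncing nomail"
    return "watching"
```

Because every comparison is made in message days rather than calendar days, the same two settings behave sensibly on both a busy list and one that posts once a week.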

What do you guys think?

The above code is not in any defined programming language.. :-)

Chuq: I do like the idea of the warning messages. Could probably scan for that "bounce-nomail" flag in the same once-a-day script and generate emails at specified days as you said.

Bob

barry＠zope.com

4:40 a.m.


This makes some sense. I'll try to put all the ideas together and implement some code. I'll make it as modular as possible so we can swap in different logic if we find the algorithm has flaws.

I'll concentrate on that tomorrow night. Right now, sleep. :) -Barry

John W Baxter

4:58 a.m.


At 2:18 -0500 12/1/01, Barry A. Warsaw wrote:

...

I'd prefer that such folks subscribe again if still interested. You've provided that option, so that's not a complaint about the design. Note that the "one last warning" may well bounce in this case.

On the other hand, I was on one list with a one bounce and you're out rule (clearly stated in the welcome message). That's too draconian for even me to use (I had to resubscribe about 4 times over 3 years). They were using Majordomo, and the person in charge had a quick trigger finger. ;-)

--John

-- John Baxter jwblist@olympus.net Port Ludlow, WA, USA

Les Niles

3:49 p.m.


On Sat, 1 Dec 2001 02:18:45 -0500 barry@zope.com (Barry A. Warsaw) wrote:

...

And having the status clearly displayed to the user as "you were disabled due to bounces" should eliminate a lot of queries to the list admin.

...

For the lists I run, a long probation period would work best: the overhead of re-subscribing, both for the user and the list admin, would outweigh any disadvantage of having a pile of disabled subscriptions lying around. IOW, the probation period ought to be configurable.

On a related note, keeping timestamps for the last time the various states changed -- not just disabled-by-bounce but also disabled-by-choice, etc. -- would be useful. It makes it possible to do things like build a vacation-hold interface that allows users to disable their subscriptions for a defined period.
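A delivery status that records both the reason and the time of the last change might be modelled as below. This is a Python sketch with hypothetical names, not a description of how Mailman stores member options; the vacation hold is implemented as a by-choice disable with a resume date.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Delivery(Enum):
    ENABLED = "enabled"
    DISABLED_BY_CHOICE = "disabled-by-choice"
    DISABLED_BY_BOUNCE = "disabled-by-bounce"


@dataclass
class Subscription:
    status: Delivery = Delivery.ENABLED
    changed_on: Optional[date] = None
    resume_on: Optional[date] = None      # only set for a vacation hold

    def set_status(self, status, today, hold_days=None):
        """Change the status and remember when it changed."""
        self.status = status
        self.changed_on = today
        self.resume_on = today + timedelta(days=hold_days) if hold_days else None

    def daily_check(self, today):
        """Re-enable a by-choice hold once its resume date arrives."""
        if (self.status is Delivery.DISABLED_BY_CHOICE
                and self.resume_on is not None and today >= self.resume_on):
            self.set_status(Delivery.ENABLED, today)
```

The same changed_on field would also let a probation period be measured from the day an address was disabled by bounce.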

-les

Dale Newfield

6:08 a.m.


On Fri, 30 Nov 2001, Barry A. Warsaw wrote:

...

Your scheme makes sense--I like the idea that subscribers can wind up "on probation" (assuming the list admin configures the list that way). I understand that this simplifying assumption makes the design much easier to think through. As a list admin, though, I'd like to be able to throw a switch that causes Mailman to completely ignore non-fatal bounces (Assuming it's possible to distinguish between them, which is maybe the can of worms you were trying to avoid opening).

-Dale

barry＠zope.com

7:27 a.m.


...

...
...
...
...
"DN" == Dale Newfield <Dale@Newfield.org> writes:

DN> Your scheme makes sense--I like the idea that subscribers can
DN> wind up "on probation" (assuming the list admin configures the
DN> list that way).  I understand that this simplifying assumption
DN> makes the design much easier to think through.  As a list
DN> admin, though, I'd like to be able to throw a switch that
DN> causes Mailman to completely ignore non-fatal bounces
DN> (Assuming it's possible to distinguish between them, which is
DN> maybe the can of worms you were trying to avoid opening).

I'm not sure it always /is/ possible to distinguish them, especially if you add VERP into the mix. AFAICT, that's the one flaw with VERP. With bounce detection at least you have a wild-ass chance of distinguishing permanent from temporary failures (e.g. DSN).
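For readers unfamiliar with VERP: the idea is to give each recipient a unique envelope sender, so that any bounce identifies the failing address by where it is delivered, whatever its body looks like. The sketch below uses a "+" and "=" encoding, which is one common convention and an assumption here; the function names and the example addresses are made up.

```python
def verp_sender(list_bounces, recipient):
    """Build a per-recipient envelope sender from the list's bounce address."""
    local, domain = list_bounces.split("@", 1)
    rcpt_local, rcpt_domain = recipient.rsplit("@", 1)
    return f"{local}+{rcpt_local}={rcpt_domain}@{domain}"


def verp_recipient(envelope_to):
    """Recover the bouncing address from the address a bounce was sent to."""
    local, _domain = envelope_to.rsplit("@", 1)
    _list_part, encoded = local.split("+", 1)
    rcpt_local, rcpt_domain = encoded.rsplit("=", 1)
    return f"{rcpt_local}@{rcpt_domain}"


sender = verp_sender("mylist-bounces@lists.example.org", "bogus@dom.ain")
print(sender)
print(verp_recipient(sender))
```

The envelope address says who bounced but nothing about why, which is the limitation noted above: telling permanent from temporary failures still requires parsing the bounce itself.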

My simplification assumes that temporary failures really /are/ temporary! I think my approach is robust in the face of probably the most common temporary failure I see as a list admin: a user running out of disk space for a period of several days or a week. Once they get a clue and free up some room, and deliveries start to succeed, I want them to get off probation. Ideally automatically, but as a failsafe, by the probation notices.

-Barry

Bob Puff＠NLE

1:32 a.m.


...

I agree. Mailman has a hard enough time now detecting the bounces; trying to parse out permanent vs non-permanent is IMHO wasted code. I think just setting the bounce numbers higher will let this automagically happen.

Bob

Chuq Von Rospach

7:46 p.m.


On 11/30/01 10:08 PM, "Dale Newfield" <Dale@Newfield.org> wrote:

...

What you might consider is this:

If the bounce is RFC compliant, it's fairly simple to determine "hard" and "soft" bounces, and since they are following the standards, it's not a huge amount of work. Treat a "soft" bounce as half a bounce. That gives the soft bounce twice as long to actually come into effect.

If the bounce is one of the many non-RFC compliant mail systems, treat everything as hard bounces. You don't spend the work trying to read their non-compliant tea leaves, and they have some quiet encouragement to get their act together and become RFC compliant.
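The half-weight rule can be sketched in a few lines of Python. The classification below relies on the enhanced status code class in a standards-style delivery status notification, where a leading "4" marks a persistent transient failure and a leading "5" a permanent one; the function names and the threshold of 5.0 are assumptions for illustration.

```python
def kind_from_status(status):
    """Classify a DSN status string such as '5.1.1' or '4.2.2'."""
    return {"4": "soft", "5": "hard"}.get(status[:1], "unparsed")


def bounce_score(events, threshold=5.0):
    """events: iterable of 'hard', 'soft', or 'unparsed' strings.

    Soft bounces count half; anything we could not parse counts as hard.
    Returns (score, exceeded).
    """
    weights = {"hard": 1.0, "soft": 0.5, "unparsed": 1.0}
    score = sum(weights[kind] for kind in events)
    return score, score >= threshold
```

With a threshold of 5.0, five hard or unparsable bounces trip the limit, while it takes ten soft ones, which gives a temporary problem twice as long to clear.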

...

It'd be really nice if bounce-nomail and user-nomail are separate modes, so we can tell the difference. Beyond that, what would be optimum for me is if bounces went to nomail mode, and then if they're still nomail 30 days later, deleted from the system. That gives a user a chance to "come back" without losing their subscription state, but not hang around forever....

At some point, it'd be nice to be able to validate those other nomail addresses, similar to the monthly password reminder (or part of it). Something that says "you have this account set to this mode. If you want this, click 'here'. If you don't, do nothing and we'll delete it." Where 'click here' takes you to a link that sets the "I'm okay" counter on that nomail status for another 90 days or something...

barry＠zope.com

7:53 p.m.

New subject: [Mailman-Users] Bounce Options

...

...
...
...
...
"CVR" == Chuq Von Rospach <chuqui@plaidworks.com> writes:

CVR> If the bounce is RFC compliant, it's fairly simple to
CVR> determine "hard" and "soft" bounces, and since they are
CVR> following the standards, it's not a huge amount of
CVR> work. Treat a "soft" bounce as half a bounce. That gives the
CVR> soft bounce twice as long to actually come into effect.

Nice. I like that.

CVR> If the bounce is one of the many non-RFC compliant mail
CVR> systems, treat everything as hard bounces. You don't spend
CVR> the work trying to read their non-compliant tea leaves, and
CVR> they have some quiet encouragement to get their act together
CVR> and become RFC compliant.

Yup.

>> I like the idea that subscribers can wind up "on probation"
>> (assuming the list admin configures the list that way).  I
>> understand that this simplifying assumption makes the design
>> much easier to think through.

CVR> It'd be really nice if bounce-nomail and user-nomail are
CVR> separate modes, so we can tell the difference. Beyond that,
CVR> what would be optimum for me is if bounces went to nomail
CVR> mode, and then if they're still nomail 30 days later, deleted
CVR> from the system. That gives a user a chance to "come back"
CVR> without losing their subscription state, but not hang around
CVR> forever....

Yes, definitely. That's part of the plan.

CVR> At some point, it'd be nice to be able to validate those
CVR> other nomail addresses, similar to the monthly password
CVR> reminder (or part of it).  Something that says "you have this
CVR> account sent to this mode. If you want this, click 'here'. If
CVR> you don't, do nothing and we'll delete it. Where 'click here'
CVR> takes you to a link that sets the "I'm okay" counter on that
CVR> nomail status for another 90 days or something...

Yes. That's why the current "disabled-by-idunno" state should be a third, transitional state. We can hook that in with some cronjob to convert folks.

Thanks, -Barry

Dan Wilder

11:55 p.m.

New subject: [Mailman-Users] Bounce Options

On Thu, Nov 29, 2001 at 04:04:23PM -0500, Barry A. Warsaw wrote:

...

[Changing followups to mailman-developers as this discussion really belongs there. -BAW]

...
...
...
...
...
"DW" == Dan Wilder <dan@ssc.com> writes:
DW> I guess I'm wondering if anybody recalls the intent of this
DW> code.

The only person who ever had a chance of understanding the intent is John Viega, but he's been removed from Mailman hacking for so long, I doubt even he remembers.

Hmm.

So what's a reasonable intent for bounce handling?

Here's a sketch. No doubt I misunderstand important points. Perhaps others would be kind enough to comment.

The sporadic bounce probably shouldn't cause this sort of action. So, there should be some forgiveness mechanism in place.

Several bounces over a longer period of time might be cause for suspension, even if posts are accepted between.

The existing bounce handling makes some distinction I don't understand between "fatal" bounces and "nonfatal" bounces.
Is this "no such user" versus "host busy", for example?

--

Dan Wilder <dan@ssc.com> Technical Manager & Editor SSC, Inc. P.O. Box 55549 Phone: 206-782-8808 Seattle, WA 98155-0549 URL http://embedded.linuxjournal.com/

Nigel Metheringham

11:06 a.m.

New subject: [Mailman-Users] Bounce Options

On Thu, 2001-11-29 at 23:55, Dan Wilder wrote:

...

> So what's a reasonable intent for bounce handling?
>
> Here's a sketch. No doubt I misunderstand important points. Perhaps others would be kind enough to comment.
>
> Presuming the list is configured for automatic bounce handling at all, it would seem reasonable to claim that there are circumstances under which bounce handling might unsubscribe or disable mail to a subscriber.

Definitely. In general I (as list admin) want almost zero involvement here.

...

> The sporadic bounce probably shouldn't cause this sort of action. So, there should be some forgiveness mechanism in place.

...

> Several bounces over a short period of time might reasonably be forgiven, or treated as a single bounce. Many situations that will cause a bounce involve some misconfiguration which the conscientious sysadmin will shamefacedly correct as soon as it is brought to his or her attention. A heavily trafficked list might not want to unsubscribe even members who cause several bounces, providing these fall within a short period of time.
>
> Several bounces over a longer period of time might be cause for suspension, even if posts are accepted between.
>
> The existing bounce handling makes some distinction I don't understand between "fatal" bounces and "nonfatal" bounces.
> Is this "no such user" versus "host busy", for example?

SMTP has immediate and retryable errors
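For reference, SMTP reply codes in the 4xx range signal a transient failure that the sender may retry, and codes in the 5xx range signal a permanent one. A minimal Python sketch of that split follows; the function name and the returned labels are made up for illustration and are not taken from Bouncer.py.

```python
def classify_smtp_reply(code):
    """Map a three-digit SMTP reply code to a coarse failure class."""
    if 400 <= code <= 499:
        return "temporary"   # retryable, e.g. mailbox busy or over quota
    if 500 <= code <= 599:
        return "permanent"   # not retryable, e.g. no such user
    return "not-a-failure"   # 2xx/3xx replies are not errors
```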

Nigel.

Dan Wilder

5:01 p.m.

New subject: [Mailman-Users] Bounce Options

On Fri, Nov 30, 2001 at 11:06:36AM +0000, Nigel Metheringham wrote:

...

On the other hand, removal after four bounces within two months might work OK for you. Provided there's some way to forgive those who also accepted delivery on, say, three consecutive posts.

...

Yes. That could be what's behind some of the difficult-to-understand and maybe broken stuff in Bouncer.py.

No news would seem to be good news. Provided there's been a post or two during the "no-news" time.

...

Ugh!

--

Dan Wilder <dan@ssc.com> Technical Manager & Editor SSC, Inc. P.O. Box 55549 Phone: 206-782-8808 Seattle, WA 98155-0549 URL http://embedded.linuxjournal.com/

barry＠zope.com

7:36 p.m.

New subject: [Mailman-Users] Bounce Options

Here are some of my recent "shower" thoughts about bounce handling (i.e. what does Mailman do /after/ it detects a bounce?).

- We can't do any positive delivery death-resets because in general we're never informed about successful deliveries. Anything that relies on such notices will be too unreliable.
- We (can) know exactly two things:
  1. how many messages we've sent per period of time
  2. how many bounces we've received in that same period of time from a specific user
- It's probably infeasible to link specific deliveries with specific bounces (we could possibly do it w/VERP, but it'll make things too complicated).
- For simplicity, let's treat non-fatal bounces (some temporary outage) the same as fatal bounces (user goes away).
- We want to keep the knobs that a list admin can twiddle to a minimum, and make them completely obvious.
- Provide a multi-phase disposition to bouncing addresses. I.e. at the first phase we disable them, then we send some disable notifications, then we remove them.
- We need to differentiate b/w disabled-by-bounce and disabled-by-choice.

So, here's my proposal:

Each list has a "bounce start date", which can be the list's creation date if it has one (MM2.1), or some arbitrary time post where we start counting.

We already count the number of messages sent through the list via the post-id. We may need to reset this to zero if we're using an arbitrary t0.

These two pieces of information give us the average number of messages/day being sent through the list.

When a user starts bouncing, we record the start time. We continue to count the number of bounces from this address for some configurable amount of time.

List admin knob 1: For how many consecutive days should an address
be bouncing before we take action?   Proposed default: 14

After that, we look at how many bounces this person had, and the average delivery rate of the list. We can thus calculate roughly the percentage of deliveries that this user bounced.

List admin knob 2: Percentage of total deliveries to the list that
must bounce for an address, in the above time period, for that
address to be automatically disposed of.  Proposed default: 50%
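The arithmetic behind knobs 1 and 2 might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the function names are invented, the bounce window is measured in whole days from the first recorded bounce, and the percentage is capped at 100 because the list rate is only an average.

```python
from datetime import date


def average_posts_per_day(post_count, start, today):
    """List-wide delivery rate since the 'bounce start date'."""
    days = max((today - start).days, 1)  # avoid dividing by zero on day 0
    return post_count / days


def bounce_percentage(bounces, posts_per_day, window_days):
    """Rough share of deliveries in the window that bounced."""
    expected = posts_per_day * window_days
    if expected <= 0:
        return 0.0
    return min(100.0 * bounces / expected, 100.0)


def should_dispose(bounces, first_bounce, today, posts_per_day,
                   min_days=14, threshold_pct=50.0):
    """Apply knob 1 (days of bouncing) and then knob 2 (percentage)."""
    window = (today - first_bounce).days
    if window < min_days:
        return False
    return bounce_percentage(bounces, posts_per_day, window) >= threshold_pct
```

For example, a list that has sent 60 posts in 30 days averages 2 per day, so 14 bounces over a 14-day window is 50% of the roughly 28 expected deliveries.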

List admin knob 3: Action to take when disposing of a bouncing
address.  "Disable w/ occasional reminders", "Disable w/ one last
notice", "Disable w/o notice", "Remove now w/ one last notice",
"Remove now w/o notice".  Proposed default: "Disable w/ occasional
reminders".

List admin knob 4: How many days should there be between reminders
to disabled addresses?  Proposed default: 7

List admin knob 5: Total number of reminders to send before second
order disposition occurs?  Proposed default: 4

A person whose membership has been disabled due to bounces must explicitly re-enable delivery via their options page, or via the confirmation cookie contained in the reminder messages.

List admin knob 6: Second order disposition for bouncing members
who do not re-enable their accounts: "Disable w/ one last notice",
"Disable w/o notice", "Remove now w/ one last notice", "Remove now
w/o notice".  Proposed default: Remove now w/ one last notice.
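To see the six knobs side by side, here is a hedged Python sketch that collects them with the proposed defaults. The class and field names are hypothetical, and `reminder_offsets` assumes the first reminder goes out one interval after the address is disabled, which the proposal does not specify.

```python
from dataclasses import dataclass

FIRST_ORDER = (
    "Disable w/ occasional reminders",
    "Disable w/ one last notice",
    "Disable w/o notice",
    "Remove now w/ one last notice",
    "Remove now w/o notice",
)
# Knob 6 offers the same choices minus the reminder option.
SECOND_ORDER = FIRST_ORDER[1:]


@dataclass(frozen=True)
class BounceKnobs:
    bounce_days: int = 14                # knob 1
    bounce_percent: float = 50.0         # knob 2
    first_disposition: str = "Disable w/ occasional reminders"  # knob 3
    reminder_interval_days: int = 7      # knob 4
    reminder_count: int = 4              # knob 5
    second_disposition: str = "Remove now w/ one last notice"   # knob 6

    def __post_init__(self):
        if self.first_disposition not in FIRST_ORDER:
            raise ValueError("unknown first-order disposition")
        if self.second_disposition not in SECOND_ORDER:
            raise ValueError("unknown second-order disposition")
        if not 0 < self.bounce_percent <= 100:
            raise ValueError("bounce_percent must be in (0, 100]")


def reminder_offsets(knobs):
    """Days after disabling on which reminders would be sent (assumed)."""
    return [knobs.reminder_interval_days * n
            for n in range(1, knobs.reminder_count + 1)]
```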

Thoughts? -Barry

Bob Puff＠NLE

November 2001

10:35 p.m.

New subject: [Mailman-Users] Bounce Options

Barry:

Good thoughts. Let me add something:

So if you want to do the "I'm about to nuke you, one last chance" thing, I would suggest _another_ bit be used in their settings to identify this.

Also update list_members so that you can sort by these fields. (Shameless plug - my modified list_members already lets you sort by nomail, and digest type. Did this patch get merged into 2.08?)

I'm not so sure if doing an estimated average of messages sent is the right thing. How about something like this:

Each user has three entries: time/date stamp for first bounce, time/date stamp for last bounce, and a bounce counter (16 bits should be fine!)

Upon a bounce, do this:

1. Time/date stamp the last_bounce field with the current time/date.
2. Check the first_bounce field to see if it is null. If so, put the current time/date stamp there too, and set the bounce_counter to 0.
3. Increment the bounce counter. (actually not used in this text.)

Do nothing else at this point.

Now, we know when a regular message goes out, and we also know when a digest goes out. Keep a record of the last x (21?) days' worth of postings (see below).

Now! Once a day, set up a cron script to go through each user entry. Do the following:

1. Examine the first_bounce time/date stamp. If null, skip to the next user.
2. Check out post log to see how many days since first_bounce have had messages. Is it less than X days?
   YES: does last_bounce = the last entry date in the posting log (or today)?
      YES: we're still bouncing, but haven't hit our cutoff yet. Skip to the next user.
      NO: NULL out the first_bounce, and go to the next user. We apparently stopped bouncing, so we need to reset.
   NO: we've hit our age limit. Let's see if he's still bouncing: Does last_bounce = the last entry date in the posting log (or today)?
      YES: NUKE EM!!!
      NO: Null out the first_bounce, and go to the next user. Apparently this guy really lucked out, and stopped bouncing the day before he would have been nuked (remember, we are doing this every day).

The posting log would only need to have a single entry of a date for each day that a message was sent thru the list, i.e., 11/28/01 11/29/01 11/30/01 etc...
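The bounce-time update and the daily pass described above can be sketched in Python roughly as follows. This is an illustration rather than tested Mailman code: `Member`, `record_bounce`, and `daily_check` are invented names, the post log is assumed to be a sorted list of dates, posting days are counted inclusive of the first-bounce date, and an empty log is simply skipped. Both age branches reset when the bouncing has stopped, so the sketch checks that first.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Member:
    first_bounce: Optional[date] = None
    last_bounce: Optional[date] = None
    bounce_counter: int = 0


def record_bounce(member: Member, today: date) -> None:
    """Run whenever a bounce for this member is detected."""
    member.last_bounce = today
    if member.first_bounce is None:
        member.first_bounce = today
        member.bounce_counter = 0
    member.bounce_counter += 1


def daily_check(member: Member, post_log: List[date],
                today: date, max_days: int) -> str:
    """Return 'skip', 'reset' or 'remove' for one member."""
    if member.first_bounce is None or not post_log:
        return "skip"
    still_bouncing = member.last_bounce in (post_log[-1], today)
    if not still_bouncing:
        member.first_bounce = None  # bouncing stopped, start over
        return "reset"
    # Count only days on which the list actually sent something.
    posting_days = sum(1 for day in post_log if day >= member.first_bounce)
    return "skip" if posting_days < max_days else "remove"
```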

This allows us to catch and properly detect the bounces for the inactive lists as well.

Notice that I didn't even use the bounce_counter. You could still test for it, but I think continuous bouncing over x days is a better method to use.

Using the above algorithm does have this flaw: if a user bounces one message per day, it will still remove them. But sporadic problems usually don't surface like that.

Bob

barry＠zope.com

December 2001

7:18 a.m.

New subject: [Mailman-Users] Bounce Options

...

...
...
...
...
"B" == Bob <bob@nleaudio.com> writes:

B> On all my lists, if a person is bouncing, I want them removed,
B> not just set to nomail.

I envision this as possible via the 1st phase disposition being set to "Remove now" (with or without one last notice).

B> Otherwise as you mentioned, the nomail list gets bigger and
B> bigger.  AFAIK, the only people that should be on the nomail
B> list are those who have signed up as such.

Yes, I want to separate the concepts of disabled-by-choice and disabled-by-bounce. Right now, you've got no idea.

B> So if you want to do the "I'm about to nuke you, one last
B> chance" thing, I would suggest _another_ bit be used in their
B> settings to identify this.

B> Also update list_members so that you can sort by these fields.
B> (Shameless plug - my modified list_members already lets you
B> sort by nomail, and digest type.  Did this patch get merged
B> into 2.08?)

Nope, but it wouldn't anyway. I've consigned 2.0.x to critical (mostly security) fixes only. Otherwise, there's no hope 2.1 will get done. ;)

B> I'm not so sure if doing an estimated average of messages sent
B> is the right thing.

Me neither. It /seems/ reasonable...

B> How about something like this:

B> Each user has three entries: time/date stamp for first bounce,
B> time/date stamp for last bounce, and a bounce counter (16 bits
B> should be fine!)

B> Upon a bounce, do this:

B> 1. Time/date stamp the last_bounce field with the current
B> time/date.  2. Check the first_bounce field to see if it is
B> null.  If so, put the current time/date stamp there too, and
B> set the bounce_counter to 0.  3. Increment the bounce counter.
B> (actually not used in this text.)

B> Do nothing else at this point.

B> Now, we know when a regular message goes out, and we also know
B> when a digest goes out.  Keep a record of the last x (21?)
B> days' worth of postings (see below).

B> Now! Once a day, set up a cron script to go through each user
B> entry.  Do the following:

B> 1. Examine the first_bounce time/date stamp.  If null, skip to
B> the next user.  2. Check out post log to see how many days
B> since first_bounce have had messages.  Is it less than X days?
B>    YES: does last_bounce = the last entry date in the posting
B> log (or today)?  YES: we're still bouncing, but haven't hit our
B> cutoff yet.  Skip to the next user.  NO: NULL out the
B> first_bounce, and go to the next user.  We apparently stopped
B> bouncing, so we need to reset.  NO: we've hit our age limit.
B> Let's see if he's still bouncing: Does last_bounce = the last
B> entry date in the posting log (or today)?  YES: NUKE EM!!!  NO:
B> Null out the first_bounce, and go to the next user.  Apparently
B> this guy really lucked out, and stopped bouncing the day before
B> he would have been nuked (remember, we are doing this every
B> day).

B> Using the above algorithm does have this flaw: if a user
B> bounces one message per day, it will still remove them.  But
B> sporadic problems usually don't surface like that.

Thanks, -Barry

Chuq Von Rospach

11:06 p.m.

New subject: [Mailman-Users] Bounce Options

On 11/30/01 11:18 PM, "Barry A. Warsaw" <barry@zope.com> wrote:

...

It is, but it's lots of work.

How about something simpler:

...

Bob Puff＠NLE

12:56 a.m.

New subject: [Mailman-Users] Bounce Options

Ok, I've thought more about this today, and optimized my code a little.. <g>

Check this out:

min_bounce_days = 5 (max # of days we say it will take for a bounce to come to us)
max_bounce_days = 14 (number of days to allow bouncing)

Once every day (really doesn't matter when), post_counter is incremented ONLY if a message went out that day. This would probably be a 16 bit non-signed number.

There are only three numeric entries (16 bit unsigned as well) needed per user. They are: first_bounce, last_bounce, and bounce_count.

Here's the logic:

Upon receiving a bounce, the user record is called up.

Now for the script that gets run once a day:

Pretty simple code, huh? One thing that needs fixing is the wrapping of the 16 bit numbers, and taking care of the proper subtraction when one number wraps.
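One common way to handle the wrap, shown here as a small Python sketch with made-up helper names, is to do the subtraction modulo 2^16. The difference then comes out right as long as fewer than 65536 increments separate the two values.

```python
MASK16 = 0xFFFF  # keep values in the unsigned 16-bit range


def inc16(counter):
    """Increment a 16-bit counter, wrapping 65535 back to 0."""
    return (counter + 1) & MASK16


def elapsed16(now, then):
    """Increments from `then` to `now`, correct across a single wrap."""
    return (now - then) & MASK16
```

For instance, if post_counter was 65534 at the first bounce and has since wrapped around to 3, `elapsed16(3, 65534)` still gives 5.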

What do you guys think?

The above code is not in any defined programming language.. :-)

Chuq: I do like the idea of the warning messages. Could probably scan for that "bounce-nomail" flag in the same once-a-day script and generate emails at specified days as you said.

Bob

barry＠zope.com

4:40 a.m.

New subject: [Mailman-Users] Bounce Options

This makes some sense. I'll try to put all the ideas together and implement some code. I'll make it as modular as possible so we can swap in different logic if we find the algorithm has flaws.

I'll concentrate on that tomorrow night. Right now, sleep. :) -Barry

John W Baxter

4:58 a.m.

New subject: [Mailman-Users] Bounce Options

At 2:18 -0500 12/1/01, Barry A. Warsaw wrote:

...

I'd prefer that such folks subscribe again if still interested. You've provided that option, so that's not a complaint about the design. Note that the "one last warning" may well bounce in this case.

--John

-- John Baxter jwblist@olympus.net Port Ludlow, WA, USA

Les Niles

December 2001

3:49 p.m.

New subject: [Mailman-Users] Bounce Options

On Sat, 1 Dec 2001 02:18:45 -0500 barry@zope.com (Barry A. Warsaw) wrote:

...

And having the status clearly displayed to the user as "you were disabled due to bounces" should eliminate a lot of queries to the list admin.

...

-les

Dale Newfield

6:08 a.m.

New subject: [Mailman-Users] Bounce Options

On Fri, 30 Nov 2001, Barry A. Warsaw wrote:

...

-Dale

barry＠zope.com

7:27 a.m.

New subject: [Mailman-Users] Bounce Options

...

...
...
...
...
"DN" == Dale Newfield <Dale@Newfield.org> writes:

DN> Your scheme makes sense--I like the idea that subscribers can
DN> wind up "on probation" (assuming the list admin configures the
DN> list that way).  I understand that this simplifying assumption
DN> makes the design much easier to think through.  As a list
DN> admin, though, I'd like to be able to throw a switch that
DN> causes Mailman to completely ignore non-fatal bounces
DN> (Assuming it's possible to distinguish between them, which is
DN> maybe the can of worms you were trying to avoid opening).

-Barry
