Mailman 3 Implementing `bounce_you_are_disabled_warnings_interval` in `Send_Warnings` function. - Mailman-Developers

Implementing `bounce_you_are_disabled_warnings_interval` in `Send_Warnings` function.

Aaryan Bhagat

26 Jul 2019

11:52 a.m.

I am developing the processing of bounce messages as my GSoC project.
Currently the Bounce Functions are being developed in this pr on GitLab.
There is one bool attribute called bounce_you_are_disabled_warnings_interval in the Mailing List model.
It means

...

The number of days between each disabled notification.

Implementing this has a problem

The process_bounces function, which processes the BounceEvents, is the easy case: it just takes the events from the database one by one and processes them.
If Send_Warnings followed the same approach, taking the Address instances one by one, checking the tuples in the bounce_info attribute ( see the pr for this ), and deciding whether or not to send a warning mail, it could be slow.
Suppose the Addresses list is very long, and an Address instance at the very bottom of the list, whose subscription has been disabled and which has already been sent some warning emails, is now due another one because more than bounce_you_are_disabled_warnings_interval has passed.
If the function enumerates from the top, it has to reach this Address instance before it can act, but by then the interval has already been crossed. This can cause slow performance, as the warning mail will be sent much later than it should have been.
Also, if I implement two functions, process_bounces and send_warnings, what if both of them try to work on the same Address instance?
Basically, sending warning mails and increasing the bounce_score are separate pieces of work, and they can cause problems if they process the same instance.

Pointers on the above would be helpful. Am I missing something?

Mark Sapiro

26 Jul

4:59 p.m.

On 7/26/19 2:52 AM, Aaryan Bhagat wrote:

...

I am developing the processing of bounce messages as my GSoC project.
Currently the Bounce Functions are being developed in this pr on GitLab.
There is one bool attribute called bounce_you_are_disabled_warnings_interval in the Mailing List model.

It's actually defined as Column(Interval) in your MR, which is what it should be.

...

It means

...
The number of days between each disabled notification.

Implementing this has a problem

The process_bounces function, which processes the BounceEvents, is the easy case: it just takes the events from the database one by one and processes them.

If Send_Warnings followed the same approach, taking the Address instances one by one, checking the tuples in the bounce_info attribute ( see the pr for this ), and deciding whether or not to send a warning mail, it could be slow.

Suppose the Addresses list is very long, and an Address instance at the very bottom of the list, whose subscription has been disabled and which has already been sent some warning emails, is now due another one because more than bounce_you_are_disabled_warnings_interval has passed.

If the function enumerates from the top, it has to reach this Address instance before it can act, but by then the interval has already been crossed. This can cause slow performance, as the warning mail will be sent much later than it should have been.

Also, if I implement two functions, process_bounces and send_warnings, what if both of them try to work on the same Address instance?

Basically, sending warning mails and increasing the bounce_score are separate pieces of work, and they can cause problems if they process the same instance.

Pointers on the above would be helpful. Am I missing something?

You may consider doing this as it's done in MM 2.1. There, once a list's delivery is disabled by bounce for an address and the first notice sent, Mailman's processing has nothing further to do with it. There is a daily cron which is responsible for sending notices and ultimately removing disabled users.

A partially migrated version of this is at https://gitlab.com/mailman/mailman/blob/master/port_me/disabled.py

You might consider implementing this as a 'mailman' subcommand to be run daily by cron to do this task.

--
Mark Sapiro mark@msapiro.net          The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

Stephen J. Turnbull

29 Jul

11:28 a.m.

Aaryan Bhagat writes:

...

If Send_Warnings followed the same approach, taking the Address instances one by one, checking the tuples in the bounce_info attribute ( see the pr for this ), and deciding whether or not to send a warning mail, it could be slow.

Suppose the Addresses list is very long, and an Address instance at the very bottom of the list, whose subscription has been disabled and which has already been sent some warning emails, is now due another one because more than bounce_you_are_disabled_warnings_interval has passed.

If the function enumerates from the top, it has to reach this Address instance before it can act, but by then the interval has already been crossed. This can cause slow performance, as the warning mail will be sent much later than it should have been.

I assume that what you mean here is that this process is executed perhaps once per day and probably takes a few minutes at most in processing. Then we have 300s/86400s = 1/288 of a day = probability that during the interval that address's timer expires, and the message is delayed until the process triggers again, hypothetically a full day.
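The arithmetic above can be checked with a short sketch. It assumes a daily run lasting five minutes (300 s) and a timer that expires at a uniformly random moment of the day; the function name is illustrative only and is not part of Mailman.

```python
from fractions import Fraction

SECONDS_PER_DAY = 86_400


def miss_probability(run_seconds, period_seconds=SECONDS_PER_DAY):
    """Chance that a timer expiring at a uniformly random moment of the
    period falls inside the window in which the job is running."""
    return Fraction(run_seconds, period_seconds)


p = miss_probability(300)
print(p)         # 1/288
print(float(p))  # roughly 0.0035
```

A longer run (a much larger list) raises the fraction proportionally, which is why the estimate depends on list size.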

The odds that this happens are quite small unless you have a humongous list. Furthermore, the probability that the mail will go through this time if the account has been disabled due to bouncing is presumably way less than 1. So my first-order response is this is a non-problem.

Note that you have basically the same problem if the Address's timer expires a few seconds after the send_warnings process finishes.

My second-order response is if you still care given that this is rarely going to happen, and even more rarely will it generate a message that reaches the user who reactivates the subscription, you can arrange to have an index on the Addresses (or a separate queue) which has the earliest timer first, and then process the Addresses in that order. You'll still have the same problem if the timer expires a few picoseconds after you start processing the particular address. :-) Basically the way to deal with this is to adjust the bounce_you_are_disabled_warnings_interval.
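A minimal sketch of the "earliest timer first" ordering, using a heap as the separate queue. The mapping from address to next-due time is a hypothetical stand-in for whatever the real model stores; in a database the same effect would come from an index on the due time.

```python
import heapq
from datetime import datetime, timedelta


def due_in_order(next_due, now):
    """Yield (due_time, address) pairs that are due at `now`, earliest first.

    `next_due` maps an address string to the time its next warning
    becomes due (a hypothetical structure, not Mailman's).
    """
    heap = [(due, addr) for addr, due in next_due.items()]
    heapq.heapify(heap)
    while heap and heap[0][0] <= now:
        yield heapq.heappop(heap)


now = datetime(2019, 7, 29, 12, 0)
next_due = {
    "a@example.com": now - timedelta(hours=1),
    "b@example.com": now + timedelta(days=2),   # not due yet
    "c@example.com": now - timedelta(days=3),   # most overdue
}
ordered = [addr for _, addr in due_in_order(next_due, now)]
print(ordered)  # ['c@example.com', 'a@example.com']
```

The most overdue address is handled first regardless of where it sits in the roster, and the scan stops at the first entry that is not yet due.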

If you do implement the queue, another possibility comes up, which is that you process the queue continuously, with a process (a runner, I guess) that sleeps until it's time to process the next send_warning. This would have the advantage that if there were another "DMARC event" that caused a large number of bounces, you wouldn't have a flood of warning messages at a particular time of day.
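The continuously processed queue could look roughly like the loop below. This is a sketch under stated assumptions, not Mailman's runner API: `send`, `clock` and `sleep` are injected callables so the loop can be exercised without real delays, and the queue is a plain heap of (due timestamp, address) pairs.

```python
import heapq
import time


def run_warning_queue(queue, send, clock=time.time, sleep=time.sleep):
    """Drain a heap of (due_timestamp, address), sleeping until each is due."""
    while queue:
        due, address = queue[0]
        delay = due - clock()
        if delay > 0:
            sleep(delay)      # nothing due yet: wait for the earliest entry
            continue
        heapq.heappop(queue)
        send(address)


# Demonstration with a fake clock, so no real time passes.
now = [0.0]
sent = []


def fake_sleep(seconds):
    now[0] += seconds


queue = [(30.0, "b@example.com"), (10.0, "a@example.com")]
heapq.heapify(queue)
run_warning_queue(
    queue,
    send=lambda addr: sent.append((now[0], addr)),
    clock=lambda: now[0],
    sleep=fake_sleep,
)
print(sent)  # [(10.0, 'a@example.com'), (30.0, 'b@example.com')]
```

Because each warning goes out at its own due time, a burst of bounces produces warnings spread over the same span of time rather than all at the hour the cron job fires.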

...

Also if I am implementing 2 functions process_bounces and send_warnings what if both of them attempted at the same Address instance?

What about it? This is even more unlikely, and I assume that process_bounces will have a read-write lock that prevents reading (and writing) while it's writing, while send_warnings will have a read lock that prevents writing while it's reading. I think all of the databases we support have such locks.
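To illustrate why the two functions need not conflict, here is a small sketch using SQLite from the standard library. The table and column names are hypothetical (Mailman's real models are SQLAlchemy classes), and the two functions are stand-ins for process_bounces and send_warnings. Each one updates its own column in place inside its own transaction, so neither overwrites the other's value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member (address TEXT PRIMARY KEY, "
    "bounce_score REAL NOT NULL, warnings_sent INTEGER NOT NULL)"
)
conn.execute("INSERT INTO member VALUES ('a@example.com', 4.0, 1)")
conn.commit()


def record_bounce(conn, address):
    # Stand-in for process_bounces: touches only bounce_score.
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE member SET bounce_score = bounce_score + 1 "
            "WHERE address = ?",
            (address,),
        )


def record_warning(conn, address):
    # Stand-in for send_warnings: touches only warnings_sent.
    with conn:
        conn.execute(
            "UPDATE member SET warnings_sent = warnings_sent + 1 "
            "WHERE address = ?",
            (address,),
        )


record_bounce(conn, "a@example.com")
record_warning(conn, "a@example.com")
row = conn.execute(
    "SELECT bounce_score, warnings_sent FROM member"
).fetchone()
print(row)  # (5.0, 2)
```

The increment happens inside the UPDATE statement itself, so there is no separate read-then-write step in application code for another writer to interleave with.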

Aaryan Bhagat

9:03 p.m.

Stephen writes:

...

The odds that this happens are quite small unless you have a humongous list. Furthermore, the probability that the mail will go through this time if the account has been disabled due to bouncing is presumably way less than 1. So my first-order response is this is a non-problem.

Note that you have basically the same problem if the Address's timer expires a few seconds after the send_warnings process finishes.

I understand your point.

Stephen writes:

...

My second-order response is if you still care given that this is rarely going to happen, and even more rarely will it generate a message that reaches the user who reactivates the subscription, you can arrange to have an index on the Addresses (or a separate queue) which has the earliest timer first, and then process the Addresses in that order. You'll still have the same problem if the timer expires a few picoseconds after you start processing the particular address. :-) Basically the way to deal with this is to adjust the bounce_you_are_disabled_warnings_interval.

This looks fine for now, but I do not know exactly how it would be implemented or how long it would take. Creating a new runner is also very resource-consuming, so I think I will stick to the straightforward implementation for now. I basically did not know what kinds of organizations use Mailman ( especially whether they have very large mailing lists or not ), so I presumed this problem exists. I suggest we look at feedback from customers regarding problems arising in this area and, depending on that, choose whether or not to modify this. If we choose to modify it, maybe we can integrate it into one of next year's GSoC projects.

Stephen writes:

...

What about it? This is even more unlikely, and I assume that process_bounces will have a read-write lock that prevents reading (and writing) while it's writing, while send_warnings will have a read lock that prevents writing while it's reading. I think all of the databases we support have such locks.

I do not understand this part, send_warnings has to increment the bounce_you_are_disabled_warnings of the Address and the process_bounce will have to set the processed attribute of BounceEvent after completely processing that entry to True and save it again. Both of them have to write after reading a single entry so I do understand the reason of the lock.

Stephen J. Turnbull

2 Aug

5:44 a.m.

Aaryan Bhagat writes:

...

Stephen writes:

...
My second-order response is if you still care [...], you can arrange to have an index on the Addresses (or a separate queue) which has the earliest timer first, and then process the Addresses in that order.

This looks fine for now, but I do not know exactly how it would be implemented or how long it would take. Creating a new runner is also very resource-consuming, so I think I will stick to the straightforward implementation for now.

That is what I would recommend, too. For future reference:

...

I basically did not know what kinds of organizations use Mailman (especially whether they have very large mailing lists or not), so I presumed this problem exists.

For Mailman 2, there are a number of organizations with lists with 100,000 members, and I think there were one or two claiming a million or more. There are also organizations with tens of thousands of lists, although I don't know what the average number of members is. I don't know of any 100,000+ scale lists using Mailman 3, but I know that there are organizations with large numbers of lists that are converting to Mailman 3.

For the particular issue regarding how long it takes to process disabled warnings for one large list, you can get a lower bound on the time by creating a few large lists (10,000, 20,000, 50,000, and 100,000 subscribers seem like a good selection) and see how long it takes to go through the whole list when *none* are disabled, and do the same with small lists (100, 200, 500, 1000) where *all* are disabled, forwarding the warnings to a local /dev/null address, presuming that as far as the simultaneity problem goes Mailman doesn't care how long the MTA takes to get the messages out, it only cares about how fast the MTA can absorb and queue them. I imagine an MTA with a full queue might be somewhat slower than one with an empty queue, but it shouldn't be very much.
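A timing harness along these lines could be sketched as follows. It only measures an in-memory scan: `scan_subscribers` is a placeholder for the real send_warnings pass, and the subscriber lists are plain booleans rather than Mailman members, so the numbers it prints are a skeleton for the experiment and not real measurements.

```python
import time


def scan_subscribers(subscribers):
    """Count disabled entries; a placeholder for the real warnings pass."""
    return sum(1 for disabled in subscribers if disabled)


def time_scan(size, all_disabled):
    """Build a list of `size` subscribers and time one pass over it."""
    subscribers = [all_disabled] * size
    start = time.perf_counter()
    found = scan_subscribers(subscribers)
    return found, time.perf_counter() - start


# Large lists where none are disabled, then small lists where all are.
for size in (10_000, 20_000, 50_000, 100_000):
    found, elapsed = time_scan(size, all_disabled=False)
    print(f"{size:>7} subscribers, {found} disabled: {elapsed:.4f}s")
for size in (100, 200, 500, 1000):
    found, elapsed = time_scan(size, all_disabled=True)
    print(f"{size:>7} subscribers, {found} disabled: {elapsed:.4f}s")
```

Swapping the placeholder for a pass over real test lists, with warnings delivered to a local discard address, would give the lower bound described above.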

I'm not suggesting that you implement this this summer, although if you have extra time for additional work, it's one idea.

...

I suggest we look at feedback from customers regarding problems arising in this area

Unfortunately, this kind of feedback is likely to be very rare because neither the subscriber nor the admin is likely to notice that a "you are disabled" warning has been delayed by a day or even a couple of weeks. A very few will, but remember this situation is unlikely to manifest in the first place.

...

Stephen writes:

...
What about it? [...] I think all of the databases we support have [the needed] locks.

I do not understand this part, send_warnings has to increment the bounce_you_are_disabled_warnings of the Address and the process_bounce will have to set the processed attribute of BounceEvent after completely processing that entry to True and save it again. Both of them have to write after reading a single entry so I do understand the reason of the lock.

I don't understand why there's any problem here. In what scenario is the database corrupted relative to what is desired? What variables are set differently from desired, and how? What are the user consequences of the incorrect database?

Steve

Aaryan Bhagat

5:59 a.m.

Thanks for the reply, Stephen!

Stephen writes:

...

I don't understand why there's any problem here. In what scenario is the database corrupted relative to what is desired? What variables are set differently from desired, and how? What are the user consequences of the incorrect database?

What I mean is if we take the example of BounceEvents column which has all the bounce messages stored, after processing each message I have to set the processed attribute of that message as true, so that will require modification of the database.

Also in the case where I will be sending_warnings, there is warning_count and warning_limit ( not the exact names of the attributes ) of each Address instance with respect to each Mailing List so I have to increase the warning_count counter and save again in the database.

This is the pr which contains my modifications of the models which you can relate to the above context I wrote.

Aaryan Bhagat

8:50 p.m.

It has taken me a couple of days to work out the right implementation.
My plan is a new command that takes all the Address instances and, for each instance, checks which tuples of bounce_info have the disabled attribute set to true ( more about these attributes in this pr ). If it sees that the disabled attribute of a tuple is true, it will roughly:

Send a warning
Increase the warning counter
Check with the threshold
Update the last_warning_sent timestamp value

The newly updated tuple ( with the updated counter and timestamp ) will be stored again. I will add it to a cron job. Is the above implementation approach correct? Am I missing

Stephen J. Turnbull

9 Aug

8:16 a.m.

Sorry, this got hung up in my drafts folder. The basic coding probably is solved, I think, but I'm worried we're not communicating about what the design problems are. So here goes.

Aaryan Bhagat writes:

...

Thanks for the reply, Stephen!

Stephen writes:

...
I don't understand why there's any problem here. In what scenario is the database corrupted relative to what is desired? What variables are set differently from desired, and how? What are the user consequences of the incorrect database?

What I mean is if we take the example of BounceEvents column which has all the bounce messages stored, after processing each message I have to set the processed attribute of that message as true, so that will require modification of the database.

Also in the case where I will be sending_warnings, there is warning_count and warning_limit ( not the exact names of the attributes ) of each Address instance with respect to each Mailing List so I have to increase the warning_count counter and save again in the database.

I understand the details of the algorithm, that there are different attributes of certain objects that need to be modified.

I don't understand why you believe there are conditions under which Mailman will do something undesirable, such as create a very long delay from the time "something" (such as mailing a disabled warning) *should* happen until it *does* happen.

Stephen J. Turnbull

8:22 a.m.

Another one that got hung up in my drafts folder. Same as the other, I think the basic coding probably is solved by now, but I'm worried we're not communicating about what the design problems are.

Aaryan Bhagat writes:

...

My plan is a new command that takes all the Address instances and, for each instance, checks which tuples of bounce_info have the disabled attribute set to true. If it sees that the disabled attribute of a tuple is true, it will roughly:

...

Send a warning

Shouldn't you first check if the disabled_warning_interval has been exceeded, and if it has, send a warning? Or is that what you mean by "check with the threshold" below?

...

Increase warning counter

Incrementing the warning counter should always happen if the mail is successfully sent.

...

Check with the threshold

What "threshold"? The disabled_warning_interval?

...

Updates last_warning_sent timestamp value

Updating this timestamp should always happen if the mail is successfully sent.

I don't know offhand how any of the above operations could fail, and the one most likely to fail is sending mail, which is handled by the virgin, outgoing, and retry queues, not by the logic you're working on here. Still, perhaps some of the operations should be conditional on success. Also, you need to think about what happens if the warning fails for some address. Probably you just treat that as equivalent to a bounce in most cases, but what happens if it's a "no such address"? Is that already handled?

Note that adding to the virgin queue probably can't fail (except in really bad circumstances such as out of memory), but simply adding the message to the retry queue may not be appropriate (I forget how often that queue gets flushed).

If that doesn't make sense given what you know about how the queues work, then don't worry about it -- you probably know more about the queues than I do at this point.

...

The newly updated tuple ( with the updated counter and timestamp ) will be stored again.

...

I will add it to a cron job.

Cron job is what Mark suggested so that seems fine.
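The ordering discussed in this message (check the interval first, send, then update the counter and timestamp only when the send succeeded) can be sketched as below. Everything here is hypothetical scaffolding: `DisabledMember`, its field names and the `send` callable are illustrations, not the attributes in the merge request or Mailman's queue API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class DisabledMember:
    address: str
    warnings_sent: int = 0
    last_warning_sent: Optional[datetime] = None


def send_warnings(members: List[DisabledMember], interval: timedelta,
                  max_warnings: int, now: datetime,
                  send: Callable[[str], bool]) -> List[str]:
    """One pass over already-disabled members; returns the addresses warned."""
    warned = []
    for m in members:
        if m.warnings_sent >= max_warnings:
            continue  # warnings exhausted; what happens next is decided elsewhere
        if (m.last_warning_sent is not None
                and now - m.last_warning_sent < interval):
            continue  # interval not yet elapsed
        if send(m.address):  # update state only after a successful hand-off
            m.warnings_sent += 1
            m.last_warning_sent = now
            warned.append(m.address)
    return warned


now = datetime(2019, 8, 9, 8, 0)
members = [
    DisabledMember("new@example.com"),
    DisabledMember("recent@example.com", 1, now - timedelta(days=2)),
    DisabledMember("due@example.com", 1, now - timedelta(days=8)),
    DisabledMember("done@example.com", 3, now - timedelta(days=30)),
]
warned = send_warnings(members, timedelta(days=7), 3, now,
                       send=lambda addr: True)
print(warned)  # ['new@example.com', 'due@example.com']
```

Run daily from cron, a pass like this is idempotent within a day: a second run at the same time warns nobody, because every timestamp it touched is now inside the interval.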

Steve

Aaryan Bhagat

12 Aug

10:54 a.m.

Stephen writes:

...

I don't understand why you believe there are conditions under which Mailman will do something undesirable, such as create a very long delay from the time "something" (such as mailing a disabled warning) should happen until it does happen.

I am sorry but I do not understand your concern. I just mentioned the cases where we actually have to make changes to the database after each processing of BounceEvents. I do not expect Mailman to show any undesirable behaviour here.

Aaryan Bhagat

12:08 p.m.

Stephen writes:

...

What "threshold"? The disabled_warning_interval?

Yes, exactly. I apologise for being informal.

Stephen writes:

...

Incrementing the warning counter should always happen if the mail is successfully sent.

Yes, I meant that only, I apologise for not explaining correctly.

Stephen writes:

...

I don't know offhand how any of the above operations could fail, and the one most likely to fail is sending mail, which is handled by the virgin, outgoing, and retry queues, not by the logic you're working on here. Still, perhaps some of the operations should be conditional on success.

I also realize that sending mail is the critical link in this whole flow and that most of the operations will be conditional; I just showed a rough outline of the process. I am currently working out which operations should depend on the success or failure of the previous one.

...

Also, you need to think about what happens if the warning fails for some address. Probably you just treat that as equivalent to a bounce in most cases, but what happens if it's a "no such address"? Is that already handled?

Yes, that is correct. As of now, if bounces are received for warning messages, they will be treated the same way as bounces for other emails.
The no-such-address problem you mentioned had occurred to me earlier, but I had not thought it through. Let me explain the problem clearly.

1. Imagine some email address goes bad or is nullified for some reason.
2. Bounces will arrive, and after some time the address will be disabled and warnings will be sent.
3. The warning emails will also be recorded as bounces, and once the number of warnings reaches bounce_you_are_disabled_warnings, Mailman will send no more warnings.
4. But in the end no action has been taken: the address will still be on the roster and will again be picked up by the process responsible for sending warnings.

Is a specific action required for this type of behaviour?
I mean, when the above (1-4) are satisfied, I can remove the address or somehow disable it totally from the roster, so that even the process responsible for sending warnings will not recognize it.

What other approaches could be implemented for this type of behaviour? Some pointers on this would be helpful.

Stephen J. Turnbull

13 Aug

5:52 p.m.

Aaryan Bhagat writes:

...

I am sorry but I do not understand your concern. I just mentioned the cases where we actually have to make changes to the database after each processing of BounceEvents. I do not expect Mailman to show any undesirable behaviour here.

You wrote some posts that indicated you were specifically worried about the timing of the warning mails. When the conversation continued without much explanation of your concerns, I assumed that you were still thinking about the concurrency issues that arise in an asynchronous system such as email.

Stephen J. Turnbull

5:52 p.m.

Aaryan Bhagat writes:

...

Stephen writes:

...

...
Also, you need to think about what happens if the warning fails for some address. Probably you just treat that as equivalent to a bounce in most cases, but what happens if it's a "no such address"? Is that already handled?

The no-such-address problem you mentioned had occurred to me earlier, but I had not thought it through. Let me explain the problem clearly.

[snipped]

...

What other approaches could be implemented for this type of behaviour? Some pointers on this would be helpful.

If we receive a bounce that indicates that the address does not exist, we should stop sending warnings there. Somebody has screwed up pretty badly, and it wasn't us.

In most cases, this deletion will be valid and intended by the user (or well-deserved for abusing the account). In those cases, the address should be unsubscribed. Questions:

Do we unsubscribe on the theory that it's almost always correct to do so, or do we keep it around and just disable without further warnings in case of resurrection and to associate mail "From" that address with the user? (This has issues in case the address is resurrected but assigned to a different person.)
If the user has other addresses, should Mailman warn them about this situation? (This question also applies to ordinary disabled warnings that bounce.)
If the user wants to migrate the subscription to a working address, can we make this simpler? (Not in scope of your GSoC, mentioned because I thought of it and want it in public.)
Should the address be deleted? It's possible that the user will use this address in From. Technically this is probably invalid (the user doesn't have the right to use that address at all if they can't receive mail there). But I don't see a good reason not to associate such mail with their Mailman user.
Should the user be allowed to post "From" that address? Probably not.

Steve

--
Associate Professor              Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/     Faculty of Systems and Information
Email: turnbull@sk.tsukuba.ac.jp                   University of Tsukuba
Tel: 029-853-5175                 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN

Aaryan Bhagat

15 Aug

3:30 a.m.

Stephen writes:

...

You wrote some posts that indicated you were specifically worried about the timing of the warning mails. When the conversation continued without much explanation of your concerns, I assumed that you were still thinking about the concurrency issues that arise in an asynchronous system such as email.

Well, since you pointed out in your replies that this is very unlikely to happen, I set this problem aside and started asking more about the implementation. I apologize for not mentioning the change of topic.

Aaryan Bhagat

3:59 a.m.

Stephen writes:

...

If we receive a bounce that indicates that the address does not exist, we should stop sending warnings there. Somebody has screwed up pretty badly, and it wasn't us.

That is the main hurdle: bounce extraction cannot determine the exact cause, so we can never say for sure what the reason is.

Stephen writes:

...

In most cases, this deletion will be valid and intended by the user (or well-deserved for abusing the account). In those cases, the address should be unsubscribed.

Yes, this is the expected good behaviour which is most suitable in cases like these.

Stephen writes:

...

Do we unsubscribe on the theory that it's almost always correct to do so, or do we keep it around and just disable without further warnings in case of resurrection and to associate mail "From" that address with the user? (This has issues in case the address is resurrected but assigned to a different person.)

As mentioned above, since we cannot know the reason, we should not rush into actions like unsubscribing from the mlist or deleting the address. I think we should let the normal implementation run its course and then take action. I mean:

First, bounce_score_threshold will be crossed.
Then a probe will be sent; if this probe bounces, the DeliveryStatus of that member is set to disabled.
Then warning emails will be sent until the count reaches bounce_you_are_disabled_warnings.
When both of them are exhausted, we can unsubscribe the user.
I can make a separate template explaining the reason for this type of unsubscription.
I think the Address instances or the Member instances should not be deleted, because if the person subscribes again with the same email the data will be restored; but, as you pointed out, if the email is different then nothing of that sort will happen and space will be wasted. I am not sure about this, so Mark, Abhilash and Stephen, please give your opinion. If nothing clearer comes up, by default I will not delete the data.
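The escalation sequence listed above can be sketched as a small state machine. The stage names, the keyword parameters and the default values for the threshold and warning limit are assumptions made for illustration; the sketch also deliberately says nothing about what happens when a probe does not bounce.

```python
from enum import Enum, auto


class Stage(Enum):
    ENABLED = auto()
    PROBING = auto()
    DISABLED = auto()
    UNSUBSCRIBED = auto()


def next_stage(stage, *, score=0.0, threshold=5.0, probe_bounced=False,
               warnings_sent=0, max_warnings=3):
    """Return the stage that follows `stage` given the current facts."""
    if stage is Stage.ENABLED and score >= threshold:
        return Stage.PROBING          # threshold crossed: send a probe
    if stage is Stage.PROBING and probe_bounced:
        return Stage.DISABLED         # probe bounced: disable delivery
    if stage is Stage.DISABLED and warnings_sent >= max_warnings:
        return Stage.UNSUBSCRIBED     # warnings exhausted: unsubscribe
    return stage


stage = Stage.ENABLED
stage = next_stage(stage, score=5.0)            # PROBING
stage = next_stage(stage, probe_bounced=True)   # DISABLED
stage = next_stage(stage, warnings_sent=2)      # still DISABLED
stage = next_stage(stage, warnings_sent=3)      # UNSUBSCRIBED
print(stage.name)  # UNSUBSCRIBED
```

Nothing in this flow deletes the Address or Member record; unsubscription is the last step, which matches the default of keeping the data.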

Stephen writes:

...

If the user has other addresses, should Mailman warn them about this situation?

This seems to overcomplicate things; keeping the address instances separate is simple and no problems will arise. It can be implemented, but I am not sure about its priority as of now. By default I will not implement this.

Stephen writes:

...

Should the user be allowed to post "From" that address? Probably not.

In the case where only DeliveryStatus is disabled, I think the user can post. But once the conditions I listed in the points above are met, we will unsubscribe that user and then there won't be any problem.

Stephen writes:

...

Should the address be deleted? It's possible that the user will use this address in From. Technically this is probably invalid (the user doesn't have the right to use that address at all if they can't receive mail there). But I don't see a good reason not to associate such mail with their Mailman user.

As explained above, I am not sure whether or not the address should be deleted, but if the user is unsubscribed then there won't be any problem if the user uses the address in From, as the mailing list will not recognize it.

If something is not explained well enough, please point it out. Am I missing something here?

Stephen J. Turnbull

20 Aug

6:54 p.m.

Sorry for the delay, family issues and $WORK stuff.

Aaryan Bhagat writes:

...

Am I missing something here?

No, I don't think so. *We* (Mailman) may be able to improve something here, but it's not in scope of your GSoC work as far as I can see. Your opinions (and if you choose, post-GSoC work) on these questions will be appreciated since it's in the same area you're getting a lot of experience in.

Steve
