
Hello,
I have posted here before about mail issues that may not be strictly a Mailman problem. However, this time there is something Mailman-related that I need to figure out.
A week or so ago, a remote mail server that all of our list mail goes through used a spam vendor that had slapped a signature on our server address, so mail was being held there. i think that was resolved.
After a week or so of mail successfully working, it has tailed off again. The remote mail server administrator can't identify that there is a spam problem, though Yahoo and Verizon have apparently temporaily blocked mail from our server due to the traffic.
In tests, I send various mail (list traffic, subscription requests, mail directly to the listname-owner address, etc) and it reaches the server, but goes nowhere from there.
At 12 noon, messages that had been sent but not delivered earlier today, suddenly were all sent. This sure seems like a cron (not a con) job, but I haven't a clue why. As far as I know the only Mailman cron job that runs at 12 noon is the digest generation.
I am really puzzled here. This may be a Postfix/MTA issue, maybe a local server issue, but I am mainly curious about the batch send that occurred.
Here is an entry from maillog. It shows a message being delivered to the alias, then removed from the queue. The message never gets to the person behind the alias.
May 9 12:57:49 swiki postfix/smtpd[26679]: E774019985DC: client= nm2-vm0.bullet.mail.bf1.yahoo.com[98.139.213.127] May 9 12:57:50 swiki postfix/cleanup[26968]: E774019985DC: message-id=< 1368129468.24511.YahooMailNeo@web164901.mail.bf1.yahoo.com> May 9 12:57:50 swiki postfix/qmgr[23217]: E774019985DC: from=<me@yahoo.com>, size=3404, nrcpt=1 (queue active) May 9 12:57:50 swiki postfix/local[27542]: E774019985DC: to=< test7-owner@mydomain.com>, relay=local, delay=0.9, delays=0.74/0/0/0.16, dsn=2.0.0, status=sent (delivered to command: /usr/local/mailman/mail/mailman owner test7) May 9 12:57:50 swiki postfix/qmgr[23217]: E774019985DC: removed
-- Christopher Adams adamsca@gmail.com

On 05/09/2013 01:30 PM, Christopher Adams wrote:
[...]
Does Mailman deliver directly to the remote server or via Postfix? If directly, what does Mailman's smtp log say about the posts that were delayed. I.e. a post to a list will have an entry like
May 09 12:53:47 2013 (6114) <009b01ce4cee$e4c00540$ae400fc0$@xxx@example.com> smtp to LISTNAME for 237 recips, completed in 4.337 seconds
which says the the message with message-id <009b01ce4cee$e4c00540$ae400fc0$@xxx@example.com> was delivered from the LISTNAME list to (in this case) 237 reqular subscribers and SMTP to the MTSA was completed at 12:53:47 (and started 4.337 seconds earlier).
I suspect they were delivered to the remote server well before and not all at once at noon.
If Mailman delivers via Postfix, what does the maillog say about those deliveries?
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Below is a log of a message sent from me. The remote server handles mail from my mail server. It is sent to the test7 mailing list address. From there, Mailman/Postfix handles the dissemination of mail. I can't figure out where it is getting hung up. Sometimes delivery takes 2-8 hours from the time it was posted.
May 13 08:02:09 swiki postfix/smtpd[15145]: 692E8199849A: client=remote.mail.server[xxx.xxx.xxx.xx] May 13 08:02:09 swiki postfix/cleanup[15168]: 692E8199849A: message-id=<0B03FE68E0E760478D413F045B39A028380320D3@OSLEXCHANGE.my.mail.server.local
May 13 08:02:09 swiki postfix/qmgr[17481]: 692E8199849A: from=<prvs=1845613428=my.address@my.mail.server>, size=4112, nrcpt=1 (queue active) May 13 08:02:09 swiki postfix/local[15169]: 692E8199849A: to=< test7@mailmanserver.com>, relay=local, delay=0.13, delays=0.02/0/0/0.1, dsn=2.0.0, status=sent (delivered to command: /usr/local/mailman/mail/mailman post test7) May 13 08:02:09 swiki postfix/qmgr[17481]: 692E8199849A: removed
On Thu, May 9, 2013 at 3:23 PM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

On 05/13/2013 09:32 AM, Christopher Adams wrote:
Does Mailman send to your Postfix to relay to the remote server. If so, what are the Postfix log entries for the mail from Mailman through Postfix to the remote server?
In any case, what is Mailman's smtp log entry for the post (the one for the message-id of the post)?
The time stamp of the smtp log entry is the time that Mailman completed smtp delivery of the post to Postfix or the remote server as the case may be.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Postfix sends the list message it to the remote server, which shows that it was accepted by the Mailman server. The Mailman server has logs that show that the message was directed to the command 'post' to list 'test7'. However, nothing happens after that for 2-7 hours. Once it actually sends to the list, there are smpt.log entries that show it going to 'x recipients'
The problem is not non-delivery but extremely slow delivery, anywhere from 2-8 hrs,even for a small list (2 subscribers) like the test7 list. The Mailman server has over 600 lists and 250,000 subscribers. Some of the lists have 40,000 subscribers. Additionally, there is quite a bit of spam being sent to the list owner addresses and other attempts at posts by spammers, many which are flagged and blocked. That is just to point out that there is a lot going on with the Mailman server.
What would be some reasons for such long delays and why does it seem that they seem to be "released" all at the same time. If they have been handed off to Mailman,where are they held during this long delay? Are there some optimizations that could be made in mm_cfg.py and/or with Postfix? I saw some reference to some work that had already been done with this, but it appears to be from 2008.
Thank for any suggestions that you might have.
On Mon, May 13, 2013 at 3:57 PM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

On 05/13/2013 08:39 PM, Christopher Adams wrote:
Postfix sends the list message it to the remote server, which shows that it was accepted by the Mailman server.
We are not communicating here. The above makes no sense to me. I think the sequence is:
- List mail arrives at the local Postfix.
- Postfix delivers to the local Mailman.
- Mailman delivers - here I don't know if to the local Postfix for relay to the remote server or directly to the remote server. The mm_cfg.py setting for SMTPHOST determines this. The Default is localhost meaning relay via the local Postfix.
OK, so here you are saying that the delay is definitely in Mailman. I.e. Postfix delivers to the mailman mail wrapper which presumably immediately queues the message in Mailman's in/ queue, and it's a long time later that Mailman logs the delivery.
What is the processing time in the smtp log entry. Is it a few seconds or less or hours?
I'm guessing that during this time IncomingRunner or OutgoingRunner or both are asleep. Do you have a setting in mm_cfg.py for QRUNNER_SLEEP_TIME? The default in Defaults.py is (or should be)
QRUNNER_SLEEP_TIME = seconds(1)
If you have set this to a few hours, it would explain the behavior you are seeing. If not, when does the post get archived? Is that delayed too?
Also, messages in the in/ queue are processed by IncomingRunner and then queued in the archive/ queue for ArchRunner and the out/ queue for OutgoingRunner. You could look and see if messages move from the in/ queue quickly or not.
Note that if you have set QRUNNER_SLEEP_TIME to say 4 hours, messages can sit in the in/ queue for an average of 2 and up to 4 hours before moving to the out/ queue where they may wait up to an additional 4 hours before being sent.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Thank you for your reply.
QRUNNER_SLEEP_TIME = seconds(1) in Defaults.py
Mesages are being archived almost immediately. There is nothing in the in/ queue.
There currently are about 954 files in the out/ queue and seemingly always growing. Is that significant?
A message sent yesterday at 12:30 was received by Mailman within seconds. It delivered at 3:30. The smtp.log for that message shows:
May 13 15:32:36 2013 (26531) <0B03FE68E0E760478D413F045B39A028380323EA@OSLEXCHANGE.my.mail.server.local> smtp to test7 for 2 recips, completed in 0.020 seconds
On Mon, May 13, 2013 at 9:03 PM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

Mark, I was reading some previous posts about Mailman and qfiles. Late yesterday, I found a couple of *.bak files in the out/ queue, so I moved all the *.bak and *.pck files to another directory. I restarted Mailman. Mail was not still not being delivered in a timely way.
Then, this morning, I read another post about dns lookups in Postfix main.cf. A poster said that he modifed his smtpd_recipient_restrictions and put permit_mynetworks at the top. So, I followed this and restarted Postfix.
I immediately saw mail flowing and the out/q queue emptied and mail began to come in. I'm not sure exactly why it had that affect.
So, I have a final question. Should I move the .pck and .bak (renamed to .pck) files back to the out/ queue. If so, should I stop mailman before doing this.
BTW - The posts that I read above involved you. so thanks for the "indirect" assistance.
On Tue, May 14, 2013 at 8:26 AM, Christopher Adams <adamsca@gmail.com>wrote:
-- Christopher Adams adamsca@gmail.com

On 05/14/2013 09:28 AM, Christopher Adams wrote:
So, Mailman is delivering via the local Postfix and not directly to the remote server.
Postfix was doing DNS lookups on the sending domain for all messages which slowed it way down. You avoided this by the change you made.
You can move the .pck files back without stopping/starting Mailman.
If you move the .bak files without renaming them, they won't be processed until you restart Mailman. If you rename them to .pck, they will be processed along with the others without a restart.
BUT, the messages in the .bak files have been 'partially' processed and possibly delivered to some recipients. You can examine the .bak files with bin/dumpdb -p to see what they contain and decide if you want to reprocess them.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Thank you, Mark.
SMTPHOST = 'localhost' in Defaults.py
The out/ queue is now clear and processing mail. I'll see how it goes. Thanks, again.
On another note, what are thoughts about utilizing SpamAssasin or other spam software with Postfix and Mailman. It seems that a lot of the traffic that is going through the Mailman server is spam, quite a bit which is flagged and blocked by using RBLS in postfix smptd_recipient_restrictions. I am seeing upwards to 8,000 messages blocked every day. Is there a more efficient way to manage this without making it a full time job? :>)
On Tue, May 14, 2013 at 9:43 AM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

On 05/14/2013 12:43 PM, Christopher Adams wrote:
I don't think I ever replied to this, but spam filtering is the server's job. It should be done at the 'front door' and mail determined to be spam should be rejected by the incoming MTA. This avoids the issue of accepting the mail and later determining that it is unacceptable and perhaps returning a DSN or responding in some other way to an innocent 3rd party's spoofed address.
SpamAssassin alone is not a total solution, but there are milters for incorporating SpamAssassin as well as ClamAV scanning into Postfix.
There are also techniques like greylisting which can help in some environments.
There are also multidimensional solutions such as MailScanner that incorporate spam and virus scanning with many other tests, but one issue with MailScanner is it doesn't evaluate the mail until after it is accepted by the MTA
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

On 05/14/2013 08:26 AM, Christopher Adams wrote:
There currently are about 954 files in the out/ queue and seemingly always growing. Is that significant?
Your out/ queue is hopelessly backlogged. I'm not sure why. See the post at <http://mail.python.org/pipermail/mailman-users/2012-January/072778.html>.
I am guessing that Mailman is delivering directly to the remote server (SMTPHOST = 'remote.server' in mm_cfg.py) and that the remote server is not keeping up with peak rates causing the queue to become backlogged.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

On 05/09/2013 01:30 PM, Christopher Adams wrote:
[...]
Does Mailman deliver directly to the remote server or via Postfix? If directly, what does Mailman's smtp log say about the posts that were delayed. I.e. a post to a list will have an entry like
May 09 12:53:47 2013 (6114) <009b01ce4cee$e4c00540$ae400fc0$@xxx@example.com> smtp to LISTNAME for 237 recips, completed in 4.337 seconds
which says the the message with message-id <009b01ce4cee$e4c00540$ae400fc0$@xxx@example.com> was delivered from the LISTNAME list to (in this case) 237 reqular subscribers and SMTP to the MTSA was completed at 12:53:47 (and started 4.337 seconds earlier).
I suspect they were delivered to the remote server well before and not all at once at noon.
If Mailman delivers via Postfix, what does the maillog say about those deliveries?
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Below is a log of a message sent from me. The remote server handles mail from my mail server. It is sent to the test7 mailing list address. From there, Mailman/Postfix handles the dissemination of mail. I can't figure out where it is getting hung up. Sometimes delivery takes 2-8 hours from the time it was posted.
May 13 08:02:09 swiki postfix/smtpd[15145]: 692E8199849A: client=remote.mail.server[xxx.xxx.xxx.xx] May 13 08:02:09 swiki postfix/cleanup[15168]: 692E8199849A: message-id=<0B03FE68E0E760478D413F045B39A028380320D3@OSLEXCHANGE.my.mail.server.local
May 13 08:02:09 swiki postfix/qmgr[17481]: 692E8199849A: from=<prvs=1845613428=my.address@my.mail.server>, size=4112, nrcpt=1 (queue active) May 13 08:02:09 swiki postfix/local[15169]: 692E8199849A: to=< test7@mailmanserver.com>, relay=local, delay=0.13, delays=0.02/0/0/0.1, dsn=2.0.0, status=sent (delivered to command: /usr/local/mailman/mail/mailman post test7) May 13 08:02:09 swiki postfix/qmgr[17481]: 692E8199849A: removed
On Thu, May 9, 2013 at 3:23 PM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

On 05/13/2013 09:32 AM, Christopher Adams wrote:
Does Mailman send to your Postfix to relay to the remote server. If so, what are the Postfix log entries for the mail from Mailman through Postfix to the remote server?
In any case, what is Mailman's smtp log entry for the post (the one for the message-id of the post)?
The time stamp of the smtp log entry is the time that Mailman completed smtp delivery of the post to Postfix or the remote server as the case may be.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Postfix sends the list message it to the remote server, which shows that it was accepted by the Mailman server. The Mailman server has logs that show that the message was directed to the command 'post' to list 'test7'. However, nothing happens after that for 2-7 hours. Once it actually sends to the list, there are smpt.log entries that show it going to 'x recipients'
The problem is not non-delivery but extremely slow delivery, anywhere from 2-8 hrs,even for a small list (2 subscribers) like the test7 list. The Mailman server has over 600 lists and 250,000 subscribers. Some of the lists have 40,000 subscribers. Additionally, there is quite a bit of spam being sent to the list owner addresses and other attempts at posts by spammers, many which are flagged and blocked. That is just to point out that there is a lot going on with the Mailman server.
What would be some reasons for such long delays and why does it seem that they seem to be "released" all at the same time. If they have been handed off to Mailman,where are they held during this long delay? Are there some optimizations that could be made in mm_cfg.py and/or with Postfix? I saw some reference to some work that had already been done with this, but it appears to be from 2008.
Thank for any suggestions that you might have.
On Mon, May 13, 2013 at 3:57 PM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

On 05/13/2013 08:39 PM, Christopher Adams wrote:
Postfix sends the list message it to the remote server, which shows that it was accepted by the Mailman server.
We are not communicating here. The above makes no sense to me. I think the sequence is:
- List mail arrives at the local Postfix.
- Postfix delivers to the local Mailman.
- Mailman delivers - here I don't know if to the local Postfix for relay to the remote server or directly to the remote server. The mm_cfg.py setting for SMTPHOST determines this. The Default is localhost meaning relay via the local Postfix.
OK, so here you are saying that the delay is definitely in Mailman. I.e. Postfix delivers to the mailman mail wrapper which presumably immediately queues the message in Mailman's in/ queue, and it's a long time later that Mailman logs the delivery.
What is the processing time in the smtp log entry. Is it a few seconds or less or hours?
I'm guessing that during this time IncomingRunner or OutgoingRunner or both are asleep. Do you have a setting in mm_cfg.py for QRUNNER_SLEEP_TIME? The default in Defaults.py is (or should be)
QRUNNER_SLEEP_TIME = seconds(1)
If you have set this to a few hours, it would explain the behavior you are seeing. If not, when does the post get archived? Is that delayed too?
Also, messages in the in/ queue are processed by IncomingRunner and then queued in the archive/ queue for ArchRunner and the out/ queue for OutgoingRunner. You could look and see if messages move from the in/ queue quickly or not.
Note that if you have set QRUNNER_SLEEP_TIME to say 4 hours, messages can sit in the in/ queue for an average of 2 and up to 4 hours before moving to the out/ queue where they may wait up to an additional 4 hours before being sent.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Thank you for your reply.
QRUNNER_SLEEP_TIME = seconds(1) in Defaults.py
Mesages are being archived almost immediately. There is nothing in the in/ queue.
There currently are about 954 files in the out/ queue and seemingly always growing. Is that significant?
A message sent yesterday at 12:30 was received by Mailman within seconds. It delivered at 3:30. The smtp.log for that message shows:
May 13 15:32:36 2013 (26531) <0B03FE68E0E760478D413F045B39A028380323EA@OSLEXCHANGE.my.mail.server.local> smtp to test7 for 2 recips, completed in 0.020 seconds
On Mon, May 13, 2013 at 9:03 PM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

Mark, I was reading some previous posts about Mailman and qfiles. Late yesterday, I found a couple of *.bak files in the out/ queue, so I moved all the *.bak and *.pck files to another directory. I restarted Mailman. Mail was not still not being delivered in a timely way.
Then, this morning, I read another post about dns lookups in Postfix main.cf. A poster said that he modifed his smtpd_recipient_restrictions and put permit_mynetworks at the top. So, I followed this and restarted Postfix.
I immediately saw mail flowing and the out/q queue emptied and mail began to come in. I'm not sure exactly why it had that affect.
So, I have a final question. Should I move the .pck and .bak (renamed to .pck) files back to the out/ queue. If so, should I stop mailman before doing this.
BTW - The posts that I read above involved you. so thanks for the "indirect" assistance.
On Tue, May 14, 2013 at 8:26 AM, Christopher Adams <adamsca@gmail.com>wrote:
-- Christopher Adams adamsca@gmail.com

On 05/14/2013 09:28 AM, Christopher Adams wrote:
So, Mailman is delivering via the local Postfix and not directly to the remote server.
Postfix was doing DNS lookups on the sending domain for all messages which slowed it way down. You avoided this by the change you made.
You can move the .pck files back without stopping/starting Mailman.
If you move the .bak files without renaming them, they won't be processed until you restart Mailman. If you rename them to .pck, they will be processed along with the others without a restart.
BUT, the messages in the .bak files have been 'partially' processed and possibly delivered to some recipients. You can examine the .bak files with bin/dumpdb -p to see what they contain and decide if you want to reprocess them.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Thank you, Mark.
SMTPHOST = 'localhost' in Defaults.py
The out/ queue is now clear and processing mail. I'll see how it goes. Thanks, again.
On another note, what are thoughts about utilizing SpamAssasin or other spam software with Postfix and Mailman. It seems that a lot of the traffic that is going through the Mailman server is spam, quite a bit which is flagged and blocked by using RBLS in postfix smptd_recipient_restrictions. I am seeing upwards to 8,000 messages blocked every day. Is there a more efficient way to manage this without making it a full time job? :>)
On Tue, May 14, 2013 at 9:43 AM, Mark Sapiro <mark@msapiro.net> wrote:
-- Christopher Adams adamsca@gmail.com

On 05/14/2013 12:43 PM, Christopher Adams wrote:
I don't think I ever replied to this, but spam filtering is the server's job. It should be done at the 'front door' and mail determined to be spam should be rejected by the incoming MTA. This avoids the issue of accepting the mail and later determining that it is unacceptable and perhaps returning a DSN or responding in some other way to an innocent 3rd party's spoofed address.
SpamAssassin alone is not a total solution, but there are milters for incorporating SpamAssassin as well as ClamAV scanning into Postfix.
There are also techniques like greylisting which can help in some environments.
There are also multidimensional solutions such as MailScanner that incorporate spam and virus scanning with many other tests, but one issue with MailScanner is it doesn't evaluate the mail until after it is accepted by the MTA
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

On 05/14/2013 08:26 AM, Christopher Adams wrote:
There currently are about 954 files in the out/ queue and seemingly always growing. Is that significant?
Your out/ queue is hopelessly backlogged. I'm not sure why. See the post at <http://mail.python.org/pipermail/mailman-users/2012-January/072778.html>.
I am guessing that Mailman is delivering directly to the remote server (SMTPHOST = 'remote.server' in mm_cfg.py) and that the remote server is not keeping up with peak rates causing the queue to become backlogged.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
participants (2)
-
Christopher Adams
-
Mark Sapiro