when does logs/post get updated?
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
Hi,
I'm having some problems with my lists recently. I see in my mail log that "mailman post" has been called. However, when I look in the logs/post file, it's been almost 24 hours since anything has been written there.
I'm running mailman on Ubuntu Hardy, and another weird thing is that whenever I stop mailman, it always leaves at least one process hanging around. I have to forcefully kill it. After I stop it, I still see:
list 3833 0.0 0.4 83076 7540 ? Ss 08:29 0:00 /usr/bin/python /usr/lib/mailman/bin/mailmanctl -s -q start list 3842 0.1 2.5 105372 38148 ? S 08:29 0:01 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
Even more strange when I reboot the machine, I'll see 2 entire sets of mailman processes, almost as if the mailman start had been called twice.
In any case, right now it seems that mailman has stopped accepting posts. Is there a way to get more debug from mailman when "mailman post" is called? I don't see any errors, yet I don't see the post log file updating.
Ricardo
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
The post log is written by SMTPDirect (under control of OutgoingRunner) when the outgoing message is delivered to the MTA.
It appears that SMTPDirect (actually the underlying Python smtplib) is hung waiting for a response from the MTA that isn't coming.
Even more strange when I reboot the machine, I'll see 2 entire sets of mailman processes, almost as if the mailman start had been called twice.
It seems like you have two init scripts for Mailman.
"mailman post" just puts the message in the in/ queue. I assume from what you say above that it gets processed by IncomingRunner and even archived and the problem is in OutgoingRunner.
See the FAQs at <http://wiki.list.org/x/A4E9> and <http://wiki.list.org/x/-IA9>.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
Hello Mark,
Thanks for your reply.
But the strange thing here is that I have no issues at all connecting to the smtp server at localhost. Defaults.py has the standard config, using localhost with SMTPDirect.
Whatever the OutgoingRunner is stuck on, it's definitely stuck. It won't go away unless I do a kill -9
I would have thought so... but there's only 1 script under /etc/init.d/ and no other scripts there reference mailman
I can see that the in queue is probably working. There are currently 129 files in the out/ queue.
There are 2 OutgoingRunner processes and apparently BOTH of them are doing something because strace does show some activity
Process 14078 attached - interrupt to quit recvfrom(7, "250 Ok. 0000000049303165.000045C"..., 8192, 0, NULL, NULL) = 35 sendto(7, "mail FROM:<reflexoes3-bounces@li"..., 65, 0, NULL, 0) = 65 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "rcpt TO:<chitopan_2000@yahoo.com"..., 35, 0, NULL, 0) = 35 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "rcpt TO:<bernadete.sorianoadv@gm"..., 42, 0, NULL, 0) = 42 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "rcpt TO:<iviv_8a@hotmail.com>\r\n", 31, 0, NULL, 0) = 31 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "data\r\n", 6, 0, NULL, 0) = 6 recvfrom(7, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "Received: from sr05-01.mta.terra"..., 8256, 0, NULL, 0) = 8256 recvfrom(7, "250 Ok. 0000000049303172.000045D"..., 8192, 0, NULL, NULL) = 35
root@email1:/var/lib/mailman# strace -p13800 Process 13800 attached - interrupt to quit recvfrom(8, "250 Ok. 00000000493031A7.0000464"..., 8192, 0, NULL, NULL) = 35 sendto(8, "mail FROM:<reflexoes3-bounces@li"..., 65, 0, NULL, 0) = 65 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "rcpt TO:<fesayao@gmail.com>\r\n", 29, 0, NULL, 0) = 29 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "rcpt TO:<bruno8d@hotmail.com>\r\n", 31, 0, NULL, 0) = 31 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "rcpt TO:<evangelismoemissoes@hot"..., 43, 0, NULL, 0) = 43 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "data\r\n", 6, 0, NULL, 0) = 6 recvfrom(8, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9
Yet even though it's being processed, the logs/post file isn't getting updated, and the number of messages in out/ doesn't decrease.
Could this be because mailman is processing a very large list (20,000 members) and it is just stuck on processing one message, while the other messages wait around?
But I've been handling these lists for a long time and never had these problems. Mailman doesn't seem to be getting much cpu usage.
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
Note that the above "--runner=OutgoingRunner:0:1" indicates that OutgoingRunner is not sliced - more below.
What do you see in Mailman's smtp-failure log.
Well, something is starting Mailman twice.
Since OutgoingRunner is not sliced, there should be only one OutgoingRunner process. This needs to be corrected. See <http://wiki.list.org/x/_4A9>.
So OutgoingRunner never finishes delivery of even one post. However, I see 'data' commands in the traces above, so presumably, some recipients are being delivered. Then when you kill OutgoingRunner and restart it, It recovers the .bak file from the queue and redelivers to the same recipients who by now have received multiple copies of the message.
It could be, but this would indicate that you need to do something to allow SMTP between Mailman and the MTA to proceed faster. Search the FAQ for "tuning".
But I've been handling these lists for a long time and never had these problems. Mailman doesn't seem to be getting much cpu usage.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
I have some more data on this...
I enabled the debug in SMTPDirect.py, and it shows that there really aren't any problems. It shows that slowly a message is being sent out.
I followed the performance tuning suggestions that had the MAX_RCPTS in Defaults.py at an optimal value from 2-5, so I set it to 3.
But does that mean that mailman is going to simply get stuck on one single message distribution and won't process any others until this one is finished?
I used to have these lists on another server and over there I had the MAX_RCPTS set to a high number, but my mail server is set to reject above 25 rcpts anyway so the end result that at max it would handle 25 rcpts. I remember that whenever a message arrived for the list (again 20,000 members) on the other server, the load average on the server would go pretty high as it processed the list.
But now on this new server I never see the load avg go up. Is this because of the MAX_RCPTS setting? What else would keep mailman from efficiently handling the messages?
The OutgoingRunner is just sitting there slowly distributing the message and never seems to get to the next one.
On Fri, 2008-11-28 at 09:43 -0800, Mark Sapiro wrote:
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
So your real issue is why is it proceeding so slowly.
I followed the performance tuning suggestions that had the MAX_RCPTS in Defaults.py at an optimal value from 2-5, so I set it to 3.
Is the MTA doing DNS verification on incoming mail from Mailman? Are you having some DNS issue?
Yes.
Slow response from the MTA. Even with SMTP_MAX_RCPTS set to 3, you should be delivering on the order of 100 or more recipients per second. What do you see in Mailman's smtp log for processing tomes for messages. How do the latest ones compare to those from days or a week ago?
The OutgoingRunner is just sitting there slowly distributing the message and never seems to get to the next one.
It will when it finishes this one.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
Mark, thanks for your help.
For the most part I did get around the SMTP issues last week, I disabled ident lookup on the smtp server. But it seems to be back today and I don't quite understand what could be the problem.
I wrote a tiny perl script to test the connection to the server via localhost and I can consistently do 500 connections in 30 seconds. I don't see any other evidence of the smtp hanging up.
But if I do an strace on the outgoing runner, it basically is working very slowly through a large memberlist. It will process a certain number, then hang a few seconds before it processes the next batch. The server is not loaded and I don't see any other evidence of problems.
Do you have a small python script I can run to test out the localhost smtp that can maybe output some diagnostic information?
Thanks Ricardo
----- Original Message ----- From: "Mark Sapiro" <mark@msapiro.net> To: "Ricardo Kleemann" <ricardo@americasnet.com> Cc: "mailman-users" <mailman-users@python.org> Sent: Friday, November 28, 2008 10:35 AM Subject: Re: [Mailman-Users] when does logs/post get updated?
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
The connection rate is generally not an issue as Mailman normall does all the transactions over a single connection. I.e. it connects, does a "MAIL FROM" one or more "RCPT TO", and "DATA", and then begins another "MAIL FROM", etc transaction without quitting or disconnectiing.
If there are significant delays between sending RCPT TO and the responce from the server, the server is probably doing DNS verifies on the recipients. This kills performance on large lists.
What's it doing when it "hangs for a few seconds"? Presumably it is either sending the message content or waiting for a reply from the MTA.
Do you have a small python script I can run to test out the localhost smtp that can maybe output some diagnostic information?
No. I don't have any such script. As I said in previous posts
and
Search the FAQ <http://wiki.list.org/x/AgA3> for "tuning".
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/7bdecdef03708b218939094eb05e8b35.jpg?s=120&d=mm&r=g)
on 12/1/08 1:50 PM, Ricardo Kleemann said:
That sounds like your MTA is doing DNS lookups for each and every envelope recipient as you try to transmit mail. That needs to be fixed, and is one of the many "performance" tuning techniques that is discussed in the FAQ Wiki.
-- Brad Knowles <brad@shub-internet.org> LinkedIn Profile: <http://tinyurl.com/y8kpxu>
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
The post log is written by SMTPDirect (under control of OutgoingRunner) when the outgoing message is delivered to the MTA.
It appears that SMTPDirect (actually the underlying Python smtplib) is hung waiting for a response from the MTA that isn't coming.
Even more strange when I reboot the machine, I'll see 2 entire sets of mailman processes, almost as if the mailman start had been called twice.
It seems like you have two init scripts for Mailman.
"mailman post" just puts the message in the in/ queue. I assume from what you say above that it gets processed by IncomingRunner and even archived and the problem is in OutgoingRunner.
See the FAQs at <http://wiki.list.org/x/A4E9> and <http://wiki.list.org/x/-IA9>.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
Hello Mark,
Thanks for your reply.
But the strange thing here is that I have no issues at all connecting to the smtp server at localhost. Defaults.py has the standard config, using localhost with SMTPDirect.
Whatever the OutgoingRunner is stuck on, it's definitely stuck. It won't go away unless I do a kill -9
I would have thought so... but there's only 1 script under /etc/init.d/ and no other scripts there reference mailman
I can see that the in queue is probably working. There are currently 129 files in the out/ queue.
There are 2 OutgoingRunner processes and apparently BOTH of them are doing something because strace does show some activity
Process 14078 attached - interrupt to quit recvfrom(7, "250 Ok. 0000000049303165.000045C"..., 8192, 0, NULL, NULL) = 35 sendto(7, "mail FROM:<reflexoes3-bounces@li"..., 65, 0, NULL, 0) = 65 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "rcpt TO:<chitopan_2000@yahoo.com"..., 35, 0, NULL, 0) = 35 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "rcpt TO:<bernadete.sorianoadv@gm"..., 42, 0, NULL, 0) = 42 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "rcpt TO:<iviv_8a@hotmail.com>\r\n", 31, 0, NULL, 0) = 31 recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "data\r\n", 6, 0, NULL, 0) = 6 recvfrom(7, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(7, "Received: from sr05-01.mta.terra"..., 8256, 0, NULL, 0) = 8256 recvfrom(7, "250 Ok. 0000000049303172.000045D"..., 8192, 0, NULL, NULL) = 35
root@email1:/var/lib/mailman# strace -p13800 Process 13800 attached - interrupt to quit recvfrom(8, "250 Ok. 00000000493031A7.0000464"..., 8192, 0, NULL, NULL) = 35 sendto(8, "mail FROM:<reflexoes3-bounces@li"..., 65, 0, NULL, 0) = 65 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "rcpt TO:<fesayao@gmail.com>\r\n", 29, 0, NULL, 0) = 29 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "rcpt TO:<bruno8d@hotmail.com>\r\n", 31, 0, NULL, 0) = 31 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "rcpt TO:<evangelismoemissoes@hot"..., 43, 0, NULL, 0) = 43 recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9 sendto(8, "data\r\n", 6, 0, NULL, 0) = 6 recvfrom(8, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9
Yet even though it's being processed, the logs/post file isn't getting updated, and the number of messages in out/ doesn't decrease.
Could this be because mailman is processing a very large list (20,000 members) and it is just stuck on processing one message, while the other messages wait around?
But I've been handling these lists for a long time and never had these problems. Mailman doesn't seem to be getting much cpu usage.
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
Note that the above "--runner=OutgoingRunner:0:1" indicates that OutgoingRunner is not sliced - more below.
What do you see in Mailman's smtp-failure log.
Well, something is starting Mailman twice.
Since OutgoingRunner is not sliced, there should be only one OutgoingRunner process. This needs to be corrected. See <http://wiki.list.org/x/_4A9>.
So OutgoingRunner never finishes delivery of even one post. However, I see 'data' commands in the traces above, so presumably, some recipients are being delivered. Then when you kill OutgoingRunner and restart it, It recovers the .bak file from the queue and redelivers to the same recipients who by now have received multiple copies of the message.
It could be, but this would indicate that you need to do something to allow SMTP between Mailman and the MTA to proceed faster. Search the FAQ for "tuning".
But I've been handling these lists for a long time and never had these problems. Mailman doesn't seem to be getting much cpu usage.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
I have some more data on this...
I enabled the debug in SMTPDirect.py, and it shows that there really aren't any problems. It shows that slowly a message is being sent out.
I followed the performance tuning suggestions that had the MAX_RCPTS in Defaults.py at an optimal value from 2-5, so I set it to 3.
But does that mean that mailman is going to simply get stuck on one single message distribution and won't process any others until this one is finished?
I used to have these lists on another server and over there I had the MAX_RCPTS set to a high number, but my mail server is set to reject above 25 rcpts anyway so the end result that at max it would handle 25 rcpts. I remember that whenever a message arrived for the list (again 20,000 members) on the other server, the load average on the server would go pretty high as it processed the list.
But now on this new server I never see the load avg go up. Is this because of the MAX_RCPTS setting? What else would keep mailman from efficiently handling the messages?
The OutgoingRunner is just sitting there slowly distributing the message and never seems to get to the next one.
On Fri, 2008-11-28 at 09:43 -0800, Mark Sapiro wrote:
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
So your real issue is why is it proceeding so slowly.
I followed the performance tuning suggestions that had the MAX_RCPTS in Defaults.py at an optimal value from 2-5, so I set it to 3.
Is the MTA doing DNS verification on incoming mail from Mailman? Are you having some DNS issue?
Yes.
Slow response from the MTA. Even with SMTP_MAX_RCPTS set to 3, you should be delivering on the order of 100 or more recipients per second. What do you see in Mailman's smtp log for processing tomes for messages. How do the latest ones compare to those from days or a week ago?
The OutgoingRunner is just sitting there slowly distributing the message and never seems to get to the next one.
It will when it finishes this one.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/30b3ac53296307ae5b47497280cc15df.jpg?s=120&d=mm&r=g)
Mark, thanks for your help.
For the most part I did get around the SMTP issues last week, I disabled ident lookup on the smtp server. But it seems to be back today and I don't quite understand what could be the problem.
I wrote a tiny perl script to test the connection to the server via localhost and I can consistently do 500 connections in 30 seconds. I don't see any other evidence of the smtp hanging up.
But if I do an strace on the outgoing runner, it basically is working very slowly through a large memberlist. It will process a certain number, then hang a few seconds before it processes the next batch. The server is not loaded and I don't see any other evidence of problems.
Do you have a small python script I can run to test out the localhost smtp that can maybe output some diagnostic information?
Thanks Ricardo
----- Original Message ----- From: "Mark Sapiro" <mark@msapiro.net> To: "Ricardo Kleemann" <ricardo@americasnet.com> Cc: "mailman-users" <mailman-users@python.org> Sent: Friday, November 28, 2008 10:35 AM Subject: Re: [Mailman-Users] when does logs/post get updated?
![](https://secure.gravatar.com/avatar/56f108518d7ee2544412cc80978e3182.jpg?s=120&d=mm&r=g)
Ricardo Kleemann wrote:
The connection rate is generally not an issue as Mailman normall does all the transactions over a single connection. I.e. it connects, does a "MAIL FROM" one or more "RCPT TO", and "DATA", and then begins another "MAIL FROM", etc transaction without quitting or disconnectiing.
If there are significant delays between sending RCPT TO and the responce from the server, the server is probably doing DNS verifies on the recipients. This kills performance on large lists.
What's it doing when it "hangs for a few seconds"? Presumably it is either sending the message content or waiting for a reply from the MTA.
Do you have a small python script I can run to test out the localhost smtp that can maybe output some diagnostic information?
No. I don't have any such script. As I said in previous posts
and
Search the FAQ <http://wiki.list.org/x/AgA3> for "tuning".
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
![](https://secure.gravatar.com/avatar/7bdecdef03708b218939094eb05e8b35.jpg?s=120&d=mm&r=g)
on 12/1/08 1:50 PM, Ricardo Kleemann said:
That sounds like your MTA is doing DNS lookups for each and every envelope recipient as you try to transmit mail. That needs to be fixed, and is one of the many "performance" tuning techniques that is discussed in the FAQ Wiki.
-- Brad Knowles <brad@shub-internet.org> LinkedIn Profile: <http://tinyurl.com/y8kpxu>
participants (3)
-
Brad Knowles
-
Mark Sapiro
-
Ricardo Kleemann