Mailman 3 Hello all - mailman down after power failure and hard shutdown - Mailman-Users

Hello all - mailman down after power failure and hard shutdown

Tanstaafl

June 8, 2013

6:43 a.m.

Hello,

Ok, we had a power failure, and apparently my UPS thought it had more time left than it did, as the UPS shut down before it shut down the system.

Everything is back up and running, and postfix is running fine for all other mail, except list/mailman mail.

I'm getting the following error when trying to send an email to one of the lists:

2013-06-08T06:30:47-04:00 myhost postfix/postsuper[29691]: Requeued: 1 message
2013-06-08T06:31:12-04:00 myhost postfix/pickup[3124]: D55D7B7D175: uid=207 from=<valid-list@media-brokers.com> orig_id=45BF8B7B393
2013-06-08T06:31:12-04:00 myhost postfix/cleanup[29631]: D55D7B7D175: message-id=<51B30786.7020805@media-brokers.com>
2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: D55D7B7D175: from=<valid-list-bounces@media-brokers.com>, size=4065, nrcpt=6 (queue active)
2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/local: Resource temporarily unavailable
2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/retry: Resource temporarily unavailable

I've run check_perms and it says 'No problems found'...

Anyone have any suggestions?

Thanks,

charles

Larry Kuenning

June 2013

8:10 a.m.

On 6/8/2013 6:43 AM, Tanstaafl wrote:

...

Is mailman possibly not running? Try this: ps -A | grep mailmanctl

If that gives blank output, try this: /usr/lib/mailman/bin/mailmanctl start

(This was the solution for me when I had a similar problem a month and a half ago. I would like to know where to plug this in so it happens automatically on reboot. That should be an elementary question but I'm still not familiar with all these sysadmin tasks.)
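For the check-then-start steps above, a minimal Python sketch might look like the following. It assumes the default `ps -A` column layout (PID, TTY, TIME, CMD) and the `mailmanctl` path quoted above, which varies by install; `find_pids` and `ensure_mailman_running` are hypothetical helper names, not part of Mailman.

```python
import subprocess

# Path as quoted in this thread; packaged installs may use a different one.
MAILMANCTL = "/usr/lib/mailman/bin/mailmanctl"


def find_pids(ps_output, name="mailmanctl"):
    """Return PIDs from `ps -A` output whose CMD column equals `name`."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)  # PID, TTY, TIME, CMD
        if len(fields) == 4 and fields[0].isdigit() and fields[3].strip() == name:
            pids.append(int(fields[0]))
    return pids


def ensure_mailman_running():
    """Start mailmanctl only when `ps -A` shows no such process."""
    ps_output = subprocess.run(
        ["ps", "-A"], capture_output=True, text=True, check=True
    ).stdout
    if find_pids(ps_output):
        return False  # already running, nothing to do
    subprocess.run([MAILMANCTL, "start"], check=True)
    return True
```

Calling `ensure_mailman_running()` as the user that normally starts Mailman returns True only if it had to start the master process.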

-- Larry Kuenning larry@qhpress.org

Tanstaafl

8:52 a.m.

On 2013-06-08 8:10 AM, Larry Kuenning <larry@qhpress.org> wrote:

...

Not blank - but what does the question mark mean?

# ps -A | grep mailmanctl
 2600 ? 00:00:00 mailmanctl

...

I've tried restarting mailman (appears to work), and even tried rebooting...

Thanks for the assist - any other ideas?

Note: I think this is related to the three postfix errors I posted regarding a problem with the local transport - but I've googled and can't find a solution for that either...

I only posted two of these here:

2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/local: Resource temporarily unavailable
2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/retry: Resource temporarily unavailable

The third, which I don't see every time, is:

postfix/master[29913]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable

Larry Kuenning

11:33 a.m.

On 6/8/2013 8:52 AM, Tanstaafl wrote:

...

Leaving out the "grep" to get the header ("ps -C mailmanctl" would have been better to start with) I see that that column is headed "TTY". I guess the question mark means the process is not tied to a terminal and so will continue running even if all users log out. Which is the behavior you want, so the problem must be elsewhere.

...

Thanks for the assist - any other ideas?

Now you need help from somebody who actually knows how Mailman works.

-- Larry Kuenning larry@qhpress.org

Mark Sapiro

1:04 p.m.

On 06/08/2013 08:33 AM, Larry Kuenning wrote:

...

And 'ps -fwC python' or 'ps -fwu mailman' will show the qrunners too, but all this is moot as it is extremely unlikely that Postfix errors have anything to do with whether or not Mailman is actually running.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Richard Shetron

2:20 p.m.

I'd suggest trying "ps -auxww|grep mailman" to see if any mailman processes are running; this assumes mailman runs as its own user id. Some installs use the username list or lists instead of mailman.

If nothing shows up then I'd check:

Check /etc/postfix/*.cf and /etc/postfix/transport, and diff them with an older copy to make sure they haven't changed.
Check /var/lib/mailman/qfiles/maildir. The actual location may depend on your version and installation options. If mailman is NOT running then the cur subdir should be empty. I've found mailman will not restart if there is anything in the directory cur. I'd check the files, if any, in new, cur and tmp just to see what's there.

Tanstaafl

9:14 a.m.

Thanks for trying, but mailman is running fine.

Lists that have only real email addresses work fine.

Also, individual messages invoking postfix/local work fine (i.e., emails sent from cron (8 from last night and this morning), etc.)...

Mark has helped me narrow the problem down to whenever multiple messages are submitted to postfix/local simultaneously.

On 2013-06-08 2:20 PM, Richard Shetron <guest2@sgeinc.com> wrote:

...

Mark Sapiro

1:01 p.m.

On 06/08/2013 05:10 AM, Larry Kuenning wrote:

...

The GNU Mailman tarball distribution contains misc/mailman.in which configure uses to make misc/mailman.

This is a sample init.d script for Mailman and it contains instructions for installing and activating it on RedHat/CentOS and Debian/Ubuntu.

And if you installed Mailman from a package, your packager should have provided this or something similar.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Mark Sapiro

1:20 p.m.

On 06/08/2013 03:43 AM, Tanstaafl wrote:

...

I'm getting the following error when trying to send an email to one of the lists:

2013-06-08T06:30:47-04:00 myhost postfix/postsuper[29691]: Requeued: 1 message
2013-06-08T06:31:12-04:00 myhost postfix/pickup[3124]: D55D7B7D175: uid=207 from=<valid-list@media-brokers.com> orig_id=45BF8B7B393
2013-06-08T06:31:12-04:00 myhost postfix/cleanup[29631]: D55D7B7D175: message-id=<51B30786.7020805@media-brokers.com>
2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: D55D7B7D175: from=<valid-list-bounces@media-brokers.com>, size=4065, nrcpt=6 (queue active)

Postfix has received the message and is trying to deliver it via the local transport which is good.

...

2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/local: Resource temporarily unavailable

but Postfix can't find the local transport or more likely there is a stale lock on the transport left over from before the crash, so Postfix tries to queue the message for retry.

...

2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/retry: Resource temporarily unavailable

but it can't access the retry transport either ...

...

I've run check_perms and it says 'No problems found'...

Because this isn't a Mailman problem. It's a Postfix problem. I don't know enough Postfix to point directly at a solution, but I doubt that Postfix can deliver any mail via the local transport.
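To see at a glance which transports the queue manager is failing to reach, the warnings walked through above can be tallied from the mail log. This is a rough sketch that assumes the syslog line format shown in this thread; `transport_warnings` is a hypothetical helper name, not a Postfix tool.

```python
import re
from collections import Counter

# Matches qmgr warnings such as:
#   warning: connect to transport private/local: Resource temporarily unavailable
WARNING_RE = re.compile(
    r"postfix/qmgr\[\d+\]: warning: connect to transport "
    r"(?P<transport>\S+): (?P<reason>.+)$"
)


def transport_warnings(lines):
    """Count (transport, reason) pairs among qmgr connect warnings."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line.rstrip("\n"))
        if match:
            counts[(match["transport"], match["reason"])] += 1
    return counts
```

Passing an open mail log file (its location depends on the syslog setup) as `lines` yields one counter entry per transport and error text, which makes it easier to tell whether only `private/local` is affected or other services as well.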

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Mark Sapiro

1:58 p.m.

On 06/08/2013 10:20 AM, Mark Sapiro wrote:

...

Actually, private/local and private/retry refer to the sockets used for communication between the Postfix master and the various daemons. If you do 'netstat -l' you should see these and many others 'LISTENING'. Do you?

I don't know why a reboot or even just a stop and start of Postfix doesn't fix this. If you stop and start Postfix, are there any messages in the mail logs beyond the "postfix/master[pppp]: daemon started ..." message?
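As a complement to 'netstat -l', here is a small sketch that checks whether the socket files for those services exist at all. It assumes the common queue directory /var/spool/postfix (confirm with `postconf -h queue_directory`) and that the sockets live in its private/ subdirectory; it usually needs root, and a socket file existing does not prove that anything is listening on it. `missing_sockets` is a hypothetical helper name.

```python
import os
import stat

# Assumption: common default; confirm with `postconf -h queue_directory`.
QUEUE_DIR = "/var/spool/postfix"


def missing_sockets(names, queue_dir=QUEUE_DIR):
    """Return the service names with no socket file under queue_dir/private."""
    missing = []
    for name in names:
        path = os.path.join(queue_dir, "private", name)
        try:
            is_socket = stat.S_ISSOCK(os.stat(path).st_mode)
        except OSError:
            is_socket = False  # missing file or no permission to stat it
        if not is_socket:
            missing.append(name)
    return missing
```

For the services named in this thread, `missing_sockets(["local", "retry", "tlsmgr"])` should return an empty list on a healthy system.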

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

5:21 p.m.

On 2013-06-08 1:58 PM, Mark Sapiro <mark@msapiro.net> wrote:

...

Actually, private/local and private/retry refer to the sockets used for communication between the Postfix master and the various daemons. If you do 'netstat -l' you should see these and many others 'LISTENING'. Do you?

Yep, they're all there. And local is working - at least sometimes (see below) :(

...

I don't know why a reboot or even just a stop and start of Postfix doesn't fix this. If you stop and start Postfix, are there any messages in the mail logs beyond the "postfix/master[pppp]: daemon started ..." message?

Nothing more than the three warnings I already posted, two of which you see below, and the third being:

...

2013-06-08T13:10:19-04:00 myhost postfix/master[4076]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable

But, I have more details after some testing...

First, mailman is definitely working. I tested sending to one of my test lists with just two people on it, and it works fine:

...

2013-06-08T16:28:31-04:00 myhost postfix/qmgr[4078]: 88BA3831DC: from=<CMarcus@Media-Brokers.com>, size=743, nrcpt=1 (queue active)
2013-06-08T16:28:31-04:00 myhost postfix-587/smtpd[5878]: disconnect from client.atl.media-brokers.com[192.168.1.110]
2013-06-08T16:28:31-04:00 myhost postfix/local[5884]: 88BA3831DC: to=<test-list@smtp.media-brokers.com>, orig_to=<test-list@media-brokers.com>, relay=local, delay=0.31, delays=0.08/0/0/0.23, dsn=2.0.0, status=sent (delivered to command: /usr/lib64/mailman/mail/mailman post test-list)
2013-06-08T16:28:31-04:00 myhost postfix/qmgr[4078]: 88BA3831DC: removed
2013-06-08T16:28:32-04:00 myhost dovecot: imap(cmarcus@media-brokers.com): Connection closed in=1013 out=1725269
2013-06-08T16:28:32-04:00 myhost dovecot: imap-login: Login: user=<cmarcus@media-brokers.com>, method=PLAIN, rip=192.168.1.110, lport=993, mpid=5900, TLS, session=<T7IhZKreIgDAqAFu>
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: connect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: 668EA831DC: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix/cleanup[5883]: 668EA831DC: message-id=<51B393EF.2010908@Media-Brokers.com>
2013-06-08T16:28:33-04:00 myhost postfix/qmgr[4078]: 668EA831DC: from=<test-list-bounces@media-brokers.com>, size=1269, nrcpt=1 (queue active)
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: 77983189530: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix/cleanup[5883]: 77983189530: message-id=<51B393EF.2010908@Media-Brokers.com>
2013-06-08T16:28:33-04:00 myhost postfix/qmgr[4078]: 77983189530: from=<test-list-bounces@media-brokers.com>, size=1271, nrcpt=2 (queue active)
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: disconnect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix/virtual[5889]: 77983189530: to=<recipient@media-brokers.com>, relay=virtual, delay=0.2, delays=0.05/0/0/0.15, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:28:33-04:00 myhost postfix/pipe[5890]: 77983189530: to=<recipient#media-brokers.com@autoreply.media-brokers.com>, orig_to=<recipient@media-brokers.com>, relay=vacation, delay=0.4, delays=0.05/0/0/0.35, dsn=2.0.0, status=sent (delivered via vacation service)
2013-06-08T16:28:33-04:00 myhost postfix/qmgr[4078]: 77983189530: removed
2013-06-08T16:28:35-04:00 myhost postfix/smtp[5888]: 668EA831DC: to=<recipient@example.org>, relay=filtered.maildistiller.com[176.31.241.80]:25, delay=1.8, delays=0.07/0/0.62/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 0F38F157)
2013-06-08T16:28:35-04:00 myhost postfix/qmgr[4078]: 668EA831DC: removed

I tested with another list that has 6 people on it, two of whom have their vacation message enabled (I use postfixadmin vacation), and while all 6 recipients got the message, there were two messages that got stuck in the queue that are related to the vacation message:

...

2013-06-08T16:36:50-04:00 myhost postfix/qmgr[4078]: 4F86719D832: from=<CMarcus@Media-Brokers.com>, size=935, nrcpt=1 (queue active)
2013-06-08T16:36:50-04:00 myhost postfix-587/smtpd[5968]: disconnect from client.atl.media-brokers.com[192.168.1.110]
2013-06-08T16:36:50-04:00 myhost postfix/local[5970]: 4F86719D832: to=<test-list2@smtp.media-brokers.com>, orig_to=<test-list2@media-brokers.com>, relay=local, delay=0.28, delays=0.09/0.01/0/0.18, dsn=2.0.0, status=sent (delivered to command: /usr/lib64/mailman/mail/mailman post test-list2)
2013-06-08T16:36:50-04:00 myhost postfix/qmgr[4078]: 4F86719D832: removed
2013-06-08T16:36:52-04:00 myhost postfix-25/smtpd[5973]: connect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:36:52-04:00 myhost postfix-25/smtpd[5973]: 22FAC19D832: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:36:52-04:00 myhost postfix/cleanup[5969]: 22FAC19D832: message-id=<51B395E2.7070107@Media-Brokers.com>
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: from=<test-list2-bounces@media-brokers.com>, size=1442, nrcpt=9 (queue active)
2013-06-08T16:36:52-04:00 myhost postfix-25/smtpd[5973]: disconnect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/vacation: Resource temporarily unavailable
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/retry: Resource temporarily unavailable
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: to=<validuser1#media-brokers.com@autoreply.media-brokers.com>, orig_to=<validuser@media-brokers.com>, relay=none, delay=0.15, delays=0.07/0.08/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: to=<validuser2#media-brokers.com@autoreply.media-brokers.com>, orig_to=<validuser2@media-brokers.com>, relay=none, delay=0.21, delays=0.07/0.14/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser3@media-brokers.com>, relay=virtual, delay=0.28, delays=0.07/0.14/0/0.07, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser4@media-brokers.com>, relay=virtual, delay=0.37, delays=0.07/0.14/0/0.16, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser5@media-brokers.com>, relay=virtual, delay=0.47, delays=0.07/0.14/0/0.25, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser1@media-brokers.com>, relay=virtual, delay=0.56, delays=0.07/0.14/0/0.35, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser6@media-brokers.com>, relay=virtual, delay=0.65, delays=0.07/0.14/0/0.43, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser7@media-brokers.com>, relay=virtual, delay=0.72, delays=0.07/0.14/0/0.51, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:53-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser2@media-brokers.com>, relay=virtual, delay=0.88, delays=0.07/0.14/0/0.66, dsn=2.0.0, status=sent (delivered to maildir)

As you can see, only the two vacation messages are deferred with transport unavailable.
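Picking the deferred recipients out of a log excerpt like this by eye gets tedious, so here is a rough sketch that groups recipients by relay and delivery status. It assumes the log line format shown in this thread, including short hexadecimal queue IDs; `deliveries_by_status` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches delivery lines such as:
#   postfix/qmgr[4078]: 22FAC19D832: to=<...>, ..., relay=none, ...,
#   dsn=4.3.0, status=deferred (mail transport unavailable)
DELIVERY_RE = re.compile(
    r"postfix[^/\s]*/(?P<daemon>\w+)\[\d+\]: (?P<qid>[0-9A-F]+): "
    r"to=<(?P<to>[^>]*)>,.*?relay=(?P<relay>[^,]+),.*?"
    r"status=(?P<status>\w+) \((?P<detail>.*)\)\s*$"
)


def deliveries_by_status(lines):
    """Group recipients by (relay, status), e.g. ('none', 'deferred')."""
    groups = defaultdict(list)
    for line in lines:
        match = DELIVERY_RE.search(line)
        if match:
            groups[(match["relay"], match["status"])].append(match["to"])
    return dict(groups)
```

Run over the excerpt above, the `('none', 'deferred')` group would hold just the two autoreply addresses, while every `relay=virtual` recipient lands under `('virtual', 'sent')`.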

It also appears that the problem manifests with NESTED lists:

...

2013-06-08T17:07:43-04:00 myhost postfix/qmgr[4078]: 8485738737A: from=<CMarcus@Media-Brokers.com>, size=3474, nrcpt=1 (queue active)
2013-06-08T17:07:43-04:00 myhost postfix-587/smtpd[6187]: disconnect from client.atl.media-brokers.com[192.168.1.110]
2013-06-08T17:07:43-04:00 myhost postfix/local[6190]: 8485738737A: to=<lists-all@smtp.media-brokers.com>, orig_to=<lists-all@Media-Brokers.com>, relay=local, delay=0.3, delays=0.08/0/0/0.22, dsn=2.0.0, status=sent (delivered to command: /usr/lib64/mailman/mail/mailman post lists-all)
2013-06-08T17:07:43-04:00 myhost postfix/qmgr[4078]: 8485738737A: removed
2013-06-08T17:07:44-04:00 myhost postfix-25/smtpd[6206]: connect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T17:07:44-04:00 myhost postfix-25/smtpd[6206]: D682B38737A: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T17:07:44-04:00 myhost postfix/cleanup[6182]: D682B38737A: message-id=<51B39D1F.3050404@Media-Brokers.com>
2013-06-08T17:07:44-04:00 myhost postfix/qmgr[4078]: D682B38737A: from=<lists-all-bounces@media-brokers.com>, size=3933, nrcpt=6 (queue active)
2013-06-08T17:07:44-04:00 myhost postfix-25/smtpd[6206]: disconnect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T17:07:44-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/local: Resource temporarily unavailable
2013-06-08T17:07:44-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/retry: Resource temporarily unavailable
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-1@smtp.media-brokers.com>, orig_to=<list-1@media-brokers.com>, relay=none, delay=0.17, delays=0.1/0.08/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-2@smtp.media-brokers.com>, orig_to=<list-2@media-brokers.com>, relay=none, delay=0.22, delays=0.1/0.13/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-3@smtp.media-brokers.com>, orig_to=<list-3@media-brokers.com>, relay=none, delay=0.3, delays=0.1/0.2/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-4@smtp.media-brokers.com>, orig_to=<list-4@media-brokers.com>, relay=none, delay=0.36, delays=0.1/0.26/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-5@smtp.media-brokers.com>, orig_to=<list-5@media-brokers.com>, relay=none, delay=0.41, delays=0.1/0.32/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-6@smtp.media-brokers.com>, orig_to=<list-6@media-brokers.com>, relay=none, delay=0.47, delays=0.1/0.37/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)

I imagine that the two problems are being caused by the same problem, whatever it is...

It also seems to be something to do with how many recipients are involved. One or two appear to be ok, but more than that and it gets iffy...

Appreciate any more thoughts on this weirdness, because I'm stumped....

Mark Sapiro

5:44 p.m.

On 06/08/2013 02:21 PM, Tanstaafl wrote:

...

I think that's a coincidence. The biggest problem is with delivery from Postfix to Mailman, at which point nothing knows how many list members there are or how many messages Mailman will send.

...

Appreciate any more thoughts on this weirdness, because I'm stumped....

See this <http://tech.groups.yahoo.com/group/postfix-users/message/245375>, particularly the replies from Wietse.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

6:10 p.m.

On 2013-06-08 5:44 PM, Mark Sapiro <mark@msapiro.net> wrote:

...

I read them all, and I don't think it is relevant (this is the same kernel and same versions of postfix dovecot and mailman for some time now), but, I changed the default limit to 10 and reloaded postfix, with the same error when sending to my 'All' list (that has only 6 members, all lists).

Also, as I said, lists that only have individual recipients work just fine, even with 30+ recipients.

Also the weirdness when a list member has their vacation enabled - they get the original list message, but the vacation message gets stuck in the queue with the error.

I'm thinking of trying to reinstall (this is gentoo, so that will be easy) first mailman, then postfix... I'll probably try that tomorrow if no other solution presents itself.

Thanks for your help, Mark, much appreciated...

Mark Sapiro

7:13 p.m.

On 06/08/2013 03:10 PM, Tanstaafl wrote:

...

I read them all, and I don't think it is relevant (this is the same kernel and same versions of postfix dovecot and mailman for some time now), but, I changed the default limit to 10 and reloaded postfix, with the same error when sending to my 'All' list (that has only 6 members, all lists).

How long was the system up before the crash, and during that time did you change any dynamic configuration parameters that would have been reverted by the crash?

...

Also, as I said, lists that only have individual recipients work just fine, even with 30+ recipients.

So, the problem doesn't occur with a single message to a single list, but it occurs when Postfix receives six messages at once FROM the lists-all list. Also, your deliveries to to=<validuser1@media-brokers.com> et al. are via the virtual transport, which is apparently unaffected.

...

Also the weirdness when a list member has their vacation enabled - they get the original list message, but the vacation message gets stuck in the queue with the error.

Another case of multiple messages to be handled by the local transport.

...

I'm thinking of trying to reinstall (this is gentoo, so that will be easy) first mailman, then postfix... I'll probably try that tomorrow if no other solution presents itself.

If you reinstall Mailman without touching Postfix and that fixes this, I'll be incredibly surprised.

All the evidence you've presented together with everything I know says this is a Postfix issue, not a Mailman issue. If I knew Postfix as well as I know Mailman, I could probably tell you how to fix this.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

8:49 a.m.

On 2013-06-08 7:13 PM, Mark Sapiro <mark@msapiro.net> wrote:

...

So, the problem doesn't occur with a single message to a single list, but it occurs when Postfix receives six messages at once FROM the lists-all list. Also, your deliveries to to=<validuser1@media-brokers.com> et al. are via the virtual transport, which is apparently unaffected.

Hmm, so when a list only contains other lists as members, those will use postfix's local transport, but when the members are individuals (for final delivery), it uses virtual. Ok, that makes sense then.

...

...
Also the weirdness when a list member has their vacation enabled - they get the original list message, but the vacation message gets stuck in the queue with the error.

...

Another case of multiple messages to be handled by the local transport.

Ok, yeah, I think you've nailed it... the problem is when more than one message at a time is passed to postfix/local...

...

...
I'm thinking of trying to reinstall (this is gentoo, so that will be easy) first mailman, then postfix... I'll probably try that tomorrow if no other solution presents itself.

...

If you reinstall Mailman without touching Postfix and that fixes this, I'll be incredibly surprised.

I think you're right, I'll do postfix first.

...

All the evidence you've presented together with everything I know says this is a Postfix issue, not a Mailman issue. If I knew Postfix as well as I know Mailman, I could probably tell you how to fix this.

Wish I did... I did get a comment from Victor on the postfix list to check all of my aliases, so I ran newaliases but that didn't help. Is there anything else I can do to test the mailman aliases? Since the individual lists work - confirmed because I sent the mass email I've been trying to send since this happened to each individual list that is a member of the lists-all list, and those all worked fine.

I agree with you that this seems to be a postfix problem, but is it possible that some kind of corruption in a userdb could cause these warnings? To recap, they are:

The first one from postfix/master only shows up rarely - 11 times since I got the system back up, and within 5 or 10 minutes (but usually within 5 or 10 seconds) of postfix being restarted:

postfix/master[6406]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable

Then these (only when I try to send to my lists-all list):

warning: connect to transport private/local: Resource temporarily unavailable
warning: connect to transport private/retry: Resource temporarily unavailable

I do have backups of my mysql userdb, as well as all others (mailman aliases/dbs, etc), so I can replace any of these from backups if it will fix the problem.

Thanks again for your time and help Mark...

Charles

Mark Sapiro

10:33 a.m.

On 06/09/2013 05:49 AM, Tanstaafl wrote:

...

I thought about aliases, but aliases are only consulted by the local transport, and the issue is in passing the message to the local transport (and also the retry transport and the vacation transport). Thus, I don't think aliases could be involved.

However, if aliases were involved, the thing to run is Mailman's bin/genaliases, but we know aliases are not the problem, both from the above and the fact that the lists all work 'one at a time'

There is definitely some resource contention issue when Postfix is trying to access the same socket for multiple messages.
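For background, "Resource temporarily unavailable" is the standard error text for EAGAIN. One situation in which Linux returns EAGAIN on a UNIX-domain socket is a non-blocking connect to a listener whose backlog of not-yet-accepted connections is already full. The sketch below reproduces only that generic behaviour on Linux; it does not involve Postfix, `connects_until_eagain` is a hypothetical name, and nothing in this thread establishes that this is what happened on the affected system.

```python
import errno
import os
import socket
import tempfile


def connects_until_eagain(backlog=1, max_attempts=64):
    """Open non-blocking connections to a listener that never accepts.

    Returns how many connects succeeded before one failed with EAGAIN,
    or None if none failed within max_attempts.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        clients = []
        try:
            server.bind(path)
            server.listen(backlog)  # deliberately never calls accept()
            for attempt in range(max_attempts):
                client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
                clients.append(client)
                client.setblocking(False)
                try:
                    client.connect(path)
                except BlockingIOError as exc:
                    if exc.errno == errno.EAGAIN:
                        return attempt  # the pending-connection queue is full
                    raise
            return None
        finally:
            for client in clients:
                client.close()
            server.close()


if __name__ == "__main__":
    print(connects_until_eagain(), os.strerror(errno.EAGAIN))
```

With a small backlog the failure shows up after only a couple of pending connections, which at least fits the pattern of single deliveries working while a burst to the same service fails.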

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

4:20 p.m.

New subject: SOLVED - Re: Hello all - mailman down after power failure and hard shutdown

Ok, facepalm time...

I had forgotten that I had built a new kernel a few weeks ago, and changed it to the default - but hadn't properly tested it yet.

Reverting to the previous kernel resolved the problem.

I'm not sure what the heck I changed to cause this, but that'll sure teach me to never change the kernel boot default without proper testing.

Anyway, thanks for the assist and sorry for the noise.

Charles

On 2013-06-09 10:33 AM, Mark Sapiro <mark@msapiro.net> wrote:

...

Larry Kuenning

June 2013

8:10 a.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 6/8/2013 6:43 AM, Tanstaafl wrote:

...

Is mailman possibly not running? Try this: ps -A | grep mailmanctl

If that gives blank output, try this: /usr/lib/mailman/bin/mailmanctl start

-- Larry Kuenning larry@qhpress.org

Tanstaafl

8:52 a.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 2013-06-08 8:10 AM, Larry Kuenning <larry@qhpress.org> wrote:

...

Not blank - but what does the question mark mean?

# ps -A | grep mailmanctl 2600 ? 00:00:00 mailmanctl

...

I've tried restarting mailman (appears to work), and even tried rebooting...

Thanks for the assist - any other ideas?

Note: I think this is related to the three postfix errors I posted regarding a problem with the local transport - but I've googled and can't find a solution for that either...

I only posted two of these here:

The third, which I don't see every time, is:

postfix/master[29913]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable

Larry Kuenning

11:33 a.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 6/8/2013 8:52 AM, Tanstaafl wrote:

...

Thanks for the assist - any other ideas?

Now you need help from somebody who actually knows how Mailman works.

-- Larry Kuenning larry@qhpress.org

Mark Sapiro

1:04 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/08/2013 08:33 AM, Larry Kuenning wrote:

...

-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Richard Shetron

2:20 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

If nothing show up then I'd check:

/etc/postfix/*.cf and /etc/postfix/transport and diff them with an older copy to make sure they haven't changed.
check /var/lib/mailman/qfiles/maildir The actual location may depend on your version and installation options. If mailman is NOT running then the cur subdir should be empty. I've found mailman will not restart if there is anything in the directory cur. I'd check the files, if any, in both new and cur and tmp just to see what's there.

Tanstaafl

9:14 a.m.

New subject: Hello all - mailman down after power failure and hard shutdown

Thanks for trying, but mailman is running fine.

Lists that have only real email addresses work fine.

Also, individual messages invoking postfix/local also work fine, (ie, emails sent from cron (8 from last night and this morning), etc)...

Mark has helped me narrow the problem down to whenever multiple messages are submitted to postfix/local simultaneously.

On 2013-06-08 2:20 PM, Richard Shetron <guest2@sgeinc.com> wrote:

...

Mark Sapiro

June 2013

1:01 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/08/2013 05:10 AM, Larry Kuenning wrote:

...

The GNU Mailman tarball distribution contains misc/mailman.in which configure uses to make misc/mailman.

This is a sample init.d script for Mailman and it contains instructions for installing and activating it on RedHat/CentOS and Debian/Ubuntu.

And if you installed Mailman from a package, your packager should have provided this or something similar.

-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Mark Sapiro

1:20 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/08/2013 03:43 AM, Tanstaafl wrote:

...

I'm getting the following error when trying to send an email to one of the lists:

2013-06-08T06:30:47-04:00 myhost postfix/postsuper[29691]: Requeued: 1 message 2013-06-08T06:31:12-04:00 myhost postfix/pickup[3124]: D55D7B7D175: uid=207 from=<valid-list@media-brokers.com> orig_id=45BF8B7B393 2013-06-08T06:31:12-04:00 myhost postfix/cleanup[29631]: D55D7B7D175: message-id=<51B30786.7020805@media-brokers.com> 2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: D55D7B7D175: from=<valid-list-bounces@media-brokers.com>, size=4065, nrcpt=6 (queue active)

Postfix has received the message and is trying to deliver it via the local transport which is good.

...

2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/local: Resource temporarily unavailable

but Postfix can't find the local transport or more likely there is a stale lock on the transport left over from before the crash, so Postfix tries to queue the message for retry.

...

2013-06-08T06:31:12-04:00 myhost postfix/qmgr[3126]: warning: connect to transport private/retry: Resource temporarily unavailable

but it can't access the retry transport either ...

...

I've run check_perms and it says 'No problems found'...

Because this isn't a Mailman problem. It's a Postfix problem. I don't know enough Postfix to point directly at a solution, but I doubt that Postfix can deliver any mail via the local transport.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Mark Sapiro

1:58 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/08/2013 10:20 AM, Mark Sapiro wrote:

...

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

5:21 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 2013-06-08 1:58 PM, Mark Sapiro <mark@msapiro.net> wrote:

...

Actually, private/local and private/retry refer to the sockets used for communication between the Postfix master and the various daemons. If you do 'netstat -l' you should see these and many others 'LISTENING'. Do you?
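As a complement to 'netstat -l', the socket files themselves can be checked on disk. The sketch below is an illustration rather than anything from the thread: the helper name list_unix_sockets is hypothetical, and the path in the usage comment assumes Postfix's default queue directory, /var/spool/postfix ('postconf -h queue_directory' prints the actual value). It only shows that the socket files exist, not that a daemon is accepting connections on them.

```python
import os
import stat


def list_unix_sockets(directory):
    """Return the sorted names of Unix-domain socket files in `directory`."""
    names = []
    for entry in os.scandir(directory):
        # Look at the entry itself rather than whatever a symlink points to.
        mode = entry.stat(follow_symlinks=False).st_mode
        if stat.S_ISSOCK(mode):
            names.append(entry.name)
    return sorted(names)


# Hypothetical usage, typically as root, assuming the default queue directory:
#   print(list_unix_sockets("/var/spool/postfix/private"))
# On a working system the result would be expected to include names such as
# "local" and "retry", the two transports named in the warnings.
```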

Yep, they're all there. And local is working - at least sometimes (see below) :(

...

I don't know why a reboot or even just a stop and start of Postfix doesn't fix this. If you stop and start Postfix, are there any messages in the mail logs beyond the "postfix/master[pppp]: daemon started ..." message?

Nothing more than the three warnings I already posted, two of which you see below, and the third being:

...

2013-06-08T13:10:19-04:00 myhost postfix/master[4076]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable

But, I have more details after some testing...

First, mailman is definitely working. I tested sending to one of my test lists with just two people on it, and it works fine:

...

2013-06-08T16:28:31-04:00 myhost postfix/qmgr[4078]: 88BA3831DC: from=<CMarcus@Media-Brokers.com>, size=743, nrcpt=1 (queue active)
2013-06-08T16:28:31-04:00 myhost postfix-587/smtpd[5878]: disconnect from client.atl.media-brokers.com[192.168.1.110]
2013-06-08T16:28:31-04:00 myhost postfix/local[5884]: 88BA3831DC: to=<test-list@smtp.media-brokers.com>, orig_to=<test-list@media-brokers.com>, relay=local, delay=0.31, delays=0.08/0/0/0.23, dsn=2.0.0, status=sent (delivered to command: /usr/lib64/mailman/mail/mailman post test-list)
2013-06-08T16:28:31-04:00 myhost postfix/qmgr[4078]: 88BA3831DC: removed
2013-06-08T16:28:32-04:00 myhost dovecot: imap(cmarcus@media-brokers.com): Connection closed in=1013 out=1725269
2013-06-08T16:28:32-04:00 myhost dovecot: imap-login: Login: user=<cmarcus@media-brokers.com>, method=PLAIN, rip=192.168.1.110, lport=993, mpid=5900, TLS, session=<T7IhZKreIgDAqAFu>
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: connect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: 668EA831DC: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix/cleanup[5883]: 668EA831DC: message-id=<51B393EF.2010908@Media-Brokers.com>
2013-06-08T16:28:33-04:00 myhost postfix/qmgr[4078]: 668EA831DC: from=<test-list-bounces@media-brokers.com>, size=1269, nrcpt=1 (queue active)
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: 77983189530: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix/cleanup[5883]: 77983189530: message-id=<51B393EF.2010908@Media-Brokers.com>
2013-06-08T16:28:33-04:00 myhost postfix/qmgr[4078]: 77983189530: from=<test-list-bounces@media-brokers.com>, size=1271, nrcpt=2 (queue active)
2013-06-08T16:28:33-04:00 myhost postfix-25/smtpd[5887]: disconnect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:28:33-04:00 myhost postfix/virtual[5889]: 77983189530: to=<recipient@media-brokers.com>, relay=virtual, delay=0.2, delays=0.05/0/0/0.15, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:28:33-04:00 myhost postfix/pipe[5890]: 77983189530: to=<recipient#media-brokers.com@autoreply.media-brokers.com>, orig_to=<recipient@media-brokers.com>, relay=vacation, delay=0.4, delays=0.05/0/0/0.35, dsn=2.0.0, status=sent (delivered via vacation service)
2013-06-08T16:28:33-04:00 myhost postfix/qmgr[4078]: 77983189530: removed
2013-06-08T16:28:35-04:00 myhost postfix/smtp[5888]: 668EA831DC: to=<recipient@example.org>, relay=filtered.maildistiller.com[176.31.241.80]:25, delay=1.8, delays=0.07/0/0.62/1.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 0F38F157)
2013-06-08T16:28:35-04:00 myhost postfix/qmgr[4078]: 668EA831DC: removed

...

2013-06-08T16:36:50-04:00 myhost postfix/qmgr[4078]: 4F86719D832: from=<CMarcus@Media-Brokers.com>, size=935, nrcpt=1 (queue active)
2013-06-08T16:36:50-04:00 myhost postfix-587/smtpd[5968]: disconnect from client.atl.media-brokers.com[192.168.1.110]
2013-06-08T16:36:50-04:00 myhost postfix/local[5970]: 4F86719D832: to=<test-list2@smtp.media-brokers.com>, orig_to=<test-list2@media-brokers.com>, relay=local, delay=0.28, delays=0.09/0.01/0/0.18, dsn=2.0.0, status=sent (delivered to command: /usr/lib64/mailman/mail/mailman post test-list2)
2013-06-08T16:36:50-04:00 myhost postfix/qmgr[4078]: 4F86719D832: removed
2013-06-08T16:36:52-04:00 myhost postfix-25/smtpd[5973]: connect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:36:52-04:00 myhost postfix-25/smtpd[5973]: 22FAC19D832: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:36:52-04:00 myhost postfix/cleanup[5969]: 22FAC19D832: message-id=<51B395E2.7070107@Media-Brokers.com>
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: from=<test-list2-bounces@media-brokers.com>, size=1442, nrcpt=9 (queue active)
2013-06-08T16:36:52-04:00 myhost postfix-25/smtpd[5973]: disconnect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/vacation: Resource temporarily unavailable
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/retry: Resource temporarily unavailable
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: to=<validuser1#media-brokers.com@autoreply.media-brokers.com>, orig_to=<validuser@media-brokers.com>, relay=none, delay=0.15, delays=0.07/0.08/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: to=<validuser2#media-brokers.com@autoreply.media-brokers.com>, orig_to=<validuser2@media-brokers.com>, relay=none, delay=0.21, delays=0.07/0.14/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser3@media-brokers.com>, relay=virtual, delay=0.28, delays=0.07/0.14/0/0.07, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser4@media-brokers.com>, relay=virtual, delay=0.37, delays=0.07/0.14/0/0.16, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser5@media-brokers.com>, relay=virtual, delay=0.47, delays=0.07/0.14/0/0.25, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser1@media-brokers.com>, relay=virtual, delay=0.56, delays=0.07/0.14/0/0.35, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser6@media-brokers.com>, relay=virtual, delay=0.65, delays=0.07/0.14/0/0.43, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser7@media-brokers.com>, relay=virtual, delay=0.72, delays=0.07/0.14/0/0.51, dsn=2.0.0, status=sent (delivered to maildir)
2013-06-08T16:36:53-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser2@media-brokers.com>, relay=virtual, delay=0.88, delays=0.07/0.14/0/0.66, dsn=2.0.0, status=sent (delivered to maildir)

As you can see, only the two vacation messages are deferred with transport unavailable.
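For a longer log, the same observation can be checked mechanically by tallying delivery results per recipient domain. What follows is a minimal Python sketch, not something posted in the thread: the names DELIVERY_RE and tally_by_domain are hypothetical, and the three sample entries are copied from the log above. The pattern scans the text as a whole, so it does not depend on each entry sitting on its own line.

```python
import re
from collections import Counter

# A delivery entry has a "to=<user@domain>" field followed later by "status=...".
# The \b keeps the pattern from matching inside "orig_to=".
DELIVERY_RE = re.compile(r"\bto=<[^>@]*@([^>]+)>.*?status=(\w+)", re.S)


def tally_by_domain(log_text):
    """Count delivery results per (recipient domain, status) pair."""
    return Counter(DELIVERY_RE.findall(log_text))


SAMPLE = """\
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: to=<validuser1#media-brokers.com@autoreply.media-brokers.com>, orig_to=<validuser@media-brokers.com>, relay=none, delay=0.15, delays=0.07/0.08/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T16:36:52-04:00 myhost postfix/qmgr[4078]: 22FAC19D832: to=<validuser2#media-brokers.com@autoreply.media-brokers.com>, orig_to=<validuser2@media-brokers.com>, relay=none, delay=0.21, delays=0.07/0.14/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T16:36:52-04:00 myhost postfix/virtual[5974]: 22FAC19D832: to=<validuser3@media-brokers.com>, relay=virtual, delay=0.28, delays=0.07/0.14/0/0.07, dsn=2.0.0, status=sent (delivered to maildir)
"""

counts = tally_by_domain(SAMPLE)
for (domain, status), n in sorted(counts.items()):
    print(domain, status, n)
```

On this sample, both deferred entries fall under the autoreply domain used for vacation messages, while the maildir delivery is counted as sent.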

It also appears that the problem manifests with NESTED lists:

...

2013-06-08T17:07:43-04:00 myhost postfix/qmgr[4078]: 8485738737A: from=<CMarcus@Media-Brokers.com>, size=3474, nrcpt=1 (queue active)
2013-06-08T17:07:43-04:00 myhost postfix-587/smtpd[6187]: disconnect from client.atl.media-brokers.com[192.168.1.110]
2013-06-08T17:07:43-04:00 myhost postfix/local[6190]: 8485738737A: to=<lists-all@smtp.media-brokers.com>, orig_to=<lists-all@Media-Brokers.com>, relay=local, delay=0.3, delays=0.08/0/0/0.22, dsn=2.0.0, status=sent (delivered to command: /usr/lib64/mailman/mail/mailman post lists-all)
2013-06-08T17:07:43-04:00 myhost postfix/qmgr[4078]: 8485738737A: removed
2013-06-08T17:07:44-04:00 myhost postfix-25/smtpd[6206]: connect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T17:07:44-04:00 myhost postfix-25/smtpd[6206]: D682B38737A: client=myhost.media-brokers.com[127.0.0.1]
2013-06-08T17:07:44-04:00 myhost postfix/cleanup[6182]: D682B38737A: message-id=<51B39D1F.3050404@Media-Brokers.com>
2013-06-08T17:07:44-04:00 myhost postfix/qmgr[4078]: D682B38737A: from=<lists-all-bounces@media-brokers.com>, size=3933, nrcpt=6 (queue active)
2013-06-08T17:07:44-04:00 myhost postfix-25/smtpd[6206]: disconnect from myhost.media-brokers.com[127.0.0.1]
2013-06-08T17:07:44-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/local: Resource temporarily unavailable
2013-06-08T17:07:44-04:00 myhost postfix/qmgr[4078]: warning: connect to transport private/retry: Resource temporarily unavailable
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-1@smtp.media-brokers.com>, orig_to=<list-1@media-brokers.com>, relay=none, delay=0.17, delays=0.1/0.08/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-2@smtp.media-brokers.com>, orig_to=<list-2@media-brokers.com>, relay=none, delay=0.22, delays=0.1/0.13/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-3@smtp.media-brokers.com>, orig_to=<list-3@media-brokers.com>, relay=none, delay=0.3, delays=0.1/0.2/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-4@smtp.media-brokers.com>, orig_to=<list-4@media-brokers.com>, relay=none, delay=0.36, delays=0.1/0.26/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-5@smtp.media-brokers.com>, orig_to=<list-5@media-brokers.com>, relay=none, delay=0.41, delays=0.1/0.32/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)
2013-06-08T17:07:45-04:00 myhost postfix/qmgr[4078]: D682B38737A: to=<list-6@smtp.media-brokers.com>, orig_to=<list-6@media-brokers.com>, relay=none, delay=0.47, delays=0.1/0.37/0/0, dsn=4.3.0, status=deferred (mail transport unavailable)

I imagine that the two problems are being caused by the same problem, whatever it is...

It also seems to be something to do with how many recipients are involved. One or two appear to be ok, but more than that and it gets iffy...

Appreciate any more thoughts on this weirdness, because I'm stumped....

Mark Sapiro

5:44 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/08/2013 02:21 PM, Tanstaafl wrote:

...

I think that's a coincidence. The biggest problem is with delivery from Postfix to Mailman, at which point nothing knows how many list members there are or how many messages Mailman will send.

...

Appreciate any more thoughts on this weirdness, because I'm stumped....

See this <http://tech.groups.yahoo.com/group/postfix-users/message/245375>, particularly the replies from Wietse.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

6:10 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 2013-06-08 5:44 PM, Mark Sapiro <mark@msapiro.net> wrote:

...

Also, as I said, lists that only have individual recipients work just fine, even with 30+ recipients.

Also the weirdness when a list member has their vacation enabled - they get the original list message, but the vacation message gets stuck in the queue with the error.

I'm thinking of trying to reinstall (this is gentoo, so that will be easy) first mailman, then postfix... I'll probably try that tomorrow if no other solution presents itself.

Thanks for your help, Mark, much appreciated...

Mark Sapiro

June 2013

7:13 p.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/08/2013 03:10 PM, Tanstaafl wrote:

...

I read them all, and I don't think it is relevant (this has been the same kernel and the same versions of postfix, dovecot and mailman for some time now), but I changed the default limit to 10 and reloaded postfix, and got the same error when sending to my 'All' list (which has only 6 members, all of them lists).

How long was the system up before the crash, and during that time did you change any dynamic configuration parameters that would have been reverted by the crash?

...

Also, as I said, lists that only have individual recipients work just fine, even with 30+ recipients.

...

Also the weirdness when a list member has their vacation enabled - they get the original list message, but the vacation message gets stuck in the queue with the error.

Another case of multiple messages to be handled by the local transport.

...

I'm thinking of trying to reinstall (this is gentoo, so that will be easy) first mailman, then postfix... I'll probably try that tomorrow if no other solution presents itself.

If you reinstall Mailman without touching Postfix and that fixes this, I'll be incredibly surprised.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

8:49 a.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 2013-06-08 7:13 PM, Mark Sapiro <mark@msapiro.net> wrote:

...

So, the problem doesn't occur with a single message to a single list, but it occurs when Postfix receives six messages at once FROM the lists-all list. Also, your deliveries to to=<validuser1@media-brokers.com> et al. are via the virtual transport, which is apparently unaffected.

...

...
Also the weirdness when a list member has their vacation enabled - they get the original list message, but the vacation message gets stuck in the queue with the error.

...

Another case of multiple messages to be handled by the local transport.

Ok, yeah, I think you've nailed it... the problem is when more than one message at a time is passed to postfix/local...

...

...
I'm thinking of trying to reinstall (this is gentoo, so that will be easy) first mailman, then postfix... I'll probably try that tomorrow if no other solution presents itself.

...

If you reinstall Mailman without touching Postfix and that fixes this, I'll be incredibly surprised.

I think you're right, I'll do postfix first.

...

All the evidence you've presented together with everything I know says this is a Postfix issue, not a Mailman issue. If I knew Postfix as well as I know Mailman, I could probably tell you how to fix this.

I agree with you that this seems to be a postfix problem, but is it possible that some kind of corruption in a userdb could cause these warnings? To recap, they are:

The first one from postfix/master only shows up rarely - 11 times since I got the system back up, and within 5 or 10 minutes (but usually within 5 or 10 seconds) of postfix being restarted:

postfix/master[6406]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable

Then these (only when I try to send to my lists-all list):

warning: connect to transport private/local: Resource temporarily unavailable
warning: connect to transport private/retry: Resource temporarily unavailable
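To see which internal sockets are affected and how often, the warnings can be counted per socket name. This is a hedged Python sketch rather than anything used in the thread: WARNING_RE and tally_unavailable are hypothetical names, and the sample text is just the three warnings recapped here. It matches both the qmgr form ("connect to transport private/local") and the master form ("service tlsmgr(private/tlsmgr)").

```python
import re
from collections import Counter

# Capture the "private/<name>" socket named just before the error text.
# The optional ")" covers the master form "service tlsmgr(private/tlsmgr): ...".
WARNING_RE = re.compile(r"(private/[\w-]+)\)?: Resource temporarily unavailable")


def tally_unavailable(log_text):
    """Count 'Resource temporarily unavailable' warnings per private socket."""
    return Counter(WARNING_RE.findall(log_text))


SAMPLE = """\
postfix/master[6406]: warning: master_wakeup_timer_event: service tlsmgr(private/tlsmgr): Resource temporarily unavailable
warning: connect to transport private/local: Resource temporarily unavailable
warning: connect to transport private/retry: Resource temporarily unavailable
"""

warning_counts = tally_unavailable(SAMPLE)
for name, n in sorted(warning_counts.items()):
    print(name, n)
```

Run against a full mail log, a tally like this would show whether the warnings cluster on the transports that receive several messages at once.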

I do have backups of my mysql userdb, as well as all others (mailman aliases/dbs, etc), so I can replace any of these from backups if it will fix the problem.

Thanks again for your time and help Mark...

Charles

Mark Sapiro

10:33 a.m.

New subject: Hello all - mailman down after power failure and hard shutdown

On 06/09/2013 05:49 AM, Tanstaafl wrote:

...

However, if aliases were involved, the thing to run is Mailman's bin/genaliases, but we know aliases are not the problem, both from the above and the fact that the lists all work 'one at a time'.

There is definitely some resource contention issue when Postfix is trying to access the same socket for multiple messages.

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan

Tanstaafl

4:20 p.m.

New subject: SOLVED - Re: Hello all - mailman down after power failure and hard shutdown

Ok, facepalm time...

I had forgotten that I had built a new kernel a few weeks ago, and changed it to the default - but hadn't properly tested it yet.

Reverting to the previous kernel resolved the problem.

I'm not sure what the heck I changed to cause this, but that'll sure teach me to never change the kernel boot default without proper testing.

Anyway, thanks for the assist and sorry for the noise.

Charles

On 2013-06-09 10:33 AM, Mark Sapiro <mark@msapiro.net> wrote:

...
