MM 2.1 -- another qrunner crash

Any thoughts on this?
Jan 06 11:11:44 2003 (13289) Uncaught runner exception: [Errno 4] Interrupted system call
Jan 06 11:11:45 2003 (13289) Traceback (most recent call last):
  File "/usr/local/mailman/Mailman/Queue/Runner.py", line 105, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/local/mailman/Mailman/Queue/Runner.py", line 155, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/usr/local/mailman/Mailman/Queue/OutgoingRunner.py", line 61, in _dispose
    self._func(mlist, msg, msgdata)
  File "/usr/local/mailman/Mailman/Handlers/SMTPDirect.py", line 139, in process
    deliveryfunc(mlist, msg, msgdata, envsender, refused, conn)
  File "/usr/local/mailman/Mailman/Handlers/SMTPDirect.py", line 335, in bulkdeliver
    refused = conn.sendmail(envsender, recips, msgtext)
  File "/usr/local/mailman/Mailman/Handlers/SMTPDirect.py", line 61, in sendmail
    results = self.__conn.sendmail(envsender, recips, msgtext)
  File "/usr/src/build/143041-i386/install/usr/lib/python2.2/smtplib.py", line 649, in sendmail
    (code,resp) = self.data(msg)
  File "/usr/src/build/143041-i386/install/usr/lib/python2.2/smtplib.py", line 452, in data
    (code,msg)=self.getreply()
  File "/usr/src/build/143041-i386/install/usr/lib/python2.2/smtplib.py", line 325, in getreply
    line = self.file.readline()
IOError: [Errno 4] Interrupted system call
Jan 06 11:11:45 2003 (13289) SHUNTING: 1041867350.917309+1f446e9ad4c55f50183ff7a818a95c53a2ccf1ad

Just a stab in the dark... Is it communicating with the SMTP server? It looks like it's getting a connection refused to sendmail... maybe?
- David Gibbs (david@midrange.com) wrote:
| Matthew Davis /\ http://dogpound.vnet.net/ |
|--------------------------------------------|
| Monday, January 06, 2003 / 11:07PM |
Ability is a good thing but stability is even better.

"DG" == David Gibbs <david@midrange.com> writes:
DG> Any thoughts on this?
| IOError: [Errno 4] Interrupted system call

Just that something's wrong with your mail server, or the connection between it and Mailman, or something sent the qrunner process a signal while it was in the middle of talking to your mail server.

Try this patch; it won't avoid the interrupted system call, but it ought to handle the situation more gracefully (assuming it's a transient problem). Untested, but let me know if it works for you.

-Barry

-------------------- snip snip --------------------
Index: SMTPDirect.py
===================================================================
RCS file: /cvsroot/mailman/mailman/Mailman/Handlers/SMTPDirect.py,v
retrieving revision 2.25
diff -u -r2.25 SMTPDirect.py
--- SMTPDirect.py 6 Nov 2002 04:43:54 -0000 2.25
+++ SMTPDirect.py 7 Jan 2003 05:35:24 -0000
@@ -337,7 +337,7 @@
         refused = e.recipients
     # MTA not responding, or other socket problems, or any other kind of
     # SMTPException. In that case, nothing got delivered
-    except (socket.error, smtplib.SMTPException), e:
+    except (socket.error, smtplib.SMTPException, IOError), e:
         # BAW: should this be configurable?
         syslog('smtp', 'All recipients refused: %s', e)
         # If the exception had an associated error code, use it, otherwise,
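
For readers on current Python, the idea of treating an interrupted system call as a transient failure can be sketched roughly as below. This is an illustration only, not Mailman's code and not the patch above: the patch logs the error and defers delivery, whereas this sketch retries immediately. The names `send_with_retry` and `send` are hypothetical; `send` stands in for whatever callable performs the SMTP delivery. Python 3.5 and later already retry most interrupted system calls automatically, so an explicit loop like this mainly matters on older interpreters.

```python
import errno


def send_with_retry(send, attempts=3):
    """Call send() and retry if it fails with EINTR.

    `send` is a hypothetical zero-argument callable standing in for the
    SMTP delivery call; it is not part of Mailman or smtplib.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return send()
        except OSError as e:
            # In Python 3, IOError and socket.error are aliases of OSError.
            if e.errno != errno.EINTR:
                raise  # anything other than an interrupted call is not retried
            last_error = e
    # Every attempt was interrupted; surface the last error to the caller.
    raise last_error
```

Any other error propagates unchanged, so the caller can still log it and requeue the message the way the patched handler does.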

barry@python.org said:
I kind of figured it was something like that ... although MM 2.0.13 never encountered problems like this (that I knew of). I did, however, recently upgrade to Red Hat 8.0.
I have noticed that, on occasion, I have a sendmail task that will sit for extended periods of time in 'data' mode ... the PS entry looks like this:
31445 ? S 0:02 sendmail: h07Knsb2031445 localhost [127.0.0.1]: data
When I looked at the mail queue files associated with id "h07Knsb2031445", they were definitely associated with the mailing list.
I'll give it a try ... but even after I applied the patch, I saw the hung task again.
Thanks!
david
-- "Who said I couldn't have it all?" -B Gates

"David Gibbs" <david@midrange.com> wrote in message news:30124.208.248.38.130.1041973126.squirrel@webmail.midrange.com...
I've tweaked & tuned a bit and I think I've got it working well.
Had to create another sendmail instance that didn't canonify and was queueonly, but I haven't had a similar crash in a day or so.
Thanks to all who directly, or indirectly, helped.
david
