[Mailman-Users] Stuck OutgoingRunner
Yasuhito FUTATSUKI
futatuki at poem.co.jp
Tue Feb 6 22:43:18 EST 2018
On 02/07/18 01:01, Mark Sapiro wrote:
> On 02/06/2018 03:51 AM, Sebastian Hagedorn wrote:
>>
>> --On 4 February 2018 at 12:54:43 +0900 Yasuhito FUTATSUKI
>> <futatuki at poem.co.jp> wrote:
>>>
>>> As far as I can tell from the code, if OutgoingRunner catches SIGINT while
>>> waiting for a response from the MTA, the SIGINT handler in qrunner sets a
>>> flag to exit the loop, and the socket module then raises socket.error for
>>> EINTR. However, the SMTP module retries the read and keeps waiting until it
>>> receives a response or the connection is closed (by the MTA or by an error).
>>> So the code that exits is never reached if the connection stays alive and
>>> the MTA sends no data.
I'm sorry, the above is partly wrong: it is not the smtplib.SMTP object that
keeps reading, but the socket module itself (on Python 2.7.14,
socket._fileobject.readline()). This does not affect the main point, though.
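To illustrate the effect described here, the following is a minimal sketch, not Mailman's code. It assumes a Unix system and Python 3.5 or later, where the interpreter itself retries an interrupted call (PEP 475); that is a different mechanism from the Python 2.7 readline() loop, but the visible result is similar. SIGALRM stands in for the SIGINT sent by the master, and the names stop_requested and on_signal are made up for the example. The handler only sets a flag, so the blocked recv() is resumed and returns only because a 1 second timeout was set on the socket.

```python
import signal
import socket
import time

stop_requested = False

def on_signal(signum, frame):
    # Like the handler described above: only record the request to stop.
    global stop_requested
    stop_requested = True

signal.signal(signal.SIGALRM, on_signal)

# A connected pair whose peer never writes, standing in for a silent MTA.
client, peer = socket.socketpair()
client.settimeout(1.0)  # without a timeout, recv() below would block forever

signal.setitimer(signal.ITIMER_REAL, 0.2)  # deliver the signal during recv()
start = time.time()
try:
    client.recv(1024)
    outcome = "data"
except socket.timeout:
    outcome = "timeout"
elapsed = time.time() - start

# The flag is set, but recv() was not abandoned when the signal arrived.
print(stop_requested, outcome)
client.close()
peer.close()
```

The signal arrives after about 0.2 seconds, yet the read ends only when the socket timeout expires. Without the timeout, the loop that checks the flag would never run again.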
>> Thanks. I think that might be a possible explanation, but what could
>> cause a SIGINT to be sent to the OutgoingRunner?
>
>
> The above is an explanation of why the runner doesn't exit when it
> receives a SIGINT or SIGTERM from the master when you restart or stop
> Mailman and why you have to SIGKILL it. It suggests that what's
> happening when it's hung is it's waiting for a response from the MTA.
Thank you for explaining what I meant.
In fact,
On 02/02/18 19:26, Sebastian Hagedorn wrote:
> [root at mailman3/usr/lib/mailman/bin]$ strace -p 1677
> Process 1677 attached
> recvfrom(10, ^CProcess 1677 detached
indicates that the OutgoingRunner process 1677 was still in the recvfrom(2)
system call (perhaps called via recv(2)) for FD 10, and
> [root at mailman3/usr/lib/mailman/bin]$ lsof -p 1677
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> python2.7 1677 mailman cwd DIR 253,0 4096 173998 /usr/lib/mailman
> python2.7 1677 mailman rtd DIR 253,0 4096 2 /
> ...
> python2.7 1677 mailman 10u IPv6 46441320 0t0 TCP mailman3.rrz.uni-koeln.de:55764->smtp-out.rrz.uni-koeln.de:smtp (ESTABLISHED)
indicates that its FD 10 was an ESTABLISHED connection to the MTA.
If the MTA hangs (or makes very slow progress) at the application layer
while the TCP connection is kept alive at the lower layer, a client that uses
smtplib without specifying a timeout, like the current SMTPDirect handler in
Mailman, must wait until a response arrives or the MTA dies.
Unfortunately, smtplib in Python 2 before 2.6 has no way to specify a
timeout. It uses a socket in blocking mode unless a default timeout is set
with socket.setdefaulttimeout() before calling smtplib.SMTP.connect().
In Python 2.6 and above, the timeout can be specified when creating the
smtplib.SMTP object.
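As a rough sketch of the timeout argument (this is not the SMTPDirect code, and the 1 second value is arbitrary), the example below connects to a local socket that completes the TCP handshake but never sends an SMTP greeting, which imitates a hung MTA. The helper name greet and the listening socket are made up for the illustration. With a timeout the constructor gives up instead of waiting forever; on versions without the argument, calling socket.setdefaulttimeout() beforehand would play the same role.

```python
import smtplib
import socket

# Stand-in for a hung MTA: the kernel completes the TCP handshake for queued
# connections, but nothing ever accepts them or sends a greeting.
silent = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent.bind(("127.0.0.1", 0))
silent.listen(1)
port = silent.getsockname()[1]

def greet(host, port, timeout):
    """Return 'connected', or a short description of why we gave up."""
    try:
        conn = smtplib.SMTP(host, port, timeout=timeout)
    except (socket.timeout, smtplib.SMTPServerDisconnected) as exc:
        # smtplib reports a read timeout on the greeting as a disconnect.
        return "gave up: %s" % exc
    conn.quit()
    return "connected"

result = greet("127.0.0.1", port, timeout=1.0)
print(result)
silent.close()
```

Both exception types are caught because smtplib wraps socket errors raised while reading the reply in SMTPServerDisconnected.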
--
Yasuhito FUTATSUKI <futatuki at poem.co.jp>