
On Apr 9, 2021, at 4:07 PM, Richard van der Hoff <richard@matrix.org> wrote:
Hi folks,
I've been investigating https://github.com/matrix-org/synapse/issues/9566, which amounts to: "when there is a TLS error connecting to the SMTP server, the resultant exception is unreadable".
I think I've traced the problem to the fact that SMTPSenderFactory.clientConnectionFailed is being called with an unhelpful ConnectionAborted rather than anything more descriptive.
I've then reproduced that with a simpler test case: see https://gist.github.com/richvdh/909761ff5dab23f0873eeddd7936a740. As you can see, the output is: "Factory lost connection. Reason: Connection was aborted locally using ITCPTransport.abortConnection."
This seems to be thanks to TLSMemoryBIOProtocol.failVerification, which stashes the error and calls abortConnection(): https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/tls.py#L....
At this point I'm struggling. Is the SMTP code holding the Factory wrong? Or is it reasonable to expect the verification error to propagate into clientConnectionFailed - in which case, how could this work?
Thanks for your thoughts!
Hi Richard, Sorry for the delayed response here. This is a bug in Twisted, and I think it boils down to this line: https://github.com/twisted/twisted/blob/3c868ac11786eef7ea269caa3056f0085412... <https://github.com/twisted/twisted/blob/3c868ac11786eef7ea269caa3056f0085412...> The code as written here appears to be expecting this sequence: failVerification is called with a reason containing a helpful OpenSSL verification error we save that reason as `self._reason` for reporting later, calling abortConnection() since the connection got aborted, we expect our underlying transport to call loseConnection on us we will then get a falsey reason [?!?!] and as such we will use self._reason instead Assumption 4 is nonsense and has never been true within Twisted as far as I know; connectionLost always gets a Failure, Failures are never falsey, so we will never use self._reason. To fix this we need to actually honor self._reason under the conditions we expect, i.e. it's a ConnectionAborted https://github.com/twisted/twisted/blob/3c868ac11786eef7ea269caa3056f0085412... <https://github.com/twisted/twisted/blob/3c868ac11786eef7ea269caa3056f0085412...> Does this make sense? (Will you be able to file / fix this bug?) Thanks for the report, -g