[Twisted-Python] Circular references in TLSMemoryBIOProtocol
Hello, I have the Twisted app that serves tons of short-lived TLS connections using TLSMemoryBIOFactory. I usually set loosened garbage collector thresholds in production environment for the sake of performance. But I've noticed that this app's RAM usage quickly grows up to unreasonable values. Digging into the issue using pdb and objgraph showed that protocol instances are still living long after they were closed. I found two circular dependencies which are created for each TLS connection: 1. Between twisted.protocols.policies.ProtocolWrapper and its self.wrappedProtocol 2. Between twisted.protocols.tls.TLSMemoryBIOProtocol and its self._tlsConnection Both of them cause protocol instance to not be deleted when the connection is closed. So all OpenSSL-related objects and all business-related data attached to that protocol instance are still living untill the next GC collection. This affects both RAM usage and performance (due to much more often GC collections) I've tried to fix both circular dependencies: replaced https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/policies... by self.wrappedProtocol.makeConnection(weakref.proxy(self)) and replaced https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/tls.py#L... by: self._tlsConnection = self.factory._createConnection(weakref.proxy(self)) Memory usage pattern changed drastically after this change. I've created demo script that makes 10k TLS loopback connections with GC disabled and measures the number of objects are still living after the work is done and total resident RAM consumption: https://gist.github.com/IlyaSkriblovsky/4dd3abfd5f67c64b13f1c673f56466f9 Output without the fix: N = 10000 , K = 100 objects before 50136 DummyServerProtocols still living 10000 objects after 439919 mem 778 mb Output with the fix: N = 10000 , K = 100 objects before 50133 DummyServerProtocols still living 0 objects after 159919 mem 96 mb So using weakrefs makes all protocol instances and instances of TLSMemoryBIOProtocol to be deleted right after a connection is closed. Less circular-dependent objects → less GC invocations → better performance. And I see much nicer RAM usage pattern in my app. Is it possible to fix circular deps in some more clean way? Can this be solved at all while user's code is able to try to touch both sides of circular dep after connection is closed? Please advice Thanks for consideration Best regards, Ilya
Hi Ilya, On January 17, 2018 at 3:09:52 PM, Ilya Skriblovsky (ilyaskriblovsky@gmail.com) wrote: [Trimmed for context] So using weakrefs makes all protocol instances and instances of TLSMemoryBIOProtocol to be deleted right after a connection is closed. Less circular-dependent objects → less GC invocations → better performance. And I see much nicer RAM usage pattern in my app. Is it possible to fix circular deps in some more clean way? Can this be solved at all while user's code is able to try to touch both sides of circular dep after connection is closed? Please advice Personally, I don’t mind the weaker approach, but if you wanted to be completely explicit, I’d look at modifying the connectionLost method of both the protocol and the protocol wrapper to break circular references. Thanks for consideration Best regards, Ilya Hope this helps, Daniel -- L. Daniel Burr ldanielburr@me.com (312) 656-8387
This sounds like an issue that should be reported as a bug and fixed! It would be great if you could come up with a performance regression test or benchmark which could validate that this doesn't regress, but, it's quite challenging to do this (especially for memory issues) so as long as it's adequately behaviorally tested I'm sure we could accept something.
Hooray!
Is it possible to fix circular deps in some more clean way? Can this be solved at all while user's code is able to try to touch both sides of circular dep after connection is closed? Please advice
Protocols and transports have a fairly defined lifecycle, and as L. Daniel Burr already pointed out, it would probably be appropriate to explicitly break these reference cycles in connectionLost. -g
Explicitly breaking cycle in ProtocolWrapper.connectionLost by any of: • self.wrappedProtocol = None • self.wrappedProtocol.transport = None • self.wrappedProtocol = weakref.proxy(self.wrappedProtocol) • self.wrappedProtocol.transport = weakref.proxy(self) ... breaks some existing tests :( Seems like these tests do some post-run checks against protocol instances and their transports. Not sure whether it is relevant to real-life usage. Will investigate more... - Ilya чт, 18 янв. 2018 г. в 0:09, Ilya Skriblovsky <ilyaskriblovsky@gmail.com>:
Yes, doing it only for TLSMemoryBIOProtocol fails test too :( SSL-related seem to be touching both ends of this reference cycle after connectionLost: 1. twisted/test/test_sslverify.py:2102 self.assertEqual(sProto.wrappedProtocol.data, b'') This one touches `wrappedProtocol` 2. twisted/test/proto_helpers.py:924 (waitUntilAllDisconnected, used by twisted.web.test.test_webclient.WebClientSSLTests, for example) if not True in [x.transport is not None and x.transport.connected for x in protocols]: and this one touches `transport` field There are other examples as well. Sure, these test failures can probably be fixed by changing tests themselves, for example by making them to hold their own references to both wrapping and wrapped protocols. But I'm not sure this wouldn't break any user's code too... For my app it was easily fixed by breaking cycle in my protocol's connectionLost. But I'm not experienced enough in Twisted internals to be sure doing it inside TLSMemoryBIOProtocol wouldn't break any real-world usage scenarios. - Ilya сб, 20 янв. 2018 г. в 9:10, Glyph <glyph@twistedmatrix.com>:
On Jan 20, 2018, at 9:32 AM, Ilya Skriblovsky <ilyaskriblovsky@gmail.com> wrote:
I think that this is worth trying, at least. If you could write a PR that fixes the tests, you might want to try following the exception process documented in https://twistedmatrix.com/documents/current/core/development/policy/compatib... <https://twistedmatrix.com/documents/current/core/development/policy/compatib...> and see if anyone has any code that might break. I'm pretty sure that the direction to break the cycle in is to break the reference to .wrappedProtocol, and not to mess with .wrappedProtocol.transport (which is not really something that should be touched from the outside of the wrapped protocol). -glyph
I've created the pull request with breaking cycles in connectionLost, please consider: https://github.com/twisted/twisted/pull/955 This change seems to fit well to the reasoning of Compatibility Exception process. Should I create new thread in the mailing list with "INCOMPATIBLE CHANGE" in a subject? вс, 21 янв. 2018 г. в 12:51, Glyph <glyph@twistedmatrix.com>:
Never mind, I realized I didn't some steps 10+ from The Manual http://twistedmatrix.com/trac/wiki/TwistedDevelopment#SubmittingaPatch Will fix that пн, 29 янв. 2018 г., 16:52 Ilya Skriblovsky <ilyaskriblovsky@gmail.com>:
Just wanted to make sure, did I all what I should do for putting this ticket into review: https://twistedmatrix.com/trac/ticket/9374 ? Should I just wait for maintainers to review it? Thanks вт, 30 янв. 2018 г. в 6:28, Glyph <glyph@twistedmatrix.com>:
On Mar 9, 2018, at 4:03 AM, Ilya Skriblovsky <ilyaskriblovsky@gmail.com> wrote:
Just wanted to make sure, did I all what I should do for putting this ticket into review: https://twistedmatrix.com/trac/ticket/9374 <https://twistedmatrix.com/trac/ticket/9374> ? Should I just wait for maintainers to review it?
You've done all the right things. :) Twisted maintainers go through the list at https://twisted.reviews <https://twisted.reviews/> and review things, and you can indeed see that your change is on that list. Unfortunately, as you can see, there's quite a backlog. Personally I have been quite busy with a new child and a new startup, so (as previously discussed on this very mailing list) I haven't been doing much in the way of code review myself recently. Even what little open source time I have needs to be focused elsewhere at the moment. The best thing you can do to accelerate your own change getting reviewed is to code review others' changes, so that when a reviewer arrives to stochastically select something to review it is more likely that they'll select your thing instead of one of the other things :-). The second best thing you can do is to donate a lot of money using the form on the front page of the web site: https://twistedmatrix.com/trac/#DonatetoTwisted <https://twistedmatrix.com/trac/#DonatetoTwisted>, which, at some point, will allow us to re-start the Twisted fellowship program http://labs.twistedmatrix.com/2015/06/twisted-fellowship-2015-call-for.html <http://labs.twistedmatrix.com/2015/06/twisted-fellowship-2015-call-for.html> and have someone actually keep the review queue clear as their actual paid responsibility :). -g
On 10 March 2018 at 04:57, Glyph <glyph@twistedmatrix.com> wrote:
I went for an initial review.
+1 I am thinking of adding this to the wiki page for the review process. I think that we can try and see how it goes. Doing a review is not easy, you end up with the responsibility for the approved code, but with the size of the queue I think that we should try.
-g
+1 on this. I think that it might work even if we get someone to work on this 1 hour per day. Glyph, do you know what is the required money we need to raise to start this program... and are the current money on bank now? Thanks! -- Adi Roiban
participants (4)
-
Adi Roiban
-
Glyph
-
Ilya Skriblovsky
-
L. Daniel Burr