On Mon, Sep 18, 2017 at 10:59 AM, Antoine Pitrou <antoine@python.org> wrote:
On 18/09/2017 at 19:53, Nathaniel Smith wrote:
Why are reference cycles a problem that needs solving?
Because sometimes they are holding up costly resources in memory when people don't expect them to. Such as large Numpy arrays :-)
Do we have any reason to believe that this is actually happening on a regular basis though?
Define "regular" :-) We did get some reports on dask/distributed about it.
Caused by uncollected cycles involving tracebacks? I looked here: https://github.com/dask/distributed/issues?utf8=%E2%9C%93&q=is%3Aissue%20memory%20leak and saw some issues with cycles causing delayed collection (e.g. #956) or the classic memory leak problem of explicitly holding onto data you don't need any more (e.g. #1209, bpo-29861), but nothing involving traceback cycles. It was just a quick skim though.
If it is then it might make sense to look at the cycle collection heuristics; IIRC they're based on a fairly naive count of how many allocations have been made, without regard to their size.
Yes... But just because a lot of memory has been allocated isn't a good enough heuristic to launch a GC collection.
I'm not an expert on GC at all, but intuitively it sure seems like allocation size might be a useful piece of information to feed into a heuristic. Our current heuristic is just, run a small collection after every 700 allocations, run a larger collection after 10 smaller collections.
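For anyone who wants to poke at the heuristic described above, here is a minimal sketch using only the standard `gc` module. It assumes a regular (non-free-threaded) CPython build; the threshold values differ between versions, so the code prints them rather than assuming 700/10/10. The variable names (`big_delta`, `small_delta`) are made up for illustration. The point is that the youngest-generation counter moves with the number of tracked container objects, not with the number of bytes allocated.

```python
import gc

# Thresholds for the three generations; the exact numbers depend on the
# CPython version, so we only inspect them here.
thresholds = gc.get_threshold()
print("thresholds:", thresholds)

gc.disable()  # keep automatic collections from resetting the counters
try:
    before = gc.get_count()[0]

    # One large allocation: a bytes object is not a container, so the
    # cycle collector does not track it at all.
    big = bytes(10_000_000)
    after_big = gc.get_count()[0]

    # Many tiny allocations: each list is a tracked container object.
    small = [[] for _ in range(1000)]
    after_small = gc.get_count()[0]
finally:
    gc.enable()

big_delta = after_big - before
small_delta = after_small - after_big
print("counter change for one 10 MB object:", big_delta)
print("counter change for 1000 empty lists:", small_delta)
```

On the builds this sketch assumes, the 10 MB allocation barely moves the counter while the empty lists move it a lot, which is the "without regard to their size" behaviour mentioned earlier in the thread.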
What if that memory is gonna stay allocated for a long time? Then you're frequently launching GC runs for no tangible result except more CPU consumption and frequent pauses.
Every heuristic has problematic cases, that's why we call it a heuristic :-). But somehow every other GC language manages to do well-enough without refcounting... I think they mostly have more sophisticated heuristics than CPython, though. Off the top of my head, I know PyPy's heuristic involves the ratio of the size of nursery objects versus the size of the heap, and JVMs do much cleverer things like auto-tuning nursery size to make empirical pause times match some target.
Perhaps we could special-case tracebacks somehow, flag when a traceback remains alive after the implicit "del" clause at the end of an "except" block, then maintain some kind of linked list of the flagged tracebacks and launch specialized GC runs to find cycles across that collection. That sounds quite involved, though.
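To make the traceback case concrete, here is a small sketch of how an exception that outlives its "except" block can pin a frame and everything in it. `Payload` and `fail_and_keep_exception` are hypothetical names; `Payload` just stands in for something costly like a large array. The cycle is frame -> `saved` -> traceback -> frame, so plain refcounting never frees it and only a cycle collection does.

```python
import gc
import weakref


class Payload:
    """Stand-in for a costly resource such as a large array."""


def fail_and_keep_exception():
    payload = Payload()
    ref = weakref.ref(payload)
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # `exc` itself is deleted at the end of this block, but `saved`
        # keeps the exception (and its traceback) alive in this frame.
        saved = exc
    return ref


gc.disable()  # make sure only our explicit collection runs
try:
    ref = fail_and_keep_exception()
    alive_before = ref() is not None  # the function returned, yet payload lives on
    gc.collect()
    alive_after = ref() is not None   # the cycle collector finally frees it
finally:
    gc.enable()

print("payload alive after return:", alive_before)
print("payload alive after gc.collect():", alive_after)
```

Dropping the `saved = exc` line (or adding `del saved` before returning) breaks the cycle, and the payload is then freed as soon as the function returns.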
We already keep a list of recently allocated objects and have a specialized GC that runs across just that collection. That's what generational GC is :-).

-n

--
Nathaniel J. Smith -- https://vorpus.org
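As a rough illustration of the generational point, the sketch below asks the collector to look only at the youngest generation with `gc.collect(0)`. It assumes CPython's classic generational collector; `Node` and `make_cycle` are hypothetical helpers. A freshly created garbage cycle is found by the young-only pass, while a cycle that already survived a full collection is expected to wait for a later, larger pass (how exactly older objects are handled varies between CPython versions, so that part is only printed).

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def make_cycle():
    """Return one node of a two-node reference cycle."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return a


gc.disable()  # only run the collections we ask for explicitly
try:
    gc.collect()                  # start from a clean slate

    old = make_cycle()
    old_ref = weakref.ref(old)
    gc.collect()                  # `old` survives and moves to an older generation
    del old                       # now garbage, but no longer "recently allocated"

    young_ref = weakref.ref(make_cycle())  # garbage cycle in the youngest generation
    gc.collect(0)                 # collect the youngest generation only
    young_collected = young_ref() is None
    old_survived_young_pass = old_ref() is not None

    gc.collect()                  # full collection picks up the old cycle too
    old_collected = old_ref() is None
finally:
    gc.enable()

print("young cycle freed by gc.collect(0):", young_collected)
print("old cycle still alive after gc.collect(0):", old_survived_young_pass)
print("old cycle freed by full collection:", old_collected)
```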