[Neil and Vladimir say a threshold of 5000 is too high!] [Jeremy says a threshold of 100 is too low!] [merriment ensues]
... Any suggestions about what a more reasonable value would be and why it is reasonable?
Jeremy
There's not going to be consensus on this, as the threshold is a crude handle on a complex problem. That's sure better than *no* handle, but trash behavior is so app-specific that there simply won't be a killer argument. In cases like this, the geometric mean of the extreme positions is always the best guess <0.8 wink>:
import math math.sqrt(5000 * 100) 707.10678118654755
So 9 times out of 10 we can run it with a threshold of 707, and 1 out of 10 with 708 <wink>. Tuning strategies for gc *can* get as complex as OS scheduling algorithms, and for the same reasons: you're in the business of predicting the future based on just a few neurons keeping track of gross summaries of what happened before. A program can go through many phases of quite different behavior over its life (like I/O-bound vs compute-bound, or cycle-happy vs not), and at the phase boundaries past behavior is worse than irrelevant (it's actively misleading). So call it 700 for now. Or 1000. It's a bad guess at a crude heuristic regardless, and if we avoid extreme positions we'll probably avoid doing as much harm as we *could* do <0.9 wink>. Over time, a more interesting measure may be how much cyclic trash collections actually recover, and then collect less often the less trash we're finding (ditto more often when we're finding more). Another is like that, except replace "trash" with "cycles (whether trash or not)". The gross weakness of "net container allocations" is that it doesn't directly measure what this system was created to do. These things *always* wind up with dynamic measures, because static ones are just too crude across apps. Then the dynamic measures fail at phase boundaries too, and more gimmicks are added to compensate for that. Etc. Over time it will get better for most apps most of the time. For now, we want *both* to exercise the code in the field and not waste too much time, so hasty compromise is good for the beta. let-a-thousand-thresholds-bloom-ly y'rs - tim