On Sun, Jun 27, 2010 at 8:33 AM, Guido van Rossum <guido@python.org> wrote:
On Sat, Jun 26, 2010 at 4:45 PM, M.-A. Lemburg <mal@egenix.com> wrote:
Also note that garbage collection will not necessarily do what the user expects: it is quite possible that large amounts of memory will stay allocated as unused space in pymalloc. This is not specific to the discussed case, but still a valid user concern. Greg Hazel observed this situation in his example.
Aha. So although the process size ballooned, there is no actual memory leak (his example threw away the exception each time through the loop); it's just that looking at process size is a bad way to assess memory leaks. I would like to reject this then as "that's just how Python's memory allocation works". As you say, it's not specific to this case; it comes up occasionally and it's just a matter of user education.
Leak? My example does not try to demonstrate a leak. It demonstrates excessive allocation. If you collect a few times after the test the memory usage of the process does drop to a reasonable level again. In a real-world application with long-lived traceback objects and more state, this excessive allocation becomes crippling. Go ahead, add a zero to the size of that list being created in the example. Without the traceback reference the process stays stable at 17MB, with the reference it balloons to consume all of the 2GB of RAM in my laptop, causing swapping. This is similar to the observed behavior of a real application, which is completely stable and requires relatively little memory when not using traceback objects, but quickly grows to an unmanageable size with traceback objects.
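A minimal sketch of the mechanism described above, assuming CPython 3 and its reference counting. It is not the original example: the names `Payload`, `fail_with_big_local` and `run_once` are invented for illustration, and a weak reference stands in for watching process size. It shows that a large local in the failing frame stays alive for as long as the traceback is referenced.

```python
import gc
import weakref


class Payload(list):
    """A list subclass, used only because plain lists cannot be weakly referenced."""


def fail_with_big_local():
    # Stand-in for the large list created in the example (hypothetical size).
    data = Payload(range(100_000))
    # Record a weak reference so we can observe when `data` is actually freed.
    fail_with_big_local.probe = weakref.ref(data)
    raise ValueError("boom")


def run_once(keep_traceback):
    """Call the failing function; optionally hand back the exception's traceback."""
    try:
        fail_with_big_local()
    except ValueError as exc:
        # The traceback references the frames, and the frames reference their locals.
        return exc.__traceback__ if keep_traceback else None


if __name__ == "__main__":
    held = run_once(keep_traceback=True)
    print("payload alive while traceback is held:",
          fail_with_big_local.probe() is not None)

    del held
    gc.collect()  # also covers the case where a reference cycle is involved
    print("payload alive after dropping the traceback:",
          fail_with_big_local.probe() is not None)

    run_once(keep_traceback=False)
    print("payload alive when the traceback was never kept:",
          fail_with_big_local.probe() is not None)
```

With many long-lived tracebacks, each one pins every local in every frame it passes through, which is the growth pattern reported above.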
I don't think anything should be done about __traceback__ either -- frameworks that have this problem can work around it in various ways. Or, at least I don't see a reason to panic and roll back the feature. Maybe eventually it can be improved by adding some kind of functionality to control some details of the behavior.
This idea is about an improvement to control some details of the behavior. Keeping __traceback__ in more cases would be nothing to "panic" about, if tracebacks were not such "unsafe" objects. I have not yet seen any way for a framework to work around the references issue without discarding the traceback object entirely and losing the ability to re-raise.
-Greg
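To illustrate the trade-off mentioned above, here is a hedged sketch of the "discard the traceback entirely" workaround, assuming CPython 3. The helper names (`Payload`, `fail_with_big_local`, `capture_as_text`) are hypothetical. Only a formatted string is kept, so the frames and their locals can be freed, but nothing remains that could be re-raised with the original traceback.

```python
import gc
import traceback
import weakref


class Payload(list):
    """A list subclass, used only because plain lists cannot be weakly referenced."""


def fail_with_big_local():
    data = Payload(range(100_000))  # stand-in for expensive per-frame state
    fail_with_big_local.probe = weakref.ref(data)  # observe when `data` is freed
    raise ValueError("boom")


def capture_as_text():
    """Keep a formatted copy of the traceback and let the real object go."""
    try:
        fail_with_big_local()
    except ValueError as exc:
        # format_exception returns a list of strings; no frame is retained by it.
        lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
        return "".join(lines)


if __name__ == "__main__":
    report = capture_as_text()
    gc.collect()
    print(report)
    print("payload still alive:", fail_with_big_local.probe() is not None)
```

The string is enough for logging, but a framework that wants to re-raise later with the original stack has nothing left to attach.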