
The following program:

    import rexec
    while 1:
        x = rexec.RExec()
        del x

leaks memory at a fantastic rate.

It seems clear (?) that this is due to the call to "set_rexec" at rexec.py:140, which creates a circular reference between the `rexec' and `hooks' objects. (There's even a nice comment to that effect.)

I'm curious, however, as to why the spiffy new cyclic-garbage collector doesn't pick this up?

Just-wondering-ly y'rs,
cgw
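For readers on a current interpreter: rexec was removed in Python 3, so the report above cannot be rerun as written. The following is only a minimal sketch of the kind of two-object cycle being described. `Sandbox` and `Hooks` are hypothetical stand-ins, not the real rexec classes, and the behaviour shown is that of a modern CPython collector, not the one from 2000.

```python
import gc
import weakref


class Hooks:
    """Hypothetical stand-in for the hooks object; not the real rexec class."""

    def __init__(self):
        self.rexec = None

    def set_rexec(self, rexec):
        # Back-reference that closes the cycle: hooks -> sandbox.
        self.rexec = rexec


class Sandbox:
    """Hypothetical stand-in for rexec.RExec."""

    def __init__(self):
        self.hooks = Hooks()         # sandbox -> hooks
        self.hooks.set_rexec(self)   # hooks -> sandbox


def cycle_demo():
    """Return (alive_after_del, alive_after_collect) for one Sandbox."""
    was_enabled = gc.isenabled()
    gc.disable()                     # keep automatic collections out of the picture
    try:
        x = Sandbox()
        ref = weakref.ref(x)
        del x                        # the cycle keeps both reference counts above zero
        alive_after_del = ref() is not None
        gc.collect()                 # explicit run of the cycle detector
        alive_after_collect = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect


if __name__ == "__main__":
    print(cycle_demo())
```

If the collector can see every object in the cycle, the weak reference goes dead after `gc.collect()`. A cycle that runs through an object type the collector does not track would not be found this way.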

Me too. I turned on gc debugging (gc.set_debug(077) :-) and got messages suggesting that it is not collecting everything. The output looks like this:

    . . .
    gc: collecting generation 0...
    gc: objects in each generation: 764 6726 89174
    gc: done.
    gc: collecting generation 1...
    gc: objects in each generation: 0 8179 89174
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 0 97235
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 757 747 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 1386 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 757 2082 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 2721 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 757 3417 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 4056 97184
    gc: done.
    . . .

With the third number growing each time a "generation 1" collection is done. Maybe Neil can shed some light? The gc.garbage list is empty.

This is about as much as I know about the GC stuff...

--Guido van Rossum (home page: http://www.python.org/~guido/)
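A rough modern equivalent of this kind of inspection might look like the sketch below. It assumes Python 3, where the octal literal would be spelled `0o77` and named flags such as `gc.DEBUG_STATS` are the usual way to request statistics. `Node` and `sample_gc` are hypothetical names used only for illustration.

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a one-element cycle."""

    def __init__(self):
        self.me = self


def sample_gc(n=1000):
    """Create n cyclic objects, then report what the collector sees."""
    was_enabled = gc.isenabled()
    gc.disable()                        # no automatic collections during the loop
    try:
        gc.collect()                    # start from a clean slate
        for _ in range(n):
            Node()                      # unreachable right away, but cyclic
        counts = gc.get_count()         # one counter per generation
        found = gc.collect()            # number of unreachable objects found
    finally:
        if was_enabled:
            gc.enable()
    return counts, found, len(gc.garbage)


if __name__ == "__main__":
    # Uncomment to get per-collection statistics on stderr:
    # gc.set_debug(gc.DEBUG_STATS)
    print(sample_gc())
```

The return value of `gc.collect()` gives a direct count of what was found unreachable, which is easier to compare across runs than reading the per-generation totals by eye.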

On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote:
Line 140 is not the only place a circular reference is created. There is another one which is trickier to find:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        m.__builtins__ = self.modules['__builtin__']
        return m

If the module being added is __builtin__ then m.__builtins__ = m. The GC currently doesn't track modules. I guess it should. It might be possible to avoid this circular reference but I don't know enough about how RExec works. Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m

do the trick?

Neil
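The self-reference and the proposed guard can be reproduced in isolation. This is a sketch under stated assumptions: `MiniSandbox` is a hypothetical, heavily reduced stand-in for RExec, `types.ModuleType` replaces `self.hooks.new_module`, `in` replaces `has_key`, and the `guard` flag is an invention that switches between the two versions of `add_module` shown above.

```python
import gc
import types


class MiniSandbox:
    """Hypothetical, heavily reduced stand-in for rexec.RExec."""

    def __init__(self, guard):
        self.guard = guard              # True selects the proposed fix
        self.modules = {}
        self.add_module("__builtin__")  # must exist before any other module

    def add_module(self, mname):
        if mname in self.modules:
            return self.modules[mname]
        self.modules[mname] = m = types.ModuleType(mname)
        # Without the guard, '__builtin__' is made to point at itself.
        if not (self.guard and mname == "__builtin__"):
            m.__builtins__ = self.modules["__builtin__"]
        return m


if __name__ == "__main__":
    plain = MiniSandbox(guard=False).modules["__builtin__"]
    fixed = MiniSandbox(guard=True).modules["__builtin__"]
    print(plain.__builtins__ is plain)      # self-reference present
    print(hasattr(fixed, "__builtins__"))   # attribute never set
    print(gc.is_tracked(plain))             # whether this interpreter's GC tracks modules
```

`gc.is_tracked` reports whether the running interpreter's collector tracks a given object, which is the property the message above says modules lacked at the time.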

That's certainly a good thing to do (__builtin__ has no business having a __builtins__!), but (in my feeble experiment) it doesn't make the leaks go away.

Note that almost every module participates heavily in cycles: whenever you define a function f(), f.func_globals is the module's __dict__, which also contains a reference to f. Similar for classes, with an extra hop via the class object and its __dict__.

--Guido van Rossum (home page: http://www.python.org/~guido/)
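The function/module cycle described here can be checked directly. The sketch below assumes Python 3, where `func_globals` is spelled `__globals__`. The module name `demo`, the function `f`, and the helper `function_module_cycle` are hypothetical.

```python
import gc
import types
import weakref

SOURCE = "def f():\n    return 42\n"


def function_module_cycle():
    """Return (shares_dict, dict_holds_f, collected) for a throwaway module."""
    mod = types.ModuleType("demo")       # hypothetical module name
    exec(SOURCE, mod.__dict__)           # defines f inside the module's namespace
    f = mod.f
    shares_dict = f.__globals__ is mod.__dict__   # function -> module dict
    dict_holds_f = mod.__dict__["f"] is f         # module dict -> function
    ref = weakref.ref(f)
    del f, mod                           # drop every outside reference
    gc.collect()                         # let the cycle detector run
    return shares_dict, dict_holds_f, ref() is None


if __name__ == "__main__":
    print(function_module_cycle())
```

Both directions of the cycle are visible as plain identity checks, and on a collector that tracks functions and dictionaries the function is gone after the explicit collection.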

Neil Schemenauer writes:
No... if you change "add_module" in exactly the way you suggest (without worrying about whether it breaks the functionality of rexec!) and run the test

    while 1:
        rexec.RExec()

you will find that it still leaks memory at a prodigious rate.

So (unless there is yet another module-level cyclic reference), I don't think this theory explains the problem.
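An endless loop shows that memory grows but not which objects survive. One possible bounded variant of the test, sketched here for Python 3, counts survivors through weak references after a forced collection. `survivors`, `Cyclic`, `Leaky`, and `_kept` are all hypothetical names; neither class models rexec itself.

```python
import gc
import weakref


def survivors(factory, iterations=1000):
    """Count objects made by factory() that outlive a forced collection."""
    refs = []
    for _ in range(iterations):
        obj = factory()
        refs.append(weakref.ref(obj))
        del obj                          # drop our only strong reference
    gc.collect()
    return sum(1 for r in refs if r() is not None)


class Cyclic:
    """Garbage that only the cycle detector can free."""

    def __init__(self):
        self.me = self


_kept = []


class Leaky:
    """Hypothetical true leak: every instance stays reachable from _kept."""

    def __init__(self):
        _kept.append(self)


if __name__ == "__main__":
    print(survivors(Cyclic, 100))
    print(survivors(Leaky, 100))
```

A survivor count of zero points to collectable cycles. A count equal to the number of iterations means the objects are either still reachable from somewhere or sit in a cycle the collector cannot see.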

participants (3)
- Charles G Waldman
- Guido van Rossum
- Neil Schemenauer