
I think in Python 2.0 it would be nice to have some way to reclaim circular dependencies without the programmer explicitly having to do something like implement a destroy() method and requiring other programmers to (remember to) call it. I forget what the current state of affairs is w.r.t. future memory management in Python. Not knowing anything much about memory management, would it be possible to have a sort of mixed ref count/garbage collection system where you only use the gc stuff as a last resort? My thought is that it would be useful to use gc to find and reclaim circular garbage. can-you-tell-I-just-got-bitten-by-a-circular-reference?-ly y'rs Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/~skip/ 847-971-7098 | Python: Programming the way Guido indented...

Skip Montanaro wrote:
If you don't want to wait until 2.0 becomes GA, you could try weak references: http://starship.skyport.net/~lemburg/mxProxy.html -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 102 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

[Skip Montanaro]
This was debated (again) at great length on c.l.py just a few months ago. Guido chimed in with a proposal to keep track of only the dicts that have been allocated, and now and again mark everything reachable from the root set and nuke whatever dicts don't end up marked. Cycles involving dicts would get reclaimed this way, but not cycles not involving dicts. The approach to destructors for objects in cycles was "tough -- they don't get called". What to do about destructors for objects that are not themselves involved in cycles but are reachable only from dead cycles (so are in fact dead too) wasn't addressed. Seemed possible that stuff reachable from ordinary dicts (not in a cycle, and neither reachable from a cycle) would behave differently than today, since the "list of all dicts" may keep the stuff artificially alive until the next mark+sweep, even if the refcount on the stuff fell to zero; there's probably an OK way around that, though. Anyway, Guido was aiming for the minimal changes that could possibly do real good. It didn't pretend to reclaim all cycles, and was (IMO) too eager to punt on the hard issues (the combo of cycles, destructors and resurrection is a god-awful mess, even in theory; Scheme uses callbacks to dump the problems back on the users Java has incredibily elaborate rules that are both bulletproof and unusable; the Boehm collector lets objects with destructors that are in cycles simply leak, rather than do a wrong thing; Stroustrup has flip-flopped and most recently argued for Guido's "reclaim the memory but don't call the destructors" approach, but a member of the C++ committee told me he's overwhelmingly opposed on this one (I know I would oppose it)). In any case, nothing has come of it, and no easy principled solution is in sight. OTOH, if Guido balks at explaining what "__" means to 12-year-olds, wait until he tries to explain immortal cyclic trash <wink>. Perl scores points for its brute-force end-of-thread M&S, but I believe complex user-level data structures are much rarer in Perl, simply due to the clumsiness of the syntax and explicit reference model. Perl's version of "nested functions" don't actually nest (all "def"s are floated to the top level by the compiler, regardless of how deeply they're nested), so Perl's lexical closures don't create cyclic trash either (well, they do, but in the same sense there's a cycle between a Python module namespace and the functions in that module -- Perl also special-cases the snot out of those and busts those cycles by brute force). So there you go! It needs to be solved and nobody has a clue <wink>. if-java-hype-hadn't-suffered-exponential-decay-we-could-have- dumped-it-on-the-jvm-by-now-ly y'rs - tim

Tim Peters wrote:
You could probably tackle the problem by doing local mark&sweep whenever the ref count on a dictionary falls down to 1 (meaning that it is only referenced from the list of all dicts). This is what I do in mxProxy's weak reference implementation and to my surprise it solved all those strange situations where objects are kept alive longer than they would have normally.
Not calling the destructor will cause leakage in all objects allocating extra storage, such as lists, instances and probably just about any dynamically sized object there is in Python... solving the problem only half way. Plus you will definitely run into trouble as soon as external resources are involved, e.g. open files or connections to databases. Perhaps we should give more power to the user instead of trying to give him fuzzy feelings about what's happening underneath the hood. Builtin weak references or other indirect ways of accessing objects (e.g. by giving unique names to the involved objects) can solve many of those circ. ref. problems. BTW, I usually use an instrumented Python interpreter to track down circular references: it uses a tracing hook in the allocation/deallocation code of Python instances which is used when Python is run in debugging mode (python -d). The hook calls a function sys.traceinstances (if present) which allows me to keep a track record of all allocated instances: def traceinstances(action,inst): """ Tracing hook. This is called whenever an instances is created and destroyed. action is either 'create' or 'delete'; inst points to the instance object. """ ... If anyone is interested I can post the patch (against Python 1.5). -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 101 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

If I remember well, the only destructors not called would be __del__ methods, since the dependencies between to-be-deleted instances are unknown to the collector. Regular (C-level) destructors would of course be called. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Ok, so low-stuff will not break. But what about e.g. wrappers around these low-level (C-level) objects written in Python, e.g. database abstraction classes ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 101 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

[Guido van Rossum]
Seems to me that the dependencies among to-be-deleted arbitrary C objects are equally unknown to the collector. Assuming there's a fundamental difference between objects implemented in Python and objects implemented in C seems shaky on the face of it. Or if it's not just a convenient assumption, on what is it based? and-what-about-objects-implemented-in-fortran<wink>?-ly y'rs - tim

[Tim]
My assumption was based on the standard Python objects. Cycles necessarily have to include dictionaries or lists (modules, classes and instances link to each other through dictionaries; ditto for function objects; i'm crossing my fingers here for stack frame and traceback objects :-) and I can do things to these to get rid of the links without getting rid of the objects: del L[:] or D.clear(). Third party C objects might have interdependencies similar to those found in Python instances, but my gut feeling is that these aren't as problematic -- e.g. interdependencies between C modules are rare (because the machinery is cumbersome) while they are common between Python modules; and C code isn't susceptible to the problems that Python destructors encounter when the modules they import have already been destroyed. I know, it's an unusual amount of handwaving... --Guido van Rossum (home page: http://www.python.org/~guido/)

Marc> BTW, I usually use an instrumented Python interpreter to track Marc> down circular references: ... Marc> If anyone is interested I can post the patch (against Python 1.5). That would be interesting to look at. I found my latest circular reference by building Python with Py_DEBUG defined and trudging through the output at the end. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/~skip/ 847-971-7098 | Python: Programming the way Guido indented...

Skip Montanaro wrote:
Here it is: --- Objects/orig/classobject.c Thu Jan 1 20:39:04 1998 +++ Objects/classobject.c Sat Aug 8 21:38:57 1998 @@ -403,10 +588,37 @@ PyInstance_New(class, arg, kw) inst = NULL; } Py_DECREF(res); } } + /* sys.traceinstances hook */ + if (Py_DebugFlag && inst) { + PyObject *fct; + + fct = PySys_GetObject("traceinstances"); + if (fct) { + PyObject *v,*arg; + PyObject *error_type, *error_value, *error_traceback; + + /* Save and clear any exception */ + PyErr_Fetch(&error_type,&error_value, + &error_traceback); + PyErr_Clear(); + arg = Py_BuildValue("(sO)","create",(PyObject *)inst); + v = PyEval_CallObject(fct,arg); + Py_DECREF(arg); + if (!v) { + PyErr_Print(); + PyErr_Clear(); + } + else + Py_DECREF(v); + /* Restore exception state */ + PyErr_Restore(error_type,error_value, + error_traceback); + } + } return (PyObject *)inst; } /* Instance methods */ @@ -460,10 +672,31 @@ instance_dealloc(inst) } else Py_DECREF(res); Py_DECREF(del); } + /* sys.traceinstances hook */ + if (Py_DebugFlag) { + PyObject *fct; + + fct = PySys_GetObject("traceinstances"); + if (fct) { + PyObject *v,*arg; + + /* Clear any previous exception */ + PyErr_Clear(); + arg = Py_BuildValue("(sO)","delete",(PyObject *)inst); + v = PyEval_CallObject(fct,arg); + Py_DECREF(arg); + if (!v) { + PyErr_Print(); + PyErr_Clear(); + } + else + Py_DECREF(v); + } + } /* Restore the saved exception and undo the temporary revival */ PyErr_Restore(error_type, error_value, error_traceback); /* Can't use DECREF here, it would cause a recursive call */ if (--inst->ob_refcnt > 0) { #ifdef COUNT_ALLOCS -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 101 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

Skip Montanaro wrote:
If you don't want to wait until 2.0 becomes GA, you could try weak references: http://starship.skyport.net/~lemburg/mxProxy.html -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 102 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

[Skip Montanaro]
This was debated (again) at great length on c.l.py just a few months ago. Guido chimed in with a proposal to keep track of only the dicts that have been allocated, and now and again mark everything reachable from the root set and nuke whatever dicts don't end up marked. Cycles involving dicts would get reclaimed this way, but not cycles not involving dicts. The approach to destructors for objects in cycles was "tough -- they don't get called". What to do about destructors for objects that are not themselves involved in cycles but are reachable only from dead cycles (so are in fact dead too) wasn't addressed. Seemed possible that stuff reachable from ordinary dicts (not in a cycle, and neither reachable from a cycle) would behave differently than today, since the "list of all dicts" may keep the stuff artificially alive until the next mark+sweep, even if the refcount on the stuff fell to zero; there's probably an OK way around that, though. Anyway, Guido was aiming for the minimal changes that could possibly do real good. It didn't pretend to reclaim all cycles, and was (IMO) too eager to punt on the hard issues (the combo of cycles, destructors and resurrection is a god-awful mess, even in theory; Scheme uses callbacks to dump the problems back on the users Java has incredibily elaborate rules that are both bulletproof and unusable; the Boehm collector lets objects with destructors that are in cycles simply leak, rather than do a wrong thing; Stroustrup has flip-flopped and most recently argued for Guido's "reclaim the memory but don't call the destructors" approach, but a member of the C++ committee told me he's overwhelmingly opposed on this one (I know I would oppose it)). In any case, nothing has come of it, and no easy principled solution is in sight. OTOH, if Guido balks at explaining what "__" means to 12-year-olds, wait until he tries to explain immortal cyclic trash <wink>. Perl scores points for its brute-force end-of-thread M&S, but I believe complex user-level data structures are much rarer in Perl, simply due to the clumsiness of the syntax and explicit reference model. Perl's version of "nested functions" don't actually nest (all "def"s are floated to the top level by the compiler, regardless of how deeply they're nested), so Perl's lexical closures don't create cyclic trash either (well, they do, but in the same sense there's a cycle between a Python module namespace and the functions in that module -- Perl also special-cases the snot out of those and busts those cycles by brute force). So there you go! It needs to be solved and nobody has a clue <wink>. if-java-hype-hadn't-suffered-exponential-decay-we-could-have- dumped-it-on-the-jvm-by-now-ly y'rs - tim

Tim Peters wrote:
You could probably tackle the problem by doing local mark&sweep whenever the ref count on a dictionary falls down to 1 (meaning that it is only referenced from the list of all dicts). This is what I do in mxProxy's weak reference implementation and to my surprise it solved all those strange situations where objects are kept alive longer than they would have normally.
Not calling the destructor will cause leakage in all objects allocating extra storage, such as lists, instances and probably just about any dynamically sized object there is in Python... solving the problem only half way. Plus you will definitely run into trouble as soon as external resources are involved, e.g. open files or connections to databases. Perhaps we should give more power to the user instead of trying to give him fuzzy feelings about what's happening underneath the hood. Builtin weak references or other indirect ways of accessing objects (e.g. by giving unique names to the involved objects) can solve many of those circ. ref. problems. BTW, I usually use an instrumented Python interpreter to track down circular references: it uses a tracing hook in the allocation/deallocation code of Python instances which is used when Python is run in debugging mode (python -d). The hook calls a function sys.traceinstances (if present) which allows me to keep a track record of all allocated instances: def traceinstances(action,inst): """ Tracing hook. This is called whenever an instances is created and destroyed. action is either 'create' or 'delete'; inst points to the instance object. """ ... If anyone is interested I can post the patch (against Python 1.5). -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 101 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

If I remember well, the only destructors not called would be __del__ methods, since the dependencies between to-be-deleted instances are unknown to the collector. Regular (C-level) destructors would of course be called. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Ok, so low-stuff will not break. But what about e.g. wrappers around these low-level (C-level) objects written in Python, e.g. database abstraction classes ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 101 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

[Guido van Rossum]
Seems to me that the dependencies among to-be-deleted arbitrary C objects are equally unknown to the collector. Assuming there's a fundamental difference between objects implemented in Python and objects implemented in C seems shaky on the face of it. Or if it's not just a convenient assumption, on what is it based? and-what-about-objects-implemented-in-fortran<wink>?-ly y'rs - tim

[Tim]
My assumption was based on the standard Python objects. Cycles necessarily have to include dictionaries or lists (modules, classes and instances link to each other through dictionaries; ditto for function objects; i'm crossing my fingers here for stack frame and traceback objects :-) and I can do things to these to get rid of the links without getting rid of the objects: del L[:] or D.clear(). Third party C objects might have interdependencies similar to those found in Python instances, but my gut feeling is that these aren't as problematic -- e.g. interdependencies between C modules are rare (because the machinery is cumbersome) while they are common between Python modules; and C code isn't susceptible to the problems that Python destructors encounter when the modules they import have already been destroyed. I know, it's an unusual amount of handwaving... --Guido van Rossum (home page: http://www.python.org/~guido/)

Marc> BTW, I usually use an instrumented Python interpreter to track Marc> down circular references: ... Marc> If anyone is interested I can post the patch (against Python 1.5). That would be interesting to look at. I found my latest circular reference by building Python with Py_DEBUG defined and trudging through the output at the end. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/~skip/ 847-971-7098 | Python: Programming the way Guido indented...

Skip Montanaro wrote:
Here it is: --- Objects/orig/classobject.c Thu Jan 1 20:39:04 1998 +++ Objects/classobject.c Sat Aug 8 21:38:57 1998 @@ -403,10 +588,37 @@ PyInstance_New(class, arg, kw) inst = NULL; } Py_DECREF(res); } } + /* sys.traceinstances hook */ + if (Py_DebugFlag && inst) { + PyObject *fct; + + fct = PySys_GetObject("traceinstances"); + if (fct) { + PyObject *v,*arg; + PyObject *error_type, *error_value, *error_traceback; + + /* Save and clear any exception */ + PyErr_Fetch(&error_type,&error_value, + &error_traceback); + PyErr_Clear(); + arg = Py_BuildValue("(sO)","create",(PyObject *)inst); + v = PyEval_CallObject(fct,arg); + Py_DECREF(arg); + if (!v) { + PyErr_Print(); + PyErr_Clear(); + } + else + Py_DECREF(v); + /* Restore exception state */ + PyErr_Restore(error_type,error_value, + error_traceback); + } + } return (PyObject *)inst; } /* Instance methods */ @@ -460,10 +672,31 @@ instance_dealloc(inst) } else Py_DECREF(res); Py_DECREF(del); } + /* sys.traceinstances hook */ + if (Py_DebugFlag) { + PyObject *fct; + + fct = PySys_GetObject("traceinstances"); + if (fct) { + PyObject *v,*arg; + + /* Clear any previous exception */ + PyErr_Clear(); + arg = Py_BuildValue("(sO)","delete",(PyObject *)inst); + v = PyEval_CallObject(fct,arg); + Py_DECREF(arg); + if (!v) { + PyErr_Print(); + PyErr_Clear(); + } + else + Py_DECREF(v); + } + } /* Restore the saved exception and undo the temporary revival */ PyErr_Restore(error_type, error_value, error_traceback); /* Can't use DECREF here, it would cause a recursive call */ if (--inst->ob_refcnt > 0) { #ifdef COUNT_ALLOCS -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 101 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/
participants (5)
-
Greg Ward
-
Guido van Rossum
-
M.-A. Lemburg
-
Skip Montanaro
-
Tim Peters