[Jason Evans] ...
The low level tree code is implemented completely separately from the Python object wrapper code. This means that, for example, the Python "Node" object does not actually store PyObject* pointers to Ring objects; instead it uses the low level tree code to find out what Ring objects are attached to it.
I think you're in for a rude awakening if you expect any language's gc support to help you manage arbitrary blobs of memory obtained from the system malloc(). If all the nodes in your object graph derived from PyObject, Python's gc would do it all for you. If you want to plant pointers in and among arbitrary C structs, then you'll have to implement your own gc, or hope that you can bash a general gc thingie (like BDW -- and I'm not sure there is anything else of that sort usefully available, so "like" is probably optimistic) into guessing the layout of your structs.
Crux was designed this way in order to be able to implement various low level algorithms such that they never have to call interpreter-related code.
I don't know why that was thought to be desirable, or even exactly what it means. One plausible meaning is that you don't want to tie the "various low level algorithms" to Python, because you hope to reuse them in other, non-Python contexts. And, in addition to that, you don't want, or don't know how, to abstract the necessary operations into macros or functions that are pluggable for the different contexts you have in mind. As a very simple example, using Py_INCREF in a function doesn't tie that function to Python -- if you want to use it in a non-Python context, #define Py_INCREF(x) gets rid of such calls. But I'm guessing too much about what you might mean, so I'll stop there.
As such, reference breaking code must actually tear down the low level tree in such a way that it is always "consistent" (simply setting pointers to NULL doesn't fit well with this way of doing things).
Martin had in mind that the pointers were to properly refcounted PyObject*, in which case his advice was sound. [Martin]
Hmmm. Even if a ring member holds only a single reference to another ring member, would not calling visit() on this member in turn invoke visit() on the other one, which would, in turn, invoke visit() on the next one, and so on?
[Jason]
Yes, this is true.
It's false, but I covered that one yesterday. What's passed back to visit() has nothing to do with which objects' tp_traverse()s get called. Registering ("tracking") PyObject*s of gc'able types is what determines which objects' tp_traverse()s get called. Python's gc doesn't try to determine what's reachable, it determines what isn't reachable from pointers it *hasn't* been told about. It infers "reachable from outside" from the refcounts: if all the refcounts on an object are not accounted for by the objects gc *has* been told about, then that object is directly reachable from something gc hasn't been told about, so gc dare not touch it. The transitive closure of what's reachable from such objects is the set of all objects gc knows about that are reachable from outside. gc leaves those alone. Everything else gc knows about is necessarily cyclic trash, or reachable only from cyclic trash. It can't know anything about objects that aren't tracked by gc, so in particular can't know anything about pointers that don't point at a PyObject.
The problem I ran into had to do with the node logically holding references to all the ring objects, but only reporting one of them.
If a ring object doesn't derive from PyObject, then visit() must not be passed a pointer to a ring object. Doing so would account for many catastrophic meanings of "doesn't work" <wink>.
That is, the node's tp_traverse function was not reporting all references, though it was reporting enough of them to guarantee that all reachable objects were visited.
No, the visit() callback has nothing to do with which objects are traversed. gc determines the set of objects to traverse, before any object's tp_traverse() has been called, as a "generational" subset of all objects that have registered with gc. The pointers passed back to visit() callbacks have no effect on that set: they can't add anything to, or remove anything from, that set. The set of objects to traverse is predetermined.
At this point, it is clear to me that Python's cyclic GC does not do mark/sweep GC,
RIght, although we're only talking about CPython here (a particular implementation of Python),
and that this won't change,
In CPython that's a safe bet.
so there's probably not much point in discussing this issue further.
We can help you use Python's gc, but not if you're unwilling or unable to change anything about what you do.
... What if breaking cycles and cleanup are the same thing? In the case of Crux, they are one and the same, since the low level tree must be torn down one piece at a time.
Again Martin had in mind that all your objects derived from PyObject. If that were so, then Python's gc would guarantee not to tear down anything until every cycle it was part of became unreachable, in which case internal invariants no longer matter (since they're no longer reachable form anything except other unreachable trash). Python's gc is very nice to work with, for Python objects. If you can't make your objects PyObjects, then you get all the pain of implementing a form of gc that does work with your objects. You really should look at how core objects implement tp_traverse() and tp_clear(), to see how easy life *can* be, even in the presence of much hairier object graphs than you're working with.
I should have said tp_clear here. In Modules/gcmodule.c:delete_garbage(), the clear() call does not pay attention to the return value.
Heh. So true! See http://mail.python.org/pipermail/python-dev/2003-April/034435.html
It would be possible to check the return value, and defer deallocation if the return value were non-zero, and ob_refcnt were 1. (However, it would probably have to be a separate list of objects, contrary to what I claimed before.)
I'm not sure, but I think your desire for this was still based on misunderstanding of what Python's gc does.