fork()

Tim Peters tim_one at email.msn.com
Sun Jun 13 03:22:36 EDT 1999


[Evan Simpson]
> ...
> Perhaps I'm being naive, but I have the impression that when cycles
> arise it's usually a simple accident and easy to fix once you know
> you've done it.

Exactly so:  you're being naive <wink>.  Note that there are even cycles in
Python's implementation, most notably that a module namespace points to the
top-level functions defined in the module, while each of the latter points
back to the module namespace (which serves as the function's global
namespace, so must be reachable from the function).  There's no reasonable
way known to break that cycle, and indeed it causes Guido quite a bit of
pain at system shutdown time (when these cycles get  artifically snapped).
OTOH, nobody *notices* these cycles because they *expect* module functions
to stick around as long as the module does, and vice versa.  It's only at
shutdown time that surprises occur.

Stuff "like that" pops up in all sorts of apps.  With practice you learn
ways around it most often.  I've noted more than once that the resulting
cycle-free design is often more robust and comprehensible than its cyclic
ancestor, so I think it's a fruitful exercise to do this; doesn't always
succeed, though, and when you're in a rush it's a pain in the ass to have to
bother.

> ...
> Would it be possible (and not too expensive) to distinguish actual cycle
> members from dangly bits (maximal non-cyclic subgraphs)?  If so, it might
> be worth releasing the dangly bits normally through their root object
> after flushing the cycle.  That way you could have cycles in a program
> yet still have a place to put resource finalizers.

It doesn't fall out naturally in this approach, and suspect it's a lot more
expensive than simply marking the live objects.  There *are* elegant schemes
that work like this:  keep a refcount not on individual objects, but on the
strongly connected components (SCCs).  The SCCs of any graph necessarily
form a DAG, so the "cycle problem" can't arise *at* the SCC level; e.g.,

class C:
    def __init__(self, id):
        self.id = id
    def __del__(self):
        print self.id, "going away"

a, b, c = C('a'), C('b'), C('c')
a.partner = b
b.partner = a
a.extra = c
del a, b, c

a and b embrace each other, c is a "dangly bit" referenced by a, and at this
point all have a refcount of 1 and all are unreachable.  In the SCC scheme,
the SCCs are {a, b} and {c}, the refcount on the former is *zero* now and
the refcount on the latter is 1 (because {c} is referenced by {a, b}).  Does
a perfect job of separating the cycles from the dangly bits, and gives a
destruction order with the least possible number of headaches (because it
still respects dependencies *between* cycles and among the dangly bits).
Also nukes cycles the instant they become unreachable!  Alas, best I know
it's still thought impractically expensive to update the SCC info under
mutation, and so the scheme only shows up in some pure functional languages.

What you suggest is weaker, effectively lumping the SCCs into two gross
classes (singletons and non-singletons), and nuking the latter first.  I
think that just pushes the problems back one level; e.g., in the example
above, create another embracing pair {d, e} referenced from {c}.  You'll
invoke c's finalizer after {d, e} gets nuked, but since {c} did reference
{d, e} that probably means c's finalizer will blow up.  SCC would nuke {a,
b} first, *then* (c}, and *then* {d, e}.

Anyway, that's all by way of hinting <wink> that the scheme you suggest has
merit, but won't really work unless extended to the SCC approach (which
doesn't *need* to be incremental -- it could be done from scratch, at
massive expense <wink>, along with M&S).

too-bad-"if-the-implementation-is-hard-to-explain-it's-a-bad-idea"-is-
    one-of-our-core-principles<wink>-ly y'rs  - tim






More information about the Python-list mailing list