fork()

Tim Peters tim_one at email.msn.com
Tue Jun 15 01:28:57 EDT 1999


[Hisao Suzuki]
> ...
> [quoting Stroustrup] and in the C++ 3rd Ed. (1997):
>    "When an object is about to be recycled by a garbage
>     collector, two alternatives exist:
>       [1] Call the destructor (if any) for the object.
>       [2] Treat the object as raw memory (don't call its
>           destructor).
>     By default, a garbage collector should choose option (2)
>     because objects created using _new_ and never _delete_d are
>     never destroyed.  Thus, one can see a garbage collector as a
>     mechanism for simulating an infinite memory." (C.9.1.3
>     Destructors)
>
> Python's __del__ is conceptually equivalent to C++'s destructor.

This is true, but Python has nothing equivalent to the C++ "delete"
operator, and *everything* in Python acts as if it had been created by the
C++ "new" operator. Stroustrup is only talking about C++ objects explicitly
created by "new" but never explicitly destroyed by "delete".  C++ has
several other ways for objects to come in and out of existence that don't
involve "new" or "delete" (block-scope auto; static; anonymous temps in
expressions), and rigidly defines when destructors are invoked for those
(more rigidly than Python, which actually opens some uses for destructors in
C++ that can't quite be done in Python).

> (Otherwise we would not rely on self.close() at __del__ in
> Lib/tempfile.py!)  The above phrase `(be) never deleted' for C++
> can be safely translated as `(be) part of cycles' for Python.
> Thus, Guido's idea is quite consistent and orthodox in this
> regard.

If there's an analogy to be drawn here, it would be that since everything in
Python acts as if created by C++ "new" and can never be explicitly
destroyed, Stroustrup is arguing that Python should never invoke a
destructor, period.

The contexts are so different, though, that this analogy is as silly as it
sounds <wink>.

As a practical matter, there's a universe of difference between a user
explicitly forcing heap-based storage via C++ "new" and a Python programmer
who happens to create a cycle by design or accident.  For example, the
former is apparent via static inspection of program text, while the latter
is impossible to determine statically.

> It will be acceptable to most of the current working Pythoneers,
> including me, even if they are not fans of the Lots of Intolerable Stupid
> Parentheses ;-)

But now you're getting *really* practical <wink>.  I'm still most concerned
that (besides it being an incoherent mess <0.9 wink>) objects that aren't
themselves in a cycle may nevertheless be reachable only from an unreachable
cycle, and in that case it seems intolerable to say the (in Evan's nice turn
of phrase) "dangly bits" won't have destructors invoked either.  "Well,
sure, open file objects get closed by magic -- unless they happen to be
buried in some object reachable only via an unreachable cycle."  That's a
true statement today, and I'm not sure there's real gain if it continues to
be true tomorrow.
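
The "dangly bits" case can be reproduced with a small sketch. It assumes a modern CPython (3.4 or later), which is not the interpreter under discussion in this thread; the names Resource, Node and finalized are made up for illustration. With the cyclic collector switched off, only reference counting is at work, which roughly mimics the situation described above.

```python
import gc

finalized = []  # names of Resource objects whose __del__ has run

class Resource:
    """Hypothetical stand-in for something like an open file."""
    def __init__(self, name):
        self.name = name

    def __del__(self):
        finalized.append(self.name)

class Node:
    pass

gc.disable()  # leave only reference counting at work

a, b = Node(), Node()
a.other, b.other = b, a            # a and b form a cycle
a.resource = Resource("dangly")    # not in the cycle, but reachable only through it
del a, b                           # the cycle is now unreachable

# No refcount dropped to zero, so no destructor has run yet.
after_refcounting = list(finalized)

gc.collect()                       # the later cyclic collector finds the dead cycle
after_collection = list(finalized)
gc.enable()
```

Under these assumptions, after_refcounting stays empty and the resource is finalized only once a cycle collector runs.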

> By the way, C++'s destructor and Python's __del__ have been
> proved useful in practice, particularly in conjunction with the
> `resource acquisition is initialization' technique.  On the
> other hand, such a finalizer as found in Java is almost useless
> and unreliable because of its unpredictability.  I have never
> made use of finalize() in Java for years, and I don't know any
> skilled Java programmer substantially making use of finalize().

All different mechanisms, though.  Useful in C++ because block-scope autos
are reliably destructed upon block exit, and at least current C++ rigidly
defines the order in which they're destructed (so cyclic autos have *some*
ground to stand on).  Useful in Python because reference counting (RC) gets
close to the same effect (but less flexibly & less reliably).
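
A minimal sketch of how refcounting approximates block-scope destruction, and where it falls short. It assumes CPython's reference-counting behavior (other implementations may finalize later); Guard, events and leaked are hypothetical names used only here.

```python
events = []   # log of what happened, in order
leaked = []   # a place where a reference can escape to

class Guard:
    """Hypothetical resource holder in the 'resource acquisition is
    initialization' style: acquire in __init__, release in __del__."""
    def __init__(self, name):
        self.name = name
        events.append("acquire " + name)

    def __del__(self):
        events.append("release " + self.name)

def work():
    g = Guard("lock")
    events.append("working")
    # g's refcount drops to zero when the function returns

def work_and_leak():
    g = Guard("file")
    leaked.append(g)   # an extra reference outlives the function

work()
work_and_leak()
released_at_exit = "release file" in events
```

In the first function the release happens as soon as the local goes away, much like a C++ auto. In the second, one stray reference is enough to postpone it indefinitely, which is the "less reliably" part.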

> Further the C++ 3rd Ed. says:
>    "It is possible to design a garbage collector to invoke the
>     destructors for objects that have been specifically
>     `registered' with the collector.  However, there is no
>     standard way of `registering' objects.  Note that it is
>     always important to destroy objects in an order that ensures
>     that the destructor for one object doesn't refer to an
>     object that has been previously destroyed.  Such ordering
>     isn't easily achieved by a garbage collector without help
>     from the programmer."

Java solved that one -- for all the good it did <wink>.

> All in all, we need a sort of control or programmability over
> the behavior of our programs when a fixed, built-in mechanism of
> the language cannot solely address the problem very well.
> Suppose that Python recycles the memory of unreachable cycles of
> objects without calling __del__.  What if cycles that contain
> `registered' objects are saved from deallocation so that
> clean-up or other arbitrary actions can be performed?
>
> Such an action will break, say, an unreachable doubly-linked
> tree into separated nodes.  Then the nodes will be destroyed by
> a normal process including invocations of __del__.  To make this
> occur,
>  (a) the programmer must register the root of the tree with a
>      function to break up the tree, which will be invoked later
>      automatically by the garbage collector, or
>
>  (b) the programmer must register the root of the tree, and at
>      certain point of the program she/he must explicitly invoke
>      a function which retrieves the trees unreachable except
>      from the `registration' server and breaks them up.
>
>  (In any case, unreachable trees that contain no `registered'
>   objects will be recycled with no invocations of __del__.)
>
> I am not sure which option is more useful in practice, but
> either of them would be acceptable since they are entirely
> optional and compatible with the current semantics of Python.
> With no zapping in the formal semantics, they are conceptually
> clean and their implementations will be straightforward (thus
> they are also very Pythonic ;-)

I'm getting very keen on *some* scheme that gives motivated users control
over what happens when dead cycles are found, but Guido really wants to
track only dicts and handing the user a list/enumeration of disembodied
dicts isn't giving them much to work with <wink>.
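
The "break up the tree" step in Hisao's proposal could look something like the sketch below. It is only an illustration under assumptions: TreeNode, break_up and destroyed are invented names, the registration machinery itself is not shown, and the cyclic collector of modern CPython is disabled so that only reference counting destroys the nodes.

```python
import gc

destroyed = []  # names of nodes whose __del__ has run

class TreeNode:
    """Hypothetical doubly-linked tree node: parent and children
    point at each other, so every parent/child pair is a cycle."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def __del__(self):
        destroyed.append(self.name)

def break_up(root):
    """Clear every parent link so the tree no longer contains cycles."""
    stack = [root]
    while stack:
        node = stack.pop()
        node.parent = None
        stack.extend(node.children)

gc.disable()  # rely on reference counting alone

root = TreeNode("root")
TreeNode("left", root)
TreeNode("right", root)

break_up(root)   # what a registered clean-up function would do
del root         # now plain refcounting tears the tree down
destroyed_after_break_up = sorted(destroyed)
gc.enable()
```

Once the back-pointers are gone, dropping the last reference to the root destroys every node through the normal process, with __del__ invoked for each.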

> As for a reference to option (b), see, say, R. Kent Dybvig's
> ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/guardians.ps.gz
> which I have mentioned before in this newsgroup, though it is
> written for Scheme and actually the coming version of GNU guile
> implements guardians, i.e. `registration' servers described in
> it.

Indeed a very general mechanism!  Combining it with sensible __del__ rules
for non-cycles could well lead to something better than Scheme and Python
<0.5 wink>.

> Note that the guardians may be implemented somehow in 100%
> pure Python even if it doesn't have a garbage collector.  The
> sys.getrefcount built-in function will be used then.

Don't think so.  Knowing the refcount doesn't tell you whether a thing is
trash, e.g.

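class A: pass
a = A()
a.seq = [a] * 1000000   # a million references to a, all held by a's own list
del a                   # refcount is still 1000000, yet the object is trash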

and there's no general way to trace the objects reachable from a given
object without a protocol for the reachable objects to cooperate in
revealing who is directly reachable from them (it can be laboriously
accomplished for objects of builtin types now by exploiting deep knowledge
of the way they happen to be implemented, but toss in an extension type and
you're dead).
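
The refcount point can be checked directly. This sketch assumes a modern CPython, where the gc and weakref modules exist (they postdate this message); the variable names are illustrative, and the list is kept smaller than in the example above.

```python
import gc
import sys
import weakref

class A:
    pass

gc.disable()                 # keep the cyclic collector out of the way

a = A()
a.seq = [a] * 1000           # a thousand references to a, all from its own list
count_before = sys.getrefcount(a)

probe = weakref.ref(a)       # lets us ask later whether the object still exists
del a                        # the only outside reference is gone

# The object is unreachable trash, but its refcount is still about 1000,
# so reference counting alone never frees it.
alive_after_del = probe() is not None

gc.collect()                 # tracing from the outside is what finds it
collected = probe() is None
gc.enable()
```

A large refcount here says nothing about reachability: every one of those references comes from inside the garbage itself. Deciding that requires tracing, which in turn requires each container type to reveal what it refers to; later CPython versions expose that per-type cooperation to Python code as gc.get_referents.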

In any case, implementing mark-and-sweep (M&S) in Python is even slower than
implementing it in C <wink>.

much-ado-over-garbage-ly y'rs  - tim






More information about the Python-list mailing list