Python and Boehm-Demers GC, I have code.

Tim Peters tim_one at email.msn.com
Sat Jul 17 17:03:52 EDT 1999


[Neil Schemenauer]
> ...
> The collection of reference cycles is coming for free.  I expected there
> to be some speed tradeoff.

pybench and pystone have small working sets (they're primarily CPU-speed
tests); so there's very little reachable for BDW to trace, so BDW doesn't
have much to do, so runs fast when it's invoked; and with refcounting on,
most pystone/pybench needs are handled by internal free lists that don't
invoke malloc at all after initial setup, so BDW doesn't even get called
often.  IOW, BDW likely isn't doing much of anything in these tests besides
some initial mallocs.

It's curious that list operations appeared to take a major hit in pybench.
This may be related to "[GC_realloc] is very likely to allocate a new
object, unless MERGE_SIZES is defined in gc_priv.h" from the BDW README.
Python uses realloc over & over when growing lists.
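
A rough way to see why that could hurt: the sketch below models a list grown
one item at a time and counts element copies when every resize moves the block
(the README's "very likely" case) versus when none does.  The growth step of
10 slots and the name copies_when_growing are illustrative assumptions, not
taken from Python's or BDW's source.

```python
def copies_when_growing(n_items, step=10, always_moves=True):
    """Count element copies while appending n_items one at a time.

    Capacity grows by `step` slots whenever the list is full (a hypothetical
    policy, chosen only for illustration).  If always_moves is true, every
    resize is modelled as allocate-a-new-block-and-copy; otherwise resizes
    are modelled as free (the block is extended in place).
    """
    capacity = 0
    copied = 0
    for size in range(n_items):
        if size == capacity:          # full: must resize before appending
            if always_moves:
                copied += size        # every existing element gets copied
            capacity += step
    return copied


if __name__ == "__main__":
    print(copies_when_growing(1000))                      # resize always moves
    print(copies_when_growing(1000, always_moves=False))  # resize never moves
```

Under these assumptions the copy count grows quadratically with list length
when each realloc moves the block, which is at least consistent with list
operations being the benchmark group that suffered.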

> So far the only problem I see is sorting out the mallocs (ie. extension
> modules may use their own malloc).  This should be done anyhow in order
> for alternate Python malloc implementations to be used.

I don't think that's going to happen:  Python doesn't ask anyone to
change anything about how they like to handle their memory today, and Guido
has pronounced that a Major Feature on multiple occasions.  Vladimir
Marangozov's PyMalloc (see his Starship page) caters to that, letting Python
use its own malloc without anyone else getting involved.

> The other problem is objects that expect not to be collected even
> though Python has no references to them (Tkinter callbacks).  I
> don't think this problem is unsolvable.  Things are much better
> than I expected.

On one particular platform, and that platform's version of threads (or are
you running without threads?), yes -- looks great so far <wink>.

> With some tuning maybe the gc version of Python will perform
> better than the regular version.

With or without refcounting?  As before, I think refcounting's contribution
to Python's current performance is vastly under-appreciated.
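
What refcounting buys can be sketched in a later CPython (the gc module used
here postdates this message, so treat this as an illustration rather than
something that ran in 1999).  An object with no cycle is reclaimed the moment
its last reference goes away, so its memory can be reused right away; only a
cycle has to wait for a tracing pass.  The class Node and the probe names are
hypothetical.

```python
import gc
import weakref


class Node:
    """Trivial object that can hold a reference to itself."""


gc.disable()                  # keep the cycle collector out of the picture

a = Node()
probe_a = weakref.ref(a)
del a                         # last reference gone: refcounting frees it now
freed_without_cycle = probe_a() is None

b = Node()
b.self_ref = b                # a reference cycle: the count never reaches zero
probe_b = weakref.ref(b)
del b
cycle_survives = probe_b() is not None

gc.collect()                  # an explicit tracing pass reclaims the cycle
cycle_freed_after_collect = probe_b() is None
gc.enable()
```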

[about Tkinter callbacks]
> This is really nasty.  I am trying to allocate this data with
> regular malloc.  If I use malloc and free then this data should
> not get collected and the GC should realize that the Python
> functions still have references to them.

This won't work.  A pointer *must* be reachable from *Python's* "root set",
else BDW won't know it exists.  From the BDW README again:

    Note that pointers inside memory allocated by the standard "malloc"
    are not seen by the garbage collector.  Thus objects pointed to only
    from such a region may be prematurely deallocated.  It is thus
    suggested that the standard "malloc" be used only for memory regions,
    such as I/O buffers, that are guaranteed not to contain pointers to
    garbage collectable memory.

> So far I am not having much luck with this approach.

That's only because it's a hopeless dead end <wink>.  As I suggested before,

    ... or _tkinter.c has to maintain a list of allocated
    PythonCmd_ClientData thingies (until PythonCmdDelete is called
    back from Tcl), ...

IOW, this is certainly solvable, but has to be approached in a way that will
work <wink>.  Let BDW do the malloc for these (else the pointer to the
callable object (CO) they contain will be invisible to it).  The ClientData
struct has to remain reachable from _tkinter.c, though, until Tcl tells
PythonCmdDelete it's got no more use for it.

Another possibility is to use "standard" malloc/free for ClientData structs,
but arrange for Tkinter.py to hold on to a reference to the callable object
(so BDW can find it from Python's root set).  It already holds on to the
*name* of the synthesized Tcl cmd, in a list bound to data attr
_tclCommands.  Change the list to a dict mapping the name to its associated
CO, and fiddle all uses of _tclCommands accordingly.  That's probably the
fastest way out of this jam (&, indeed, should also run faster than the
list.remove() approach used today).
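
A minimal sketch of that dict-based bookkeeping follows.  The class
CommandRegistry and its method names are hypothetical stand-ins, not the
actual Tkinter.py code; only the attribute name _tclCommands comes from the
description above.

```python
class CommandRegistry:
    """Hypothetical stand-in for the bookkeeping around _tclCommands."""

    def __init__(self):
        # Tcl command name -> Python callable object (CO).
        self._tclCommands = {}

    def register(self, name, func):
        # Holding func here keeps it reachable from Python's root set,
        # so a tracing collector can find it.
        self._tclCommands[name] = func

    def deletecommand(self, name):
        # Deletion by key, instead of a linear list.remove(name) search.
        del self._tclCommands[name]

    def names(self):
        # The list of names that the old list-valued attribute used to hold.
        return list(self._tclCommands)


if __name__ == "__main__":
    reg = CommandRegistry()
    reg.register("cb1", lambda: None)
    print(reg.names())
    reg.deletecommand("cb1")
    print(reg.names())
```

Dropping the dict entry when the Tcl command is deleted releases the last
bookkeeping reference to the callable, so it becomes collectable at that
point and no earlier.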

btw-keep-in-mind-that-software-doesn't-work<wink>-ly y'rs  - tim





