GC In Python: YAS (Yet Another Summary)

Tim Peters tim_one at email.msn.com
Sat Jun 26 13:48:27 EDT 1999

[Andrew Dalke, worries about Boehm collection vs Fortran]
> The latter.   Example object creation code might look like:
> dt_Handle mol = dt_smilin("CCC", 3);
> where dt_Handle is a typedef to unsigned long.  In the SWIGged
> wrapped version, this is mol = dt_smilin("CCC").
> Yes, the library has a pointer table to the object.  The problem
> I see is, how does the Boehm collector (or other GC) know what to
> collect from this vendor library?  I can't see how, given that
> my Python code stores it as an integer.
> That's why I don't see how you say "It fully supports C and C++"
> when this is a package which doesn't do "pointer XORing or other
> crimes" but cannot be usable with a GC w/o modifications.
> But then, I'm a computational biophysicist by training and don't
> know much about GC other than the general concepts.  One of the
> references on the SGI page says Boehm can be used in "uncooperative
> environments" so I'll end by saying that I don't know enough.

You know enough to say this:  no matter *what* Fortran throws at it,
Python's current flavor of GC won't have a problem with it <0.5 wink>.

The BDW collector is a truly wonderful example of the GC art, and can
perform seeming miracles in hostile environments.  But you can't know
whether it will work *a priori*.  It needs to take over all dynamic
allocation (typically "malloc" in C, but your platform's Fortran may not use
that), understand everything about how your platform's compilers and runtime
systems use the machine stack and registers, and it implicitly relies on the
compiler not generating "confusing" code (pointer xor'ing is something
compiler-generated code probably never does, but aggressively optimizing
compilers can & do perform transformations just as confusing to BDW;
Boehm wrote a paper about that, so I'll skip it <wink>).

All of those caveats have to do with BDW being able to find every
bit-pattern that *may* be a live pointer.  It typically doesn't know ints
from doubles from real pointers, and doesn't care:  if a string of bits
"looks like" a heap address, it's treated as one.  This makes it err on the
side of safety, but only on the assumption that all live pointers are found
in their natural form in the places it knows to look.
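
To make the "looks like a heap address" rule concrete, here is a toy sketch of
conservative root scanning.  It is not the BDW algorithm: the heap layout, the
function names, and the XOR mask are all made up for illustration, and it only
scans a flat list of root words rather than following pointers between blocks.

```python
# Toy model of conservative root scanning; every name here is hypothetical.
# The "heap" maps each block's start address to its size in bytes.

def find_block(heap, word):
    """Return the start address of the block containing `word`, or None."""
    for start, size in heap.items():
        if start <= word < start + size:
            return start
    return None

def conservative_mark(heap, roots):
    """Treat every root word that falls inside a heap block as a pointer."""
    live = set()
    for word in roots:
        start = find_block(heap, word)
        if start is not None:
            live.add(start)
    return live

heap = {0x1000: 16, 0x2000: 32, 0x3000: 8}
MASK = 0x5A5A  # arbitrary mask used to disguise a pointer

roots = [
    0x1000,         # a genuine pointer to the first block
    0x2010,         # an integer that merely falls inside the second block
    0x3000 ^ MASK,  # a disguised pointer to the third block
    7,              # a small int that is nowhere near the heap
]

live = conservative_mark(heap, roots)
garbage = set(heap) - live
```

In this model the second block is kept alive by a plain integer (the safe kind
of mistake), while the third block is declared garbage even though the program
could still recover its address by undoing the XOR (the unsafe kind).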

Despite all that, it would probably work fine for you.  I think it's a dead
issue here regardless, though, because Guido isn't going to tell anyone they
have to use a particular routine for memory allocation.  Python could use
its own malloc for its own objects (& Vladimir Marangozov has written a very
nice PyMalloc, available from his Starship page), but that breaks down as
soon as any extension has its own view of how to get memory.
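
A rough sketch of why a second allocator is a problem, again as a toy model
rather than anything resembling a real collector's API.  The ToyCollector
class and the foreign_memory list are hypothetical; the list stands in for
memory an extension obtained on its own, which the collector never scans.

```python
# Toy model only: a collector that can trace just the blocks it handed out.

class ToyCollector:
    def __init__(self):
        self.blocks = {}         # address -> words stored in that block
        self.next_addr = 0x1000

    def alloc(self, words):
        """Allocate a managed block holding `words`; return its address."""
        addr = self.next_addr
        self.next_addr += 0x100
        self.blocks[addr] = list(words)
        return addr

    def reachable(self, roots):
        """Mark everything reachable from `roots` through managed blocks."""
        live, stack = set(), list(roots)
        while stack:
            word = stack.pop()
            if word in self.blocks and word not in live:
                live.add(word)
                stack.extend(self.blocks[word])
        return live

collector = ToyCollector()
seen = collector.alloc([])
hidden = collector.alloc([])
holder = collector.alloc([seen])  # managed block that points at `seen`

# Memory from some other allocator: it holds the only reference to `hidden`,
# but the collector has no idea it exists.
foreign_memory = [hidden]

live = collector.reachable(roots=[holder])
lost = set(collector.blocks) - live
```

Here `seen` survives because the reference to it lives in a block the
collector manages, while `hidden` ends up in `lost` and would be reclaimed
out from under the extension that still holds its address.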

can't-chase-what-you-can't-see-ly y'rs  - tim
