[Python-Dev] Issue 10194 - Adding a gc.remap() function

Peter Ingebretson pingebre at yahoo.com
Tue Oct 26 07:04:39 CEST 2010


I have a patch that adds a new function to the gc module.  The gc.remap() 
function uses the tp_traverse mechanism to find all references to any keys 
in a provided mapping, and remaps these references in-place to instead point 
to the value corresponding to each key.

The motivation for adding this method is to enable writing a module that 
provide an enhanced version of imp.reload.  The builtin reload function 
is very useful for iterating on a single module within the Python interpreter 
shell, but in more complex situations it very limited.

In particular, instances of classes declared in the reloaded module will 
continue to reference the old versions of the classes, and other modules 
that imported elements of the old module using the 'from ... import ...' 
syntax will continue to refer to the stale version of the functions or classes 
that they imported.

The gc.remap() function enables writing a new version of reload which uses 
imp.reload to reload a module and then replaces all references to stale objects 
from the old module to instead point to equivalent newly defined objects.  
This still has many limitations, for instance if an __init__ function has been 
changed the new __init__ will not be run on old instances.  On the other hand, 
in many cases this is sufficient to continue iterating on code without needing 
to restart the Python environment, which can be a significant time savings.

I initially tried to implement this reloading strategy entirely in Python using 
gc.getreferrers() to find references to objects defined in the old module, 
but I found it was too difficult to reliably replace references in objects once 
they had been found.  Since the GC already has a way to find all fields that 
refer to objects, it seemed fairly straightforward to extend that mechanism to 
additionally modify references.

This reloading strategy is documented in more detail here:

http://doublestar.org/in-place-python-reloading/

A potentially controversial aspect of this change is that the signature of the 
visitproc has been modified to take (PyObject **) as an argument instead of 
(PyObject *) so that a visitor can modify fields visited with Py_VISIT.  A few 
traverse functions in the standard library also had to be changed to use 
Py_VISIT on the actual members rather than on aliased pointers.

I also have a prototype of an enhanced reload function using gc.remap.  This 
is only a partial implementation of the proposal, in particular it does not 
rehash dictionaries that have been invalidated as a result of reloading, and 
it does not support custom __reload__ hooks.  A link to the code as well as 
some examples are here:

http://doublestar.org/python-hot-loading-prototype/

Please let me know if you have any feedback on the reloading proposal, the 
hot loading prototype, or on the patch.

Thanks,
Peter



      


More information about the Python-Dev mailing list