[Python-Dev] Issue 10194 - Adding a gc.remap() function
Peter Ingebretson
pingebre at yahoo.com
Tue Oct 26 07:04:39 CEST 2010
I have a patch that adds a new function to the gc module. The gc.remap()
function uses the tp_traverse mechanism to find all references to any keys
in a provided mapping, and remaps these references in-place to instead point
to the value corresponding to each key.
The motivation for adding this method is to enable writing a module that
provide an enhanced version of imp.reload. The builtin reload function
is very useful for iterating on a single module within the Python interpreter
shell, but in more complex situations it very limited.
In particular, instances of classes declared in the reloaded module will
continue to reference the old versions of the classes, and other modules
that imported elements of the old module using the 'from ... import ...'
syntax will continue to refer to the stale version of the functions or classes
that they imported.
The gc.remap() function enables writing a new version of reload which uses
imp.reload to reload a module and then replaces all references to stale objects
from the old module to instead point to equivalent newly defined objects.
This still has many limitations, for instance if an __init__ function has been
changed the new __init__ will not be run on old instances. On the other hand,
in many cases this is sufficient to continue iterating on code without needing
to restart the Python environment, which can be a significant time savings.
I initially tried to implement this reloading strategy entirely in Python using
gc.getreferrers() to find references to objects defined in the old module,
but I found it was too difficult to reliably replace references in objects once
they had been found. Since the GC already has a way to find all fields that
refer to objects, it seemed fairly straightforward to extend that mechanism to
additionally modify references.
This reloading strategy is documented in more detail here:
http://doublestar.org/in-place-python-reloading/
A potentially controversial aspect of this change is that the signature of the
visitproc has been modified to take (PyObject **) as an argument instead of
(PyObject *) so that a visitor can modify fields visited with Py_VISIT. A few
traverse functions in the standard library also had to be changed to use
Py_VISIT on the actual members rather than on aliased pointers.
I also have a prototype of an enhanced reload function using gc.remap. This
is only a partial implementation of the proposal, in particular it does not
rehash dictionaries that have been invalidated as a result of reloading, and
it does not support custom __reload__ hooks. A link to the code as well as
some examples are here:
http://doublestar.org/python-hot-loading-prototype/
Please let me know if you have any feedback on the reloading proposal, the
hot loading prototype, or on the patch.
Thanks,
Peter
More information about the Python-Dev
mailing list