[Python-Dev] Pre-PEP: Redesigning extension modules

Eric Snow ericsnowcurrently at gmail.com
Fri Sep 6 07:26:31 CEST 2013


On Sat, Aug 24, 2013 at 7:07 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> PEP 3121 would no longer be necessary. Extension types can do all we need.
> No more special casing of modules, that was the idea.
>

One nice thing about PEP 3121 is the addition of md_state to module objects
to store internal module state.  Wouldn't we be better served by improving
the related API rather than abandoning it?  If md_state were the home for
all mutable internal state then load/reload could focus directly on just
md_state and md_dict and not worry about other internal state, since all
remaining state would be immutable (refcounts notwithstanding).  If the API
made this easier then we could leverage the strengths of PEP 3121 to make
loading safer and more independent.  Of course, we could certainly go the
other way and actively discourage mutable internal state...

This, coupled with the PEP 451-compatible API and with a proxying wrapper,
would go a long way toward addressing the various "reloading" issues that
extension modules have.

On Sun, Aug 25, 2013 at 5:54 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> (regarding reloading into the existing module's namespace)
>
> I'm not sure this can be done in general. What if the module has threads
> running that access the global state? In that case, reinitialising the
> module object itself would almost certainly lead to a crash.
>
> And what if you do "from extmodule import some_function" in a Python
> module? Then reloading couldn't replace that reference, just as for normal
> Python modules. Meaning that you'd still have to keep both modules properly
> alive in order to prevent crashes due to lost global state of the imported
> function.
>
> The difference to Python modules here is that in Python code, you'll get
> some kind of exception if state is lost during a reload. In C code, you'll
> most likely get a crash.
>
> How would you even make sure global state is properly cleaned up? Would you
> call tp_clear() on the module object before re-running the init code? Or
> how else would you enable the init code to do the right thing during both
> the first run (where global state is uninitialised) and subsequent runs
> (where global state may hold valid state and owned Python references)?
>
> Even tp_clear() may not be enough, because it's only meant to clean up
> Python references, not C-level state. Basically, for reloading to be
> correct without changing the object reference, it would have to go all the
> way through tp_dealloc(), catch the object at the very end, right before it
> gets freed, and then re-initialise it.
>

Right.  It would probably require a separate state-initialization hook,
something like `PyImportInitializeState_<module>(PyObject *mod)`, and/or
some API that makes it easier to manage mutable internal module state (on
md_state).
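
To make the shape of that concrete, here is a rough pure-Python model of the
idea, not a C implementation.  Everything in it is hypothetical: `ModuleState`
stands in for md_state, `initialize_state` stands in for the proposed
per-module hook, and `load`/`reload` are toy helpers rather than anything in
importlib.  The point is only that the hook has to cope with both the first
run (no state yet) and later runs (old state still attached).

```python
import types


class ModuleState:
    """Hypothetical stand-in for md_state: the only home of mutable state."""

    def __init__(self):
        self.cache = {}
        self.call_count = 0


def initialize_state(mod):
    """Toy analogue of a per-module state-initialization hook.

    Handles the first load (no state attached) and a reload (old state
    attached and possibly owning references that must be released).
    """
    old = getattr(mod, "_state", None)
    if old is not None:
        old.cache.clear()  # release whatever the previous state owned
    mod._state = ModuleState()


def load(name):
    """Create the module object once and attach fresh state to it."""
    mod = types.ModuleType(name)
    initialize_state(mod)
    return mod


def reload(mod):
    """Keep the same module object; only the state is rebuilt."""
    initialize_state(mod)
    return mod


mod = load("extmodule")
mod._state.call_count += 1
mod._state.cache["key"] = "value"

same = reload(mod)
```

After the reload, `same` is the very same object as `mod`, while its state
has been reset.  None of this addresses threads that are still using the old
state, which is Stefan's other objection.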


On Sun, Aug 25, 2013 at 6:36 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> PJ Eby, 25.08.2013 06:12:
> > My "Importing" package offers lazy imports by creating module objects
> > in sys.modules that are a subtype of ModuleType, and use a
> > __getattribute__ hook so that trying to use them fires off a reload()
> > of the module.
>
> I wonder if this wouldn't be an approach to fix the reloading problem in
> general. What if extension module loading, at least with the new scheme,
> didn't return the module object itself and put it into sys.modules but
> created a wrapper that redirects its __getattr__ and __setattr__ to the
> actual module object? That would have a tiny performance impact on
> attribute access, but I'd expect that to be negligible given that the usual
> reason for the extension module to exist is that it does non-trivial stuff
> in whatever its API provides. Reloading could then really create a
> completely new module object and replace the reference inside of the
> wrapper.
>
> That way, code that currently uses "from extmodule import xyz" would
> continue to see the original version of the module as of the time of its
> import, and code that just did "import extmodule" and then used attribute
> access at need would always see the current content of the module as it was
> last loaded. I think that, together with keeping module global state in the
> module object itself, would nicely fix both cases.
>

At first blush I like this.
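
A minimal sketch of how I read the wrapper idea, in Python rather than C.
The names `ModuleProxy`, `swap_target` and `load_fake_module` are made up for
illustration, and `load_fake_module` merely stands in for running an
extension module's init code.

```python
import types


class ModuleProxy:
    """Hypothetical wrapper that forwards attribute access to a module."""

    def __init__(self, target):
        # Go through object.__setattr__ so the reference is stored on the
        # proxy itself and not forwarded to the wrapped module.
        object.__setattr__(self, "_target", target)

    def __getattribute__(self, name):
        target = object.__getattribute__(self, "_target")
        return getattr(target, name)

    def __setattr__(self, name, value):
        target = object.__getattribute__(self, "_target")
        setattr(target, name, value)


def swap_target(proxy, new_target):
    """Model a reload: replace the module object held inside the proxy."""
    object.__setattr__(proxy, "_target", new_target)


def load_fake_module(version):
    """Stand-in for loading an extension module (not a real loader)."""
    mod = types.ModuleType("extmodule")
    mod.version = version
    mod.some_function = lambda: version
    return mod


extmodule = ModuleProxy(load_fake_module(1))

# Equivalent of "from extmodule import some_function": a direct reference.
some_function = extmodule.some_function

# "Reload": a completely new module object replaces the old one.
swap_target(extmodule, load_fake_module(2))

old_result = some_function()            # still bound to the first module
new_result = extmodule.some_function()  # attribute access sees the new one
```

Here the directly imported function keeps working against the module it came
from, while attribute access through the wrapper picks up the newly loaded
module, which matches the two cases Stefan describes.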

-eric

p.s. Bear with me if I've missed something in the thread.  I'm slogging
through a backlog of email.