<div dir="ltr">On Mon, Sep 2, 2013 at 7:53 AM, Nick Coghlan <span dir="ltr">&lt;<a href="mailto:ncoghlan@gmail.com" target="_blank">ncoghlan@gmail.com</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The extension module discussion on python-dev got me thinking about<br>
the different ways in which the "singleton" assumption for modules can<br>
be broken, and how to ensure that extension modules play nicely in<br>
that environment.<br>
<br>
As I see it, there are 4 ways the "singleton that survives for the<br>
lifetime of the process following initial import" assumption regarding<br>
modules can turn out to be wrong:<br>
<br>
1. In-place reload, overwriting the existing contents of a namespace.<br>
imp.reload() does this. We sort of do it for __main__, except we<br>
usually keep re-using that namespace to run *different* things, rather<br>
than rerunning the same code.<br>
<br>
2. Parallel loading. We remove the existing module from sys.modules<br>
(keeping a reference to it alive), and load a second copy.<br>
Alternatively, we call the loader APIs directly. Either way, we end up<br>
with two independent copies of the "same" module, potentially<br>
reflecting different system states at the time of execution.<br>
<br>
3. Subinterpreter support. Quite similar to parallel loading, but<br>
we're loading the second copy because we're in a subinterpreter and<br>
can't see the original.<br>
<br>
4. Unloading. We remove the existing module from sys.modules and drop<br>
all other references to it. The module gets destroyed, and we later<br>
import a completely fresh copy.<br>
<br>
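Cases 1, 2 and 4 above can be reproduced from pure Python. The following is<br>
only a rough sketch: demo_singleton is a hypothetical throwaway module written<br>
to a temporary directory, and importlib.reload is used as the later spelling<br>
of imp.reload. Case 3 needs the C API and is not shown.<br>

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway module, created only for this demonstration.
tmp = tempfile.mkdtemp()
Path(tmp, "demo_singleton.py").write_text("state = []\n")
sys.path.insert(0, tmp)
importlib.invalidate_caches()

import demo_singleton as first
first.state.append("mutated")

# 1. In-place reload: same module object, namespace re-executed.
reloaded = importlib.reload(first)
assert reloaded is first
assert first.state == []          # the mutation was overwritten

# 2. Parallel loading: drop the sys.modules entry but keep a reference.
first.state.append("old copy")
del sys.modules["demo_singleton"]
second = importlib.import_module("demo_singleton")
assert second is not first        # two independent copies now exist
assert first.state == ["old copy"] and second.state == []

# 4. Unloading: drop the entry and every reference, then import afresh.
del sys.modules["demo_singleton"]
del first, second, reloaded
fresh = importlib.import_module("demo_singleton")
assert fresh.state == []
```
<br>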
Even pure Python modules may not support these, since they may have<br>
side effects, or assume they're in the main interpreter, or other<br>
things. Currently, there is no way to signal this to the import<br>
system, so we're left with implicit misbehaviour when we attempt to<br>
reload the modules with global side effects.<br>
<br>
For a while, I was thinking we could design the import system to "just<br>
figure it out", but now I'm thinking a selection of read/write<br>
properties on spec objects may make more sense:<br>
<br>
allow_reload<br>
allow_unload<br>
allow_reimport<br>
allow_subinterpreter_import<br>
<br>
These would all default to True, but loaders and modules could<br>
selectively turn them off.<br>
<br>
They would also be advisory rather than enforced via all possible<br>
import state manipulation mechanisms. New functions in importlib.util<br>
could provide easier alternatives to directly manipulating<br>
sys.modules:<br>
<br>
- importlib.util.reload (replacement for imp.reload that checks the<br>
spec allows reloading)<br>
- importlib.util.unload (replacement for "del<br>
sys.modules[module.__name__]" that checks the spec allows unloading,<br>
and also unloads all child modules)<br>
- importlib.util.reimport (replacement for<br>
test.support.import_fresh_module that checks the spec of any existing<br>
sys.modules entry allows reimporting a parallel copy)<br>
<br>
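As a rough illustration of the advisory flags and the checking helpers, here<br>
is a minimal sketch. Nothing in it exists in importlib: checked_reload and<br>
checked_unload are hypothetical stand-ins for the proposed functions, the<br>
flags are read with getattr so that a missing flag counts as True, and the<br>
demo_pkg modules are fabricated in memory purely for the demonstration.<br>

```python
import importlib
import sys
import types


def _allows(module, flag):
    # Advisory flags default to True when the spec does not define them.
    spec = getattr(module, "__spec__", None)
    return getattr(spec, flag, True)


def checked_reload(module):
    """Hypothetical stand-in for the proposed importlib.util.reload."""
    if not _allows(module, "allow_reload"):
        raise ImportError(f"{module.__name__} does not allow reloading")
    return importlib.reload(module)


def checked_unload(module):
    """Hypothetical stand-in for the proposed importlib.util.unload."""
    if not _allows(module, "allow_unload"):
        raise ImportError(f"{module.__name__} does not allow unloading")
    name = module.__name__
    prefix = name + "."
    # Remove the module itself and all of its child modules.
    for key in list(sys.modules):
        if key == name or key.startswith(prefix):
            del sys.modules[key]


# Fabricated in-memory modules; the spec object here is only a placeholder.
parent = types.ModuleType("demo_pkg")
child = types.ModuleType("demo_pkg.child")
parent.__spec__ = types.SimpleNamespace(name="demo_pkg", allow_reload=False)
sys.modules["demo_pkg"] = parent
sys.modules["demo_pkg.child"] = child

try:
    checked_reload(parent)
except ImportError as exc:
    refused = str(exc)            # reload is refused by the flag

checked_unload(parent)            # allow_unload is unset, so it defaults to True
```

Because the checks live in the helpers rather than in sys.modules itself, code<br>
that manipulates sys.modules directly would bypass them, which matches the<br>
advisory intent described above.<br>
<br>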
One of these is not like the others... aside from the existing<br>
extension module specific mechanism defined in PEP 3121, I'm not sure<br>
we can devise a general *loader* level API to force imports for a<br>
particular name to fail in a subinterpreter. So this concern probably<br>
needs to be ignored in favour of a possible future C API level<br>
solution.</blockquote><div><br></div><div>Interesting stuff. While I think this is big enough to be tackled separately from PEP 451, I'll add a note there.</div><div><br></div><div>-eric</div></div></div></div>