[Python-ideas] Prevent importing yourself?

Steven D'Aprano steve at pearwood.info
Wed Feb 3 19:30:43 EST 2016


On Sat, Jan 30, 2016 at 07:58:49PM -0500, Ned Batchelder wrote:
> On 1/30/16 5:47 PM, Steven D'Aprano wrote:
> >On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
> >
> >>While we're at it though, re-importing __main__ is a separate kind of
> >>behavior that is often a problem, since it means you'll have the same
> >>classes defined twice.
>
> >As far as I can tell, importing __main__ is fine. It's only when you
> >import __main__ AND the main module under its real name at the same time
> >that you can run into problems -- and even then, not always. The sort of
> >errors I've seen involve something like this:
> >
> >import myscript
> >import __main__  # this is actually myscript
> >a = myscript.TheClass()
> ># later
> >assert isinstance(a, __main__.TheClass)
> >
> >which fails, because myscript and __main__ don't share state, despite
> >actually coming from the same source file.
> >
> >So I think it's pretty rare for something like this to actually happen.
> >I've never seen it happen by accident, I've only seen it done
> >deliberately as a counter-example to the "modules are singletons"
> >rule.
> 
> Something like this does happen in the real world.  A class is defined 
> in the main module, and then the module is later imported with its real 
> name.  Now you have __main__.Class and module.Class both defined.  You 
> don't need to actually "import __main__" for it to happen. 
> __main__.Class is used implicitly from the main module simply as Class.


Ah yes, of course you are correct: you don't need an explicit "import 
__main__", but you do need an explicit "import module" somewhere.

I think that in order to have a problem, you need something like this 
set of circumstances:


(1) You need a module which is intended to be used as BOTH an executable 
script and an importable library.

(2) Your module ends up directly or indirectly importing itself when 
running as __main__.

(3) Your code relies on the "module is a singleton" invariant that you 
have just violated. (If you ever do something like `if type(obj) is 
MyClass`, you're relying on that invariant.)

(If your module is never __main__, you have no problem. If your module 
is always __main__ and never imported under the real file name, you have 
no problem. But if it is both, you may have a problem.)
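To make the three conditions above concrete, here is a minimal sketch. 
The file name demo_script.py, the class name TheClass and the helper 
run_demo() are all made up for illustration. The helper writes a small 
script to a temporary directory and runs it as the main module; the 
script then imports itself under its real name.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical script used as BOTH an executable and an importable module.
SCRIPT = textwrap.dedent('''
    class TheClass:
        pass

    if __name__ == "__main__":
        import sys
        import demo_script  # executes this same file a second time

        obj = demo_script.TheClass()
        # TheClass here is __main__.TheClass, a different class object.
        print(isinstance(obj, TheClass))
        print(sys.modules["__main__"] is sys.modules["demo_script"])
''')


def run_demo():
    """Run the script as __main__ and return the lines it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # The script's directory becomes sys.path[0], so the self-import works.
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


if __name__ == "__main__":
    print(run_demo())  # ['False', 'False']
```

Both checks print False: the file has been executed twice, giving two 
module objects and two distinct TheClass objects, which is exactly the 
broken invariant from condition (3).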

I'm prepared to believe that actually diagnosing this error can be 
difficult, especially for those who have never come across this before. 
But I don't think it is especially common.

Rather than trying to ban self imports, could we change sys.modules so 
it caches the module object under *both* the original name and 
'__main__'?


# running "script.py" as the main module
sys.modules['__main__'] = sys.modules['script'] = module_object

# now "import script" will pick up the same module object as __main__

If the script name isn't a valid identifier, there's no need for the 
second entry, since it can't be named in an import statement.
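The effect of the double entry can be approximated by hand today. The 
sketch below is only an illustration of the idea, not how the 
interpreter behaves; the names alias_demo.py, TheClass and 
run_aliased_demo() are invented for the example. The script registers 
its own __main__ module object under its file name before importing 
itself.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical script that aliases itself in sys.modules by hand.
ALIASED_SCRIPT = textwrap.dedent('''
    import sys

    class TheClass:
        pass

    if __name__ == "__main__":
        # Manual stand-in for the proposed second sys.modules entry.
        sys.modules.setdefault("alias_demo", sys.modules["__main__"])

        import alias_demo  # found in sys.modules, so the file is not re-run

        obj = alias_demo.TheClass()
        print(alias_demo is sys.modules["__main__"])
        print(isinstance(obj, TheClass))
''')


def run_aliased_demo():
    """Run the aliased script as __main__ and return the lines it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "alias_demo.py")
        with open(path, "w") as f:
            f.write(ALIASED_SCRIPT)
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


if __name__ == "__main__":
    print(run_aliased_demo())  # ['True', 'True']
```

With the alias in place there is a single module object and a single 
TheClass, so the isinstance check passes. Note that the module's 
__name__ is still "__main__" in this sketch.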

(I have a vague feeling I've already asked this question before, but I 
can't find any mail with it. If I have asked it, and it's been answered, 
my apologies. What was the answer again?)


-- 
Steve

