[Python-Dev] Import hassle

Guido van Rossum guido@zope.com
Thu, 26 Jul 2001 11:25:51 -0400


> I've been writing quite a few mails lately, all concerning import
> problems. I thought I'd write a little longer mail to explain what I'm
> doing and what I find strange here.

Martin,

Why does this interest you?  This never happens in reality unless your
memory allocator is broken, and then you have worse problems than
"leaks".

Also, why are you posting to python-dev?

> Basically all (at least the 10-20 ones I've checked) the C modules in the
> distribution have one thing in common: if something in their initFoo()
> function fails, they return without freeing any memory. I.e. they return
> an incomplete module.
> 
> The only way I can think of that one of the standard modules could
> fail is when you're out of memory, and that's kinda hard to
> simulate, so I put in a faked failure, i.e. I raised an exception
> and returned prematurely (in one of my own C modules, not one in the
> distribution!).
> 
> The code looks like this:
>     PyErr_SetString(PyExc_ImportError, "foo");
>     return;
>     /* do other things here, this "fails" */
> 
> >>> import Foo
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ImportError: foo
> >>> import Foo
> >>> dir()
> ['Foo', '__builtins__', '__doc__', '__name__']
> 
> Huh?! How did this happen? What is Foo doing there?

In general, when an import fails after a certain point, the module has
already been created in sys.modules.  There is a reason for this,
having to do with recursive (circular) imports; see the A/B example at
the end of this message.

> Even more interesting, say that I create a submodule and throw in a
> bunch of PyCFunctions in it (I stole the code from InitModule since
> I don't know how to fake submodules in a C module in another way, is
> there a way?). I create the module, fail on inserting it into the
> dictionary and DECREF it.  Now, that ought to free the darn
> submodule, doesn't it? Anyway, I wrote a simple "mean" script to
> test this:
> 
> try: import Foo
> except: import Foo
> while 1:
>   try: reload(Foo)
>   except: pass
> 
> And this leaks memory like I-don't-know-what!
> What memory doesn't get freed?

Memory leaks are hard to find.  I prefer to focus on memory leaks that
occur in real situations, rather than theoretical leaks.

> Now to my questions: What exactly SHOULD I do when loading my module fails
> halfway through? Common sense says I should free the memory I've used and
> the module object ought to be unusable.

You should free the memory if you care.  "Disabling" the module is
unnecessary -- in practice, the program usually quits when an import
fails anyway.

> Why-oh-why can I import Foo, catch the exception, import it again and it
> shows up in the dictionary? What's the purpose of this?
> 
> How do I work with submodules in a C module?
> 
> I find the import semantics really weird here, something is not quite
> right...

Consider two modules, A and B, where A imports B and B imports A.
This is perfectly legal, and works fine as long as B's module
initialization doesn't use names defined in A.

In order to make this work, sys.modules['A'] is initialized to an empty
module and filled with names during A's initialization; ditto for
sys.modules['B'].
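
As a rough illustration of the mechanism just described, here is a
minimal sketch for a current Python 3 interpreter.  The module names
demo_circ_a and demo_circ_b and the helper run_circular_demo are
hypothetical stand-ins for A and B; the sketch writes the two modules
to a temporary directory and records what B can see of A while A is
still only partly initialized.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module names for this sketch; they stand in for A and B.
A_SOURCE = (
    "import demo_circ_b            # starts B while A is still being filled in\n"
    "value = 'defined in A'        # only bound after B has finished\n"
)
B_SOURCE = (
    "import sys\n"
    "import demo_circ_a            # receives the partially initialized A\n"
    "a_in_sys_modules = 'demo_circ_a' in sys.modules\n"
    "a_had_value = hasattr(demo_circ_a, 'value')\n"
)


def run_circular_demo():
    """Import A (which imports B, which imports A) and report what B saw."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("demo_circ_a", A_SOURCE), ("demo_circ_b", B_SOURCE)):
            with open(os.path.join(tmp, name + ".py"), "w") as handle:
                handle.write(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            a = importlib.import_module("demo_circ_a")
            b = sys.modules["demo_circ_b"]
            return {
                # A was already registered while B was initializing ...
                "a_in_sys_modules_during_b": b.a_in_sys_modules,
                # ... but A's names had not been bound yet.
                "a_had_value_during_b": b.a_had_value,
                # B holds a reference to the very same module object.
                "b_refers_to_a": b.demo_circ_a is a,
                "a_value_after_import": a.value,
            }
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("demo_circ_a", None)
            sys.modules.pop("demo_circ_b", None)


if __name__ == "__main__":
    print(run_circular_demo())
```

The point of the sketch is only that B gets a real, registered module
object for A before A has any of its names, which is why B's
initialization must not use them.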

Now suppose A triggers an exception after it has successfully loaded
and imported B.  B already has a reference to A.  A is not completely
initialized, but it's not empty either.  Should we delete B's
reference to A?  No -- that's interference with B's namespace, and we
don't know whether B might have stored references to A elsewhere, so
we don't know if this would be effective.  Should we delete
sys.modules['A']?  I don't think so.  If we delete sys.modules['A'],
and later someone attempts to import A again, the following will
happen: when A imports B, it finds sys.modules['B'], so it doesn't
reload B; it will use the existing B.  But now B has a reference to
the *old* A, not the new one.
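
The stale-reference scenario can be sketched as follows, again for a
current Python 3 interpreter.  Everything named here is hypothetical:
the modules demo_stale_a and demo_stale_b stand in for A and B, the
environment variable DEMO_STALE_FAIL simulates a problem that is later
repaired, and the explicit sys.modules.pop() stands in for the
"delete sys.modules['A']" cleanup, so the sketch does not depend on
what a given interpreter version does on failure.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module names and environment variable, for illustration only.
A_SOURCE = (
    "import os\n"
    "import demo_stale_b           # B is loaded and keeps a reference to A\n"
    "if os.environ.get('DEMO_STALE_FAIL') == '1':\n"
    "    raise ImportError('simulated failure after importing B')\n"
    "ready = True\n"
)
B_SOURCE = "import demo_stale_a\n"


def run_stale_reference_demo():
    """Fail A once, drop it from sys.modules, re-import it, inspect B."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("demo_stale_a", A_SOURCE), ("demo_stale_b", B_SOURCE)):
            with open(os.path.join(tmp, name + ".py"), "w") as handle:
                handle.write(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        os.environ["DEMO_STALE_FAIL"] = "1"
        try:
            first_import_failed = False
            try:
                importlib.import_module("demo_stale_a")
            except ImportError:
                first_import_failed = True

            b = sys.modules["demo_stale_b"]      # B itself loaded fine
            old_a = b.demo_stale_a               # the incomplete A

            # The cleanup under discussion: forget the failed A.
            sys.modules.pop("demo_stale_a", None)

            # "Repair" the problem and import A a second time.
            os.environ["DEMO_STALE_FAIL"] = "0"
            new_a = importlib.import_module("demo_stale_a")

            return {
                "first_import_failed": first_import_failed,
                "new_a_is_complete": getattr(new_a, "ready", False),
                "old_a_is_complete": hasattr(old_a, "ready"),
                # B was not re-executed, so it still points at the old A.
                "b_still_holds_old_a": b.demo_stale_a is old_a,
                "old_a_is_new_a": old_a is new_a,
            }
        finally:
            os.environ.pop("DEMO_STALE_FAIL", None)
            sys.path.remove(tmp)
            sys.modules.pop("demo_stale_a", None)
            sys.modules.pop("demo_stale_b", None)


if __name__ == "__main__":
    print(run_stale_reference_demo())
```

Under these assumptions the second import yields a complete new A,
while B keeps referring to the old, incomplete module object.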

There are now two possibilities: either the second import of A somehow
succeeds (this could only happen if somehow the problem that caused it
to trigger an exception was repaired before the second attempted
import), or the second import of A fails again.  If it succeeds, the
situation is still broken, because B references the old, incomplete
A.  If it fails, we may end up in an infinite loop, attempting to
reimport A, failing, and catching the exception forever.  Neither is
good.

--Guido van Rossum (home page: http://www.python.org/~guido/)