[Python-Dev] Re: Can we limit the effects of module execution to sys.modules? (was Fix import errors to have data)

Sat Jul 31 06:17:15 CEST 2004

[Tim, wants to keep insane modules out of sys.modules]

[Jim Fulton]
> I sympathize with your frustration with this problem, but I think that
> the problem is bigger than just sys.modules.  For better or worse, importing
> a module may have side effects that extend beyond sys.modules. For example,
> in some applications, objects get registered into registries that exist in
> already-imported modules.  Perhaps we want to declare this to be a
> poor style.  If a module has an impact beyond new modules added to
> sys.modules, then removing all modules imported into sys.modules as
> a result of attempting the import would produce bugs even more subtle
> than what we have now.

I wouldn't want to remove all, just the modules that failed.  For example,

    A imports B
        B imports C # no problem
        B imports D # and that raises an exception not caught by B

C is fine, I only want to nuke D and B.
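
A minimal sketch of that policy, for illustration only. The module names demo_a through demo_d are hypothetical stand-ins for A, B, C and D, written to a temporary directory so the example is self-contained. As far as I know a current CPython already drops a module from sys.modules when its body raises, so the check below simply observes which modules survive; the interpreter this message was written against left the broken ones in place.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical throwaway modules mirroring the A/B/C/D chain above.
SOURCES = {
    "demo_a": "import demo_b\n",
    "demo_b": "import demo_c\nimport demo_d\n",
    "demo_c": "VALUE = 1\n",
    "demo_d": "raise RuntimeError('D is broken')\n",
}


def run_demo():
    """Import demo_a and report which demo modules remain in sys.modules."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, body in SOURCES.items():
            with open(os.path.join(tmp, name + ".py"), "w") as f:
                f.write(body)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            try:
                importlib.import_module("demo_a")
            except RuntimeError:
                pass  # D's failure propagates up through B and A
            return {name: name in sys.modules for name in SOURCES}
        finally:
            sys.path.remove(tmp)


survivors = run_demo()
print(survivors)
```

The interesting entries are demo_c, which finished importing before anything went wrong, and demo_b and demo_d, which did not.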

As to style, in my own code I strive to make modules "reload safe". 
So, for example, I wouldn't even consider doing one of these things as
a side effect of merely importing a module:

+ Create a lock file.
+ Start a thread.
+ Open a socket.
+ Register with a registry indeed.

Now that said, I've only seen imports wrapped in a try in two ways:

1.

try:
    import X
except ImportError:
    something

That's invariably trying to check for the availability of X, though,
not also trying to check whether something X imports is missing.
If you pursue a saner way to write that, I'll always use it.

2.

try:
    import X
except:
    something

That one is almost always a mistake, as a bare "except" is almost
always a mistake in any context.  The author almost always intended
the same thing as #1, but was too lazy or inexperienced to write that.
Bug in, bugs out.  That a later attempt to import X doesn't also fail
is a bug magnifier.

I've never seen something like

3.

try:
    import X
except ZeroDivisionError:
    something

If, as I suspect, nobody (and "almost nobody" is the same to me
<wink>) *intends* to catch an error from an import other than
ImportError, then import errors other than ImportError are fatal soon
after in practice, and then there's nothing much to worry about.

Catching ImportError still leaves insane modules around, though, and
that does cause real problems.  You've convinced me I'd rather have a
better way to spell "does X exist?" than catching an ImportError from
an attempt to import X.
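
One way to spell "does X exist?" without running X, sketched with importlib.util.find_spec from later Python versions (it did not exist when this was written). The helper name module_exists is hypothetical, and the sketch assumes a top-level module name: for a dotted name, find_spec imports the parent packages, which does execute code.

```python
import importlib.util


def module_exists(name):
    """Return True if a top-level module called `name` can be located.

    Hypothetical helper. For a top-level name, find_spec() asks the
    import finders for a spec and does not execute the module's body.
    """
    return importlib.util.find_spec(name) is not None


print(module_exists("json"))
print(module_exists("no_such_module_xyz_12345"))
```

Because nothing in X has run, a negative answer here cannot leave a half-initialized module behind.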

> Do you think it's practical to limit the effects of module import to
> sys.modules, even by convention?

I'm sure you didn't intend that to be *so* extreme -- like surely a
module is allowed to initialize its own module-level variables.  If I
read "effects" as "effects visible outside the module", then that's
what you said <wink>.

> Could we say that it is a bug for code executed during module import to
> mutate other modules, including mutating objects contained in those other
> modules?  I would support this myself.

It's hard to spell the intent precisely.  "reload safe" covers a world
of non-local effects (wrt the module in question), some of which are
problematic and some of which aren't.  For example, calling random.random() during module
initialization should be fine, but it certainly mutates state, and
irrevocably so, inside the random module. Because it's hard to be
precise here, best practice is likely to remain more a matter of good
judgment than of legislation.
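
A small illustration of the random.random() point, assuming only the standard random module: a single call changes the state held inside that module, which is exactly the kind of harmless cross-module mutation that a strict rule would forbid.

```python
import random

# Snapshot the module-level generator's state, draw once, snapshot again.
before = random.getstate()
random.random()
after = random.getstate()

# The shared generator inside the random module has been advanced.
state_changed = before != after
print(state_changed)
```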

> If it is possible to limit the effects of import (even by convention),
> then I think it would be practical to roll-back changes to sys.modules.
> If it's not practical to limit the effects of module import then I think
> the problem is effectively unsolvable, short of making Python transactional.

There we don't agree -- I think it's already practical, given that
virtually no Python application *intends* to catch errors from imports
other than ImportError, so that almost all "real bugs" in module
initialization are intended to stop execution.  In turn, in the cases
where ImportErrors are intentionally caught now, they generally occur
in "import blocks" near the starts of all modules in the failing
import chain, and so none of the modules involved have yet *done* any
non-trivial initialization -- they're all still trying to import the
stuff they need to *start* doing the meat of their initialization.  If
some modules happen to import successfully along the way, fine, they
should stay in sys.modules, and then importing them again later won't
run their initialization code again.  IOW, once a module has announced
its sanity by importing successfully, I want that to "stick" no matter
what happens later.
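
A sketch of the "stick" behavior for a module that imports successfully. The module demo_sticky and the log file are hypothetical; the module body appends a line to the log each time it runs, purely so the runs can be counted (not something to do at import time in real code, per the list earlier in this message).

```python
import importlib
import os
import sys
import tempfile


def count_body_runs():
    """Import a throwaway module twice and count how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "runs.log")
        with open(os.path.join(tmp, "demo_sticky.py"), "w") as f:
            # The generated module appends one line to the log when executed.
            f.write(f"with open({log!r}, 'a') as f:\n    f.write('ran\\n')\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_sticky")
            importlib.import_module("demo_sticky")  # served from sys.modules
            with open(log) as f:
                return len(f.readlines())
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_sticky", None)


runs = count_body_runs()
print(runs)
```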

> Personally I'm inclined to consider errors that occur while executing a
> module to be pretty much fatal.  If a module has begun executing, I really
> don't know what state it's in or what state it might have left other modules
> in.  I'd rather report the error and get some human to fix it.

I think that's a widespread belief too.  Heck, if Zope doesn't violate
it, who else would be so perverse <wink>?

> OTOH, I'm happy to recover from the inability to find a module as long as no
> module code has been executed.

Having a clearer way to determine module availability/existence would
be a real help.
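
For the case where the import is attempted anyway, here is a hedged sketch that separates "X itself could not be found" from "something X imports could not be found". It relies on the name attribute that later Python versions attach to ImportError; the helper import_optional is hypothetical.

```python
import importlib


def import_optional(name):
    """Return the module, or None only if `name` itself is missing.

    Hypothetical helper. An ImportError raised on behalf of some other
    module (for example a dependency of `name`) is re-raised rather than
    being mistaken for "`name` is not installed".
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        if exc.name == name:
            return None
        raise


print(import_optional("no_such_module_xyz_12345"))
```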

> FWIW, in Zope, we generally limit non-transactional state
> changes to program startup. For that reason, we make little or no attempt to
> survive startup errors.

I've never tried to survive a startup error myself either, nor have
any Python projects I'm aware of attached to any of my previous
employers.  Anyone else?