[Python-ideas] Implicit submodule imports
Nathaniel Smith
njs at pobox.com
Fri Sep 26 20:15:35 CEST 2014
On 26 Sep 2014 15:59, "Andrew Barnert" <abarnert at yahoo.com> wrote:
>
> On Sep 25, 2014, at 18:02, Nathaniel Smith <njs at pobox.com> wrote:
>
> > On Thu, Sep 25, 2014 at 11:31 PM, Greg Ewing
> > <greg.ewing at canterbury.ac.nz> wrote:
> >> Nathaniel Smith wrote:
> >>>
> >>> They are really really hard
> >>> to do cleanly, and you risk all kinds of breakage in edge-cases (e.g.
> >>> try reload()'ing a module that's been replaced by an object).
> >>
> >> One small thing that might help is to allow the
> >> __class__ of a module to be reassigned to a
> >> subclass of the module type. That would allow
> >> a module to be given custom behaviours, while
> >> remaining a real module object so that reload()
> >> etc. continue to work.
> >
> > Heh, I was actually just pondering whether it would be opening too big
> > a can of worms to suggest this myself. This is the best design I
> > managed to come up with last time I looked at it, though in existing
> > versions of python it requires ctypes hackitude to accomplish the
> > __class__ reassignment. (The advantages of this approach are that (1)
> > you get to use the full class machinery to define your "metamodule",
> > (2) any existing references to the module get transformed in-place, so
> > you don't have to worry about ending up with a mixture of old and new
> > instances existing in the same program, (3) by subclassing and
> > avoiding copying you automatically support all the subtleties and
> > internal fields of actual module objects in a forward- and
> > backward-compatible way.)
>
> When I tried this a year or two ago, I did I with an import hook that
allows you to specify metaclass=absolute.qualified.spam in any comment that
comes before any non-comment lines, so you actually construct the module
object as a subclass instance rather than re-classing it.
>
> In theory that seems a lot cleaner. In practice it's a weird way to
specify your type; it only works if the import-hooking module and the
module that defines your type have already been imported and otherwise
silently does the wrong thing; and my implementation was pretty hideous.
>
> Is there a cleaner version of that we could do if we were modifying the
normal import machinery instead of hooking it, and if it didn't have to
work pre-3.4, and if it were part of the language instead of a hack?
>
> IIRC (too hard to check from my phone on the train), a module is built by
calling exec with a new global dict and then calling the module constructor
with that dict, so it's just a matter of something like:
>
> cls = g.get('__metamodule__', module)
> if not issubclass(cls, module):
> raise TypeError('metamodule
> {} is not a module type'.format(cls))
> mod = cls(name, doc, g)
> # etc.
>
> Then you could import the module subclass and assign it to __metamodule__
from inside, rather than needing to pre-import stuff, and you'd get
perfectly understandable errors, and so on.
>
> It seems less hacky and more flexible than re-classing the module after
construction, for the same reason metaclasses and, for that matter, normal
class constructors are better than reclassing after the fact.
>
> Of course I could be misremembering how modules are constructed, in which
case... Never mind.
Alas, in this regard module objects are different than classes; they're
constructed and placed in sys.modules before the body is exec'ed. And
unfortunately it has to work that way, because if foo/__init__.py does
'import foo.bar', then the module 'foo' has to be immediately resolvable,
before __init__.py finishes executing. A similar issue arises for circular
imports.
So this would argue for your 'magic comment' or special syntax approach,
sort of like how __future__ imports work. This part we could do non-hackily
if we modified the import mechanism itself. But we'd still have the other
problem you mention, that the metamodule would have to be defined before
the module is imported. I think this is a showstopper, given that the main
use cases for metamodule support involve using it for top-level package
namespaces. If the numpy project wants to define a metamodule for the
'numpy' namespace, then where do they put it?
So I think we necessarily will always start out with a regular module
object, and our goal is to end up with a metamodule instance instead. If
this is right then it means that even in principle we really only have two
options, so we should focus our attention on these.
Option 1: allocate a new object, shallowly copy over all the old object
properties into the new one, and then find all references to the old object
and replace them with the new object. This is possible right now, but error
prone: cloning a module object requires intimate knowledge of which fields
exist, and swapping all the references requires that we be careful to
perform the swap very early, when the only reference is the one in
sys.modules.
Option 2: the __class__ switcheroo. This avoids the two issues above. In
exchange it's fairly brain-hurty.
Oh wait, I just thought of a third option. It only works for packages, but
that's okay, you can always convert a module into a package by a simple
mechanical transformation. The proposal is that before exec'ing
__init__.py, we check for the existence of a __preinit__.py, and if found
we do something like
sys.modules[package] = sentinel to block circular imports
namespace = {}
exec __preinit__.py in namespace
cls = namespace.get("__metamodule___", ModuleType)
mod = cls(name, doc, namespace)
sys.modules[package] = mod
exec __init__.py in namespace
So preinit runs in the same namespace as init, but with a special
restriction that if it tries to (directly or indirectly) import the current
package, then this will trigger an ImportError. This is somewhat
restrictive, but it does allow arbitrary code to be run before the module
object is created.
-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20140926/7f18afd9/attachment-0001.html>
More information about the Python-ideas
mailing list