On Thu, Sep 14, 2017 at 8:07 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Wed, Sep 13, 2017 at 12:24:31PM +0900, INADA Naoki wrote:
> I'm quite worried about performance.
>
> Dict has had ma_version_tag since Python 3.6, to be used for future
> optimizations including global caching.
> Adding another abstraction layer may make that difficult.

Can we make it opt-in, by replacing the module __dict__ when and only if
needed? Perhaps we could replace it on the fly with a dict subclass that
defines __missing__? That's virtually the same as __getattr__.

Then modules which haven't replaced their __dict__ would not see any
slowdown at all.

Does any of this make sense, or am I talking nonsense on stilts?

This is more or less what I was describing here:

https://mail.python.org/pipermail/python-ideas/2017-September/047034.html
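To make the `__missing__` idea quoted above concrete, here is a minimal sketch. It assumes the namespace is handed to `exec` rather than swapped into a real module, since whether a module's `__dict__` could be replaced on the fly is exactly the open question. `LazyGlobals` and the name `answer` are made up for illustration.

```python
class LazyGlobals(dict):
    """Hypothetical namespace that computes missing names on first use."""

    def __missing__(self, name):
        # Only called for names absent from the dict, so lookups of
        # names that already exist never reach this method.
        if name == "answer":
            self[name] = 42  # cache it: later lookups are plain dict hits
            return self[name]
        raise KeyError(name)  # lets the lookup fall through to builtins


# Module-level code run with a dict subclass as its namespace.
namespace = LazyGlobals(__name__="fancy")
exec("result = answer + 1", namespace)
print(namespace["result"])
```

Note this only covers name lookup from code running inside the namespace. Attribute access from outside (`module.answer`) takes a different path, so it would likely need separate handling.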

I am also looking at Neil's approach this weekend though.

I would be happy with a `__future__` import that enacted whatever concessions are necessary to define a module as if it were a class body, perhaps with import statements being implicitly global. Such a "new-style" module would preferably avoid the need to populate `sys.modules` with something that can't possibly exist yet (since it's still being defined!). Maybe we allow module bodies to contain a `return` or `yield`, making them a simple function or generator? The presence of either would activate this "new-style" module loading:

* Modules that use `return` should return the completed module. Importing yourself indirectly would likely cause recursion or be an error (lazy importing would really help here!). This could conceptually expand to something like:

```
global __class__
global __self__

class __class__:
    def __new__(... namespace-dunders-and-builtins-passed-as-kwds ...):
        # ... module code ...
        # ... closures may access __self__ and __class__ ...
        return FancyModule(__name__)

__self__ = __class__(__builtins__={...}, __name__='fancy', ...)
sys.modules[__self__.__name__] = __self__
```
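The `return` variant can be approximated with today's Python. This is a rough sketch, not the proposed semantics. `module_body` stands in for the module source, `fancy_return_demo` is a made-up module name, and the property on `FancyModule` is only there to show class-level behaviour.

```python
import sys
import types


class FancyModule(types.ModuleType):
    """Hypothetical module type used only for this illustration."""

    @property
    def answer(self):
        return 42


def module_body(name):
    # Stand-in for the module source: it runs as a function body and
    # returns the finished module instead of filling in a pre-made one.
    module = FancyModule(name)

    def helper():
        # The closure reaches the module through a cell, not sys.modules.
        return module.answer

    module.helper = helper
    return module


# What a loader might do: only a completed module is ever registered.
fancy = module_body("fancy_return_demo")
sys.modules[fancy.__name__] = fancy

import fancy_return_demo  # resolved from the sys.modules cache

print(fancy_return_demo.helper())
```

Nothing is placed in `sys.modules` until the body has finished, which is the property I'm after.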

* Modules that use `yield` should yield modules. This could allow defining zero modules, multiple modules, or overwriting the same module multiple times. Module-level code may then yield an initial object so that self-referential imports work better in lieu of deferred loading. It might later upgrade the initial module's `__class__` (similar to today) or replace the module outright. This could conceptually expand to something like:

```
global __class__
global __self__

def __hidden_TOS(... namespace-dunders-and-builtins-passed-as-kwds ...):
    # ... initial module code ...
    # ... closures may access __self__ and __class__ ...
    # A plain for loop sends None into the generator, so keep a reference.
    module = FancyModuleInitialThatMightRaiseIfUsed(__name__)
    yield module
    # ... more module code ...
    module.__class__ = FancyModule

for __self__ in __hidden_TOS(__builtins__={...}, __name__='fancy', ...):
    __class__ = __self__.__class__
    sys.modules[__self__.__name__] = __self__
```
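The `yield` variant can be emulated the same way. In this sketch, `InitialModule`, `FancyModule` and `fancy_yield_demo` are made-up names. The generator keeps its own reference to the yielded module instead of relying on a value being sent back in, because a plain `for` loop sends nothing.

```python
import sys
import types


class InitialModule(types.ModuleType):
    """Hypothetical placeholder handed out while the body is still running."""


class FancyModule(types.ModuleType):
    """Hypothetical final module type."""

    @property
    def ready(self):
        return True


def module_body(name):
    # Initial "module code": publish something importable early.
    module = InitialModule(name)
    yield module
    # More "module code", then upgrade the same object in place.
    module.value = 1
    module.__class__ = FancyModule


# What a loader might do: register whatever the body yields.
seen = []
for mod in module_body("fancy_yield_demo"):
    sys.modules[mod.__name__] = mod
    seen.append(type(mod).__name__)

print(seen, type(sys.modules["fancy_yield_demo"]).__name__)
```

The registered object is the same one throughout; only its `__class__` changes once the body finishes.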

Otherwise, I still have a few ideas around using what we've got, possibly in a backwards-compatible way:

```
global __builtins__ = {...}
global __class__
global __self__

# Loader dunders.
__name__ = 'fancy'

# Deferred loading could likely stop this from raising in most cases.
# globals is a deferred import dict using __missing__.
# possibly sys.modules itself does deferred imports using __missing__.
sys.modules[__name__] = RaiseIfTouchedElseReplaceAllRefs(globals())

class __class__:
    [global] import current_module # ref in cells replaced with __self__
    [global] import other_module

    def bound_module_function(...):
        pass

    [global] def simple_module_function(...):
        pass

    # ... end module body ...

    # Likely still a descriptor.
    __dict__ = globals()

__self__ = __class__()
sys.modules[__self__.__name__] = __self__
```
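A very rough approximation of the placeholder-then-replace flow, using only what exists today. This simplified `RaiseIfTouched` just raises; it does not attempt the "replace all refs" part. `FancyModule` and `fancy_class_demo` are made-up names.

```python
import sys
import types


class RaiseIfTouched(types.ModuleType):
    """Hypothetical placeholder: using it before the body finishes fails loudly."""

    def __getattr__(self, attr):
        # Reached only when normal lookup fails, so __name__ still works.
        raise ImportError(
            f"{self.__name__!r} is still being defined; touched {attr!r}"
        )


class FancyModule(types.ModuleType):
    """Hypothetical stand-in for the class-shaped module body."""

    def bound_module_function(self):
        return self.__name__


name = "fancy_class_demo"
sys.modules[name] = RaiseIfTouched(name)

# Touching the placeholder too early raises instead of handing back
# a half-built module.
try:
    sys.modules[name].anything
except ImportError as exc:
    early_error = str(exc)

# Once the "body" is complete, the real instance takes over the slot.
module = FancyModule(name)
sys.modules[name] = module
print(early_error)
print(module.bound_module_function())
```

Anyone who grabbed the placeholder before the swap would still be holding it, which is the gap the "replace all refs" part is meant to close.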

Something to think about.

Thanks,

-- 

C Anthony