[Python-ideas] Re: __init__ in module names

8 Dec 2020

      On Tue, Dec 08, 2020 at 11:47:22AM -0800, Gregory Szorc wrote:
...
> It was recently brought to my attention via
> https://github.com/indygreg/PyOxidizer/issues/317 that "__init__" in module
> names is something that exists in Python code in the wild.

Can we be clear whether you are talking about "__init__" **in** module 
names (a substring, like "my__init__module.py") or "__init__" **as** a 
module name (not a substring, "__init__.py" exactly)?

My guess is that you are only talking about the second case, can you 
confirm please?
...
> In that GitHub issue and https://bugs.python.org/issue42564, I discovered
> that what's happening is the stdlib PathFinder meta path importer is "dumb"
> and doesn't treat "__init__" in module names specially.

I would hope and expect that it doesn't. If somebody explicitly asks to 
do something, Python should do what they ask, and not something 
different.

Analogy: if I explicitly call `someobject.__init__(*args)` then I would 
expect Python to call that method, and not to translate that into a call 
to `type(someobject).__new__(*args)` because "__init__ is special".

The interpreter should do as it's told and not try to guess what I meant.
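
To make the analogy concrete, here is a minimal sketch. The class `Point` is
made up purely for illustration. Calling `__init__` explicitly re-runs the
initialiser on the existing object and creates nothing new:

```python
class Point:
    """Hypothetical class, used only to illustrate the analogy."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
identity_before = id(p)

# An explicit call does exactly what was asked: it runs __init__ on the
# same object. It is not translated into a call to __new__.
returned = p.__init__(10, 20)

same_object = id(p) == identity_before
```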
...
> If someone uses
> syntax like "import foo.__init__" or "from .__init__ import foo",
> PathFinder operates on "__init__" like any other string value and proceeds
> to probe the filesystem for the relevant {.py, .pyc, .so, etc} files. The
> "__init__" files do exist in probed locations and PathFinder summarily
> constructs a new module object, albeit with "__init__" in its name. The end
> result is you have 2 module objects and sys.modules entries referring to
> the same file, keyed to different names (e.g. "foo" and "foo.__init__").

Right. But given that the caller has *explicitly* asked for 
"foo.__init__" to be imported, presumably that is exactly the 
behaviour they want.
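
For anyone who wants to see the behaviour under discussion, the following is
a rough sketch rather than anything authoritative. It builds a throwaway
package in a temporary directory (the name `demo_pkg_317` is invented for
the example), imports it both ways, and reports what the import system
produced:

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

PKG = "demo_pkg_317"  # hypothetical package name, for illustration only


def import_package_twice():
    """Import a throwaway package as PKG and as PKG.__init__, then compare."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PKG
        pkg_dir.mkdir()
        # The module body creates a fresh list each time it is executed.
        (pkg_dir / "__init__.py").write_text("state = []\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            plain = importlib.import_module(PKG)
            dunder = importlib.import_module(PKG + ".__init__")
            return {
                "names": (plain.__name__, dunder.__name__),
                "distinct_objects": plain is not dunder,
                "same_file": os.path.samefile(plain.__file__, dunder.__file__),
                "separate_state": plain.state is not dunder.state,
            }
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop(PKG, None)
            sys.modules.pop(PKG + ".__init__", None)


result = import_package_twice()
```

The one file is executed twice, so module-level state such as `state` exists
once per name.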

Are there cases where people inadvertently import "foo.__init__" and are 
then surprised to get a different module from "foo" alone?

Personally, I think this is a case for education. If you are explicitly 
touching *any* dunder name, it is up to you to know what you are doing.
...
> There is a strong argument to be made that "__init__" in module names
> should be treated specially. It seems wrong to me that you are allowed to
> address the same module/file through different names

Can you make that strong argument please? "It seems wrong to me" is a 
very weak argument.
...
> (let's pretend filesystem path normalization doesn't exist)

Let's not pretend, because it does exist.

There is also the "module importing itself" issue, and hard links, and 
I'm sure that there are other clever ways to get two module objects out 
of a single module file. Deep copying doesn't work, but modules are very 
simple objects and you can copy them by hand:

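    import spam
    eggs = type(spam)("eggs")  # ModuleType takes (name, doc), not a namespace
    vars(eggs).update(vars(spam), __name__="eggs")  # copy the namespace by hand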
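
A self-contained version of the same idea, as a sketch only: `spam` here is
a synthetic stand-in built with `types.ModuleType` so that the example runs
anywhere, and `copy_module` is a helper name I made up. Since the
`ModuleType` constructor accepts a name and an optional docstring, the
namespace is copied in a separate step:

```python
import types


def copy_module(module, new_name):
    """Return a shallow copy of `module` registered under a new name."""
    clone = types.ModuleType(new_name)
    # Copy every binding, then restore the clone's own __name__.
    vars(clone).update(vars(module), __name__=new_name)
    return clone


# Synthetic stand-in for a real imported module.
spam = types.ModuleType("spam")
spam.greeting = "hello"

eggs = copy_module(spam, "eggs")
eggs.greeting = "goodbye"  # rebinding in the copy leaves the original alone
```

The copy is shallow: rebinding a name in one module does not affect the
other, but mutable objects are still shared between the two.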
...
> and that the filesystem
> encoding of Python module files/names is addressable through the importer
> names. This feels like a bug that inadvertently shipped.

Not to me. The current behaviour is exactly what I would expect.
...
> However, code in the wild is clearly relying on "__init__" in module names
> being allowed. And changing the behavior is backwards incompatible and
> could break this code.

Right, so "it feels wrong" is not a sufficient reason to make that 
breaking change.

I think that you would need to demonstrate that:

(1) people are inadvertently importing "__init__", not realising the 
consequences;

(2) leading to bugs in their code;

(3) that this happens *more often* than people intentionally and 
knowingly importing "__init__";

(4) and that there is a work-around for those intentionally importing 
"__init__".
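
One possible way to start gathering evidence for points (1) and (3) is to
scan existing source code for such imports. The following is only a sketch:
the function name `find_dunder_init_imports` is my own invention, and it
cannot tell an intentional import from an accidental one, it only finds
candidates to look at.

```python
import ast


def find_dunder_init_imports(source):
    """Return sorted (line, name) pairs for imports that mention __init__."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Covers "from .__init__ import x" and "from . import __init__".
            names = [node.module or ""] + [alias.name for alias in node.names]
        else:
            continue
        for name in names:
            if "__init__" in name.split("."):
                hits.append((node.lineno, name))
    return sorted(hits)


sample = (
    "import foo.__init__\n"
    "from .__init__ import bar\n"
    "import os\n"
)
found = find_dunder_init_imports(sample)
```

Note that "from pkg import __init__" is flagged too, even though that form
may only be fetching an attribute.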

-- 
Steve

Steven D'Aprano