In that GitHub issue and
https://bugs.python.org/issue42564,
I discovered that the stdlib PathFinder meta path importer is "dumb" and
doesn't treat "__init__" in module names
specially. If someone uses syntax like "import foo.__init__" or "from
.__init__ import foo", PathFinder operates on "__init__" like any other
string value and proceeds to probe the filesystem for the relevant {.py,
.pyc, .so, etc.} files. The "__init__" files do exist in the probed
locations, so PathFinder summarily constructs a new module object,
albeit with "__init__" in its name. The end result is two module
objects and two sys.modules entries referring to the same file, keyed to
different names (e.g. "foo" and "foo.__init__").
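
The behavior described above can be reproduced with a small script. This is only a sketch: the package name `demo_pkg_42564` and the helper `demo_double_import` are made up for illustration, and the result assumes a CPython version where PathFinder still behaves as described.

```python
import importlib
import pathlib
import sys
import tempfile


def demo_double_import(pkg_name="demo_pkg_42564"):
    """Import a throwaway package as both ``pkg`` and ``pkg.__init__``."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = pathlib.Path(tmp) / pkg_name
        pkg_dir.mkdir()
        # A mutable value makes it visible that the file is executed twice.
        (pkg_dir / "__init__.py").write_text("VALUE = []\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            plain = importlib.import_module(pkg_name)
            dunder = importlib.import_module(pkg_name + ".__init__")
            return {
                "same_object": plain is dunder,
                "same_file": plain.__file__ == dunder.__file__,
                "shared_state": plain.VALUE is dunder.VALUE,
                "names": sorted(n for n in sys.modules if n.startswith(pkg_name)),
            }
        finally:
            # Undo the side effects so the demo can be rerun.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.startswith(pkg_name)]:
                del sys.modules[name]


print(demo_double_import())
```

The point of `VALUE` is that module-level state is duplicated: each name gets its own execution of the same file, so the two modules do not share objects.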

There is a strong argument to be made that "__init__" in module names should
be treated specially. It seems wrong to me that you are allowed to
address the same module/file through different names (let's pretend
filesystem path normalization doesn't exist) and that the filesystem
encoding of Python module files/names is addressable through the
importer names. This feels like a bug that inadvertently shipped.

However, code in the wild clearly relies on "__init__" in module names
being allowed, so changing the behavior would be backwards incompatible
and could break that code.

Anyway, I was encouraged by Brett Cannon to email this list to assess
the appetite for introducing a
backwards incompatible change to this behavior. So here's my
strawman/hardline proposal:

1. 3.10 introduces a DeprecationWarning for "__init__" appearing as any
   component of a module name (`"__init__" in fullname.split(".")`).
2. Some future release (I'm unsure which) turns it into a hard error.
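
To make step 1 concrete, here is a rough sketch of the check written as a third-party meta path finder. It is not how the change would be implemented inside importlib itself; the class name `InitNameWarner` and the warning text are hypothetical, and the finder always returns None so the regular finders still handle the import.

```python
import warnings


class InitNameWarner:
    """Hypothetical meta path finder: warns, then defers to the real finders."""

    def find_spec(self, fullname, path=None, target=None):
        # Same test as in the proposal: any dotted component equal to "__init__".
        if "__init__" in fullname.split("."):
            warnings.warn(
                f'"__init__" in module name {fullname!r} is deprecated',
                DeprecationWarning,
                stacklevel=2,
            )
        # Returning None tells the import system to try the next finder.
        return None
```

Someone wanting to audit their own code could install it with `sys.meta_path.insert(0, InitNameWarner())` and run with `-W error::DeprecationWarning` to find offending imports. Turning the warning into a hard error (step 2) would amount to raising ImportError at the same point.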

(A less aggressive proposal would be to normalize "__init__" in module
names to something more reasonable - maybe stripping trailing
".__init__" from module names. But I'll start by proposing the stricter
solution.)
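
For comparison, the normalization alternative might look something like the following. The function name `normalize_module_name` is hypothetical, and this sketch only strips a single trailing ".__init__"; where in the import machinery such a step would run is left open.

```python
def normalize_module_name(fullname):
    """Strip one trailing ".__init__" so "foo.__init__" maps to "foo"."""
    suffix = ".__init__"
    if fullname.endswith(suffix):
        return fullname[: -len(suffix)]
    # Names without the trailing suffix are returned unchanged, including
    # a bare "__init__" and names with "__init__" in the middle.
    return fullname
```

With this approach "import foo.__init__" would resolve to the existing "foo" entry in sys.modules instead of creating a second module object.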

What do others think we should do?

Gregory