Hi Larry,

On Sat, Apr 23, 2022 at 1:53 AM Larry Hastings <larry@hastings.org> wrote:
> But rather than speculate further, perhaps someone who works on one of the static type analysis checkers will join the discussion and render an informed opinion about how easy or hard it would be to support "forward class" and "continue class".

I work on a Python static type checker. I think a major issue with this proposal is that (in the separate-modules case) it requires monkey-patching as an import side effect, which is quite hard for both humans and static analysis tools to reason about effectively.

Imagine we have a module `foo` that contains `forward class Bar`, a separate module `foo.impl` that contains `continue class Bar: ...`, and then a module `baz` that contains `import foo`. What type of object is `foo.Bar` during the import of `baz`? Will it work for the module body of `baz` to create a singleton instance `my_bar = foo.Bar()`?

The answer is that we have no idea. `foo.Bar` might be a non-instantiable "forward class declaration" (or proxy object, in your second variation), or it might be a fully-constituted class. Which one it is depends on accidents of import order anywhere else in the codebase. If any other module happens to have imported `foo.impl` before `baz` is imported, then `foo.Bar` will be the full class. If nothing else has imported `foo.impl`, then it will be a non-instantiable declaration/proxy. This question of import order potentially involves any other module in the codebase, and the only way to reliably answer it is to run the entire program; neither a static type checker nor a reader of the code can reliably answer it in the general case.

It will be very easy to write a module `baz` that does `import foo; my_bar = foo.Bar()` and have it semi-accidentally work initially, then later break mysteriously due to a change in imports in a seemingly unrelated part of the codebase, which causes `baz` to now be imported before `foo.impl` is imported, instead of after.

There is another big problem for static type checkers with this hypothetical module `baz` that only imports `foo`. The type checker cannot know the shape of the full class `Bar` unless it sees the right `continue class Bar: ...` statement.
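
To make the import-order hazard above concrete, here is a rough sketch of it. Since `forward class` / `continue class` is not real syntax, this only emulates the behavior I am assuming the proposal would have at runtime: the "imports" are plain functions with a hand-rolled module cache, and every name here (`modules`, `import_foo`, `import_foo_impl`, `import_baz`, `run`, `greet`) is hypothetical.

```python
import types

# Hypothetical emulation: a tiny module cache standing in for sys.modules.
modules = {}


def import_foo():
    """Emulates `import foo`, where foo contains `forward class Bar`."""
    if "foo" not in modules:
        foo = types.ModuleType("foo")

        class Bar:
            # Assumption: a forward-declared class cannot be instantiated.
            def __init__(self):
                raise TypeError("Bar is forward-declared but not yet continued")

        foo.Bar = Bar
        modules["foo"] = foo
    return modules["foo"]


def import_foo_impl():
    """Emulates `import foo.impl`, which contains `continue class Bar: ...`."""
    foo = import_foo()
    if "foo.impl" not in modules:
        # The import's only effect is to patch foo.Bar in place.
        foo.Bar.__init__ = lambda self: None
        foo.Bar.greet = lambda self: "hello"
        modules["foo.impl"] = types.ModuleType("foo.impl")
    return modules["foo.impl"]


def import_baz():
    """Emulates baz's module body: `import foo; my_bar = foo.Bar()`."""
    foo = import_foo()
    return foo.Bar()


def run(import_impl_first):
    """Import baz in a fresh 'process', optionally after foo.impl."""
    modules.clear()
    if import_impl_first:
        import_foo_impl()  # some unrelated module happened to import foo.impl
    try:
        import_baz()
        return "ok"
    except TypeError:
        return "TypeError"
```

The code of `import_baz` is identical in both cases; `run(True)` succeeds and `run(False)` fails purely because of what some other part of the program did first.
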
When analyzing `baz`, it can't just go wandering the filesystem aimlessly in hopes of encountering some module with `continue class Bar: ...` in it, and hope that's the right one. (Even worse considering it might be `continue class snodgrass: ...` or anything else instead.) So this means a type checker must require that any module that imports `Bar` MUST itself import `foo.impl` so the type checker has a chance of understanding what `Bar` actually is.

This highlights an important difference between this proposal and languages with real forward declarations. In, say, C++, a forward declaration of a function or class contains the full interface of the function or class, i.e. everything a type checker (or human reader) would need to know in order to know how it can use the function or class. In this proposal, that is not true; lots of critical information about the _interface_ of the class (what methods and attributes does it have, what are the signatures of its methods?) is not knowable without also seeing the "implementation." This proposal does not actually forward declare a class interface; all it declares is the existence of the class (and its inheritance hierarchy). That's not sufficient information for a type checker or a human reader to make use of the class.

Taken together, this means that every single `import foo` in the codebase would have to be accompanied by an `import foo.impl` right next to it. In some cases (if `foo.Bar` is not used in module-level code and we are working around a cycle) it might be safe for the `import foo.impl` to be within an `if TYPE_CHECKING:` block; otherwise it would need to be a real runtime import. But it must always be there. So every single `import foo` in the codebase must now become two or three lines rather than one.

There are of course other well-known problems with import-time side effects.
All the imports of `foo.impl` in the codebase would exist only for their side effect of "completing" `Bar`, not because anyone actually uses a name defined in `foo.impl`. Linters would flag these imports as unused, requiring extra cruft to silence the linter. Even worse, these imports would tend to appear unused to human readers, who might remove them and be confused about why that breaks the program.

All of these import side-effect problems can be resolved by disallowing module separation and requiring `forward class` and `continue class` to appear in the same module. But then the proposal no longer helps with resolving inter-module cycles, only intra-module ones.

Because of these issues (and others that have been mentioned), I don't think this proposal is a good solution to forward references. I think PEP 649, with some tricks that I've mentioned elsewhere to allow introspecting annotations usefully even if not all names referenced in them are defined yet at the time of introspection, is much less invasive and more practical.

Carl