+1 for using __all__ as a heuristic to find public symbols. E.g., doing so would cover the aforementioned __init__.py of pytest.

There's a website called grep.app that lets you search a subset of GitHub; I've found it very useful. It was a little tricky to get it to work here, since py.typed is usually empty, but https://grep.app/search?q=.%3F&regexp=true&filter[path.pattern][0]=py.typed seems to do what we want. Data points here would be good!

My intuition is that Rule 1 would require a lot of changes, but that almost all of the discrepancy between the intended public symbols and Rule 1's interpretation would come from __init__.py files that don't use stub-style explicit reexports (which IME are quite uncommon in .py files in the wild) and that either don't define __all__ or build it dynamically. For an example, see https://github.com/numpy/numpy/blob/master/numpy/__init__.py (numpy has a py.typed).

It might be worth special-casing __init__.py to have "implicit reexport" (potentially only for same-package symbols). I believe this is pretty principled as far as special cases go: reexporting in __init__.py lets users skip a layer of nesting when importing, so imports play a special role in that file.
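
To make the special case concrete, here's a rough sketch of what "implicit reexport for same-package symbols" could look like. The function name `implicit_reexports` is made up, and treating "relative import" as a proxy for "same package" and skipping underscore-prefixed names are my assumptions, not anything a type checker does today:

```python
import ast
from typing import Set


def implicit_reexports(init_source: str) -> Set[str]:
    """Names an __init__.py would implicitly reexport under the sketched rule.

    Assumption: only relative imports (`from . import x`, `from .mod import y`)
    count as same-package imports; absolute imports are never reexported.
    """
    exported = set()
    for stmt in ast.parse(init_source).body:
        # level > 0 means the import is relative (one or more leading dots).
        if isinstance(stmt, ast.ImportFrom) and stmt.level > 0:
            for alias in stmt.names:
                name = alias.asname or alias.name
                # Skip star imports and underscore-prefixed (private) names.
                if name != "*" and not name.startswith("_"):
                    exported.add(name)
    return exported


example = (
    "import os\n"
    "from typing import List\n"
    "from . import core\n"
    "from .linalg import solve as lsolve, _private\n"
)
print(sorted(implicit_reexports(example)))
```

Note that this wouldn't cover pytest, whose __init__.py imports from `_pytest` with absolute imports; that case would still need `__all__` or a broader definition of "same package".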

We could also push to ensure that type checkers understand a couple of common __all__ idioms. For instance, understanding `__all__.extend(<literal list of strings, or some_other_module.__all__>)` would be enough to reflect numpy's top-level public symbols fairly accurately.
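
As a sketch of how little is needed to handle that idiom, something like the following would do it with the stdlib `ast` module. The helper names are hypothetical, and it only handles a plain literal assignment followed by `.extend(...)` calls; anything else is reported as too dynamic:

```python
import ast
from typing import List, Optional, Tuple


def _literal_strings(node: ast.expr) -> Optional[List[str]]:
    """Return the strings in a list/tuple literal, or None if it isn't one."""
    if isinstance(node, (ast.List, ast.Tuple)) and all(
        isinstance(e, ast.Constant) and isinstance(e.value, str) for e in node.elts
    ):
        return [e.value for e in node.elts]
    return None


def _is_all(node: ast.expr) -> bool:
    return isinstance(node, ast.Name) and node.id == "__all__"


def static_all(source: str) -> Optional[Tuple[List[str], List[str]]]:
    """Return (names, modules whose __all__ is merged in), or None if
    __all__ is absent or built in a way this sketch doesn't understand."""
    names: Optional[List[str]] = None
    modules: List[str] = []
    for stmt in ast.parse(source).body:
        # __all__ = ["a", "b"]
        if isinstance(stmt, ast.Assign) and any(_is_all(t) for t in stmt.targets):
            names = _literal_strings(stmt.value)
            if names is None:
                return None
            modules = []
        # __all__.extend(<literal list> or some_module.__all__)
        elif (
            names is not None
            and isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Call)
            and isinstance(stmt.value.func, ast.Attribute)
            and stmt.value.func.attr == "extend"
            and _is_all(stmt.value.func.value)
            and len(stmt.value.args) == 1
        ):
            arg = stmt.value.args[0]
            extra = _literal_strings(arg)
            if extra is not None:
                names.extend(extra)
            elif (
                isinstance(arg, ast.Attribute)
                and arg.attr == "__all__"
                and isinstance(arg.value, ast.Name)
            ):
                # Resolving the other module's __all__ is left to the caller.
                modules.append(arg.value.id)
            else:
                return None
    return None if names is None else (names, modules)
```

A real checker would then recurse into each listed module to resolve its `__all__`; `__all__ += [...]` would be an easy addition along the same lines.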

On Fri, 25 Sep 2020 at 17:55, Guido van Rossum <guido@python.org> wrote:
Would it be possible to look at the origin of the imported name? If a 3rd party package imports something from the stdlib, I would venture that in 99.99% of the cases it is not meant for export. OTOH if a `__init__.py` file imports something from `.` it could well be meant for export (this is a pretty common pattern, though I doubt universal). When importing from another package it may vary. (A special case is `pytest`, whose `__init__.py` imports lots of stuff from `_pytest`, all of which is for export.)

Another heuristic is to look at `__all__`. If it's present, you have a good idea of what's meant for export (though there are some packages that define additional symbols that must be imported explicitly, while `__all__` is used to guide `from ... import *`).

I assume there aren't that many py.typed packages yet. Maybe you can somehow find them on GitHub and see how your simpler rule works out? (This list doesn't reach most typing users.)

On Fri, Sep 25, 2020 at 4:59 PM Eric Traut <eric@traut.com> wrote:
PEP 561 indicates that “package maintainers who wish to support type checking of their code MUST add a marker file named py.typed…”. It doesn’t define what “support type checking” means or what expectations are implied. This has led to a situation where packages claim to support type checking but omit many type annotations. There’s currently no tooling that validates the level of “type completeness” for a package, so even well-intentioned package maintainers are unable to confirm that their packages are properly and completely annotated. This leads to situations where type checkers and language servers need to fall back on type inference, which is costly and gives inconsistent results across tools. Ideally, all py.typed packages would have their entire public interface completely annotated.

I’m working on a new feature in Pyright that allows package maintainers to determine whether any of the public symbols in their package are missing type annotations. To do this, I need to clearly define what constitutes a “public symbol”. In most cases, the rules are pretty straightforward and follow the naming guidelines set forth in PEP 8 and PEP 484. For example, symbols that begin with an underscore are excluded from the list of public symbols.

One area of ambiguity is related to import statements. PEP 484 indicates that within stub files, a symbol is not considered exported unless it is used within an import statement of the form `import x as y` or `from x import y as z` or `from x import *`. The problem is that this rule applies only to “.pyi” files and not to “.py” files. For packages that use inlined types, it’s ambiguous whether an import statement of the form `import x` or `from y import x` should treat `x` as a public symbol that is exported from that module.
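
To illustrate which import forms would count as exports if the stub rule were applied to “.py” files, here is a minimal sketch. The function name and the `same_name_only` switch are hypothetical; the switch models PEP 484's clarification that the alias must match the imported name (`X as X`), which is stricter than the wording above:

```python
import ast
from typing import Set


def exported_by_imports(source: str, same_name_only: bool = False) -> Set[str]:
    """Names that top-level import statements export under stub-style rules.

    "*" in the result stands for "everything brought in by a star import".
    """
    exported = set()
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, (ast.Import, ast.ImportFrom)):
            continue
        for alias in stmt.names:
            if alias.name == "*":
                # `from x import *` reexports everything it imports.
                exported.add("*")
            elif alias.asname is not None:
                # Only the `... as ...` forms export; optionally require X as X.
                if not same_name_only or alias.asname == alias.name:
                    exported.add(alias.asname)
    return exported


module = (
    "import os\n"                 # not exported
    "import sys as system\n"      # exported as `system` (loose reading only)
    "from a import b\n"           # not exported
    "from a import c as c\n"      # exported as `c`
    "from d import *\n"           # exports everything from d
)
print(sorted(exported_by_imports(module)))
print(sorted(exported_by_imports(module, same_name_only=True)))
```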

I can think of a few solutions here:
1. For py.typed packages, type checkers should always apply PEP 484 import rules for “.py” files. If a symbol `x` is imported with an `import x` or `from y import x`, it is treated as “not public”, and any attempt to import it from another package will result in an error.
2. For py.typed packages, PEP 484 rules are _not_ applied for import statements. This maintains backward compatibility. Package maintainers can opt in to PEP 484 rules using some well-defined mechanism. For example, we could define a special flag “stub_import_rules” that can be added to a “py.typed” file. Type checkers could then conditionally use PEP 484 rules for imports.
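
As a rough sketch of how a type checker might detect the opt-in from option 2: the flag name “stub_import_rules” comes from the proposal above, but the one-flag-per-line file layout and the function name are assumptions for illustration only, not an agreed format.

```python
from pathlib import Path


def uses_stub_import_rules(package_dir) -> bool:
    """Return True if the package's py.typed opts in to PEP 484 import rules.

    Assumes (hypothetically) that py.typed holds one flag per line.
    """
    marker = Path(package_dir) / "py.typed"
    if not marker.is_file():
        # No marker file at all: not a py.typed package.
        return False
    flags = {line.strip() for line in marker.read_text(encoding="utf-8").splitlines()}
    return "stub_import_rules" in flags
```

An empty py.typed, which is the common case today, yields False, so existing packages would keep their current behavior.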

Option 1 will likely break some assumptions for existing packages. Option 2 avoids that break, but it involves more complexity.

Any suggestions? Thoughts?

 -Eric

---
Eric Traut
Contributor to Pyright and Pylance
Microsoft
_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
Member address: guido@python.org


--
--Guido van Rossum (python.org/~guido)