Mailman 3 Support for import hooks for static, type checkers and IDEs - Typing-sig

Support for import hooks for static, type checkers and IDEs

Bernat Gabor

April 20, 2022

12:37 p.m.

As some of you might be aware, the Python packaging ecosystem has adopted a standardized solution for editable installations in the form of https://peps.python.org/pep-0660. Historically, editable installations worked via `.pth` files that inject the project's working directory onto `sys.path`. This solution, while it mostly works, is fairly inflexible when it comes to more advanced functionality (such as excluding files from some modules, or loading the code from memory or from a database). A more powerful solution from the runtime's point of view is to use import hooks to dynamically allow/exclude module imports. The problem with this solution today is that static (and type) checkers are not capable of knowing/indexing the project layout, and as such most IDEs, for example, underline the imports with a red line because they cannot be resolved from the IDE's point of view.

A possible suggestion was to allow the py.typed (or another file) to contain a rough layout of the project, e.g. a list of {module x is importable from file at path y}. This file could then be loaded and indexed by both type and static checkers, and periodically be refreshed by tooling to discover new files/modules. See the discussion at https://discuss.python.org/t/pep-660-support-and-ides/13878/13.

Opinions or better options to tackle this problem?

Thanks,
Bernat
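For readers unfamiliar with the `.pth` mechanism mentioned in this message, the following is a minimal sketch of how such a file puts a project directory on `sys.path`. It relies on the standard library's `site.addsitedir`, which processes `.pth` files found in the directory it is given. The directory and file names (`demo_project`, `fake_site_packages`, `demo_editable.pth`, `demo_mylib.py`) are invented for illustration and are not taken from any real build backend.

```python
import site
import sys
import tempfile
from pathlib import Path


def simulate_pth_editable_install() -> str:
    """Create a throwaway 'site-packages' containing a .pth file that points
    at a throwaway project directory, then let `site` process it."""
    workspace = Path(tempfile.mkdtemp())
    project_dir = workspace / "demo_project"      # hypothetical project checkout
    site_dir = workspace / "fake_site_packages"   # stands in for site-packages
    project_dir.mkdir()
    site_dir.mkdir()

    # Every .py file in project_dir becomes importable, whether or not it was
    # meant to be: this is the inflexibility the message refers to.
    (project_dir / "demo_mylib.py").write_text("VALUE = 1\n")

    # A .pth file is plain text with one directory per line.
    (site_dir / "demo_editable.pth").write_text(f"{project_dir}\n")

    # addsitedir() adds site_dir to sys.path and processes its .pth files,
    # appending each existing directory they list.
    site.addsitedir(str(site_dir))
    return str(project_dir)


if __name__ == "__main__":
    added = simulate_pth_editable_install()
    print(added in sys.path)
```

Because the result is an ordinary `sys.path` entry, any tool that reads `sys.path` can see it without knowing anything about editable installs.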

Paul Moore

April 2022

12:42 p.m.

Repeating a point I made on Discourse, this shouldn't be viewed as solely a packaging (or editable install) issue. The same issues presumably arise for code imported from a zipfile (although IDEs may special-case zipfiles, I guess) or from similar import hooks (one that allows importing code from a sqlite database, for example).

How do IDEs handle the case where someone adds a zipfile to their sys.path? How would we *like* to handle that case?

Paul

Eric Traut

4:22 p.m.

Pyright (and Pylance) use two primary mechanisms to discover relevant import resolution paths:

1. We invoke the default Python interpreter and run a tiny script that simply returns the sys.path list. (Note: We need to be very careful when we do this to avoid executing arbitrary code on the user's machine without their knowledge or consent. This means the script cannot import any library other than `os` or `sys` because modules with names other than `os` and `sys` can be overridden by local source files.) The resulting list of paths includes site-packages, PYTHONPATH additions, paths to zip files, and directories referred to by pth files.

2. We allow users to manually specify additional import resolution paths using an "extraPaths" configuration option. These are searched after the paths retrieved from sys.path.

We discourage users from dynamically manipulating sys.path in their code because we cannot statically discover such manipulations. In rare cases where this cannot be avoided (typically in legacy code bases), we tell users to manually specify these paths using the "extraPaths" mechanism.

Import hooks that dynamically manipulate or override sys.path in code present similar problems for tools that need to understand import resolution.

I presume that other type checkers, language servers, and linters do something similar to Pyright as described above?

Paul, you asked about zip files. Pyright handles these just like any other directory provided in sys.path. It uses a zip library to traverse the directory and file structure rather than making direct file system calls, but there's no other special casing involved in locating the zip file because its location is provided in sys.path.

Bernat, if my assumption is correct that all static tools use a mechanism similar to Pyright, it would be convenient if the Python interpreter knew about additional import paths and added these to the sys.path list at startup. This would eliminate the need for the entire Python tooling ecosystem to understand yet another mechanism for discovering import resolution paths. More importantly, it would keep the knowledge of this mechanism in one place. Is that a possibility? There's still a small window of opportunity to get such a change into Python 3.11.

The alternative is to standardize some new mechanism (presumably through a new PEP) that allows library authors to specify additional import resolution paths. As you suggested, this could be an addition to "py.typed" or some other new file that is included with the package. The downside to this approach is that it would need to be adopted by every static type checker, language server, IDE, linter, and other static analysis tool. This would be a costly solution that could take years to roll out completely. And depending on the complexity of the file format and its contents, it could take much longer to eliminate the bugs and other idiosyncrasies in all of these implementations. I'm hoping we can avoid this.

Another option is to simply fall back on manual specification of import resolution paths within each tool (e.g. the "extraPaths" configuration option in Pyright). This would be a burden on users and would eliminate most of the convenience benefits of PEP 660. It would likely present a major hurdle for the adoption of PEP 660, so this option isn't very appealing either.

-Eric

--
Eric Traut
Contributor to Pyright & Pylance
Microsoft
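The two discovery mechanisms and the zip handling described in this message can be sketched roughly as follows. This is not Pyright's implementation (Pyright is not written in Python); it is an illustrative sketch in which the function names `interpreter_sys_path`, `search_roots`, and `find_module_source` are invented, and only `.py` sources are considered (stub files, namespace packages, and extension modules are ignored).

```python
import ast
import subprocess
import sys
import zipfile
from pathlib import Path
from typing import List, Optional

# The probe imports nothing but `sys`, in line with the safety note above.
# ascii() keeps the output parseable regardless of the console encoding.
PROBE = "import sys; print(ascii(sys.path))"


def interpreter_sys_path(python: str = sys.executable) -> List[str]:
    """Ask the given interpreter for its sys.path."""
    result = subprocess.run(
        [python, "-c", PROBE], capture_output=True, text=True, check=True
    )
    return ast.literal_eval(result.stdout.strip())


def search_roots(extra_paths: List[str], python: str = sys.executable) -> List[str]:
    """Interpreter paths first, then user-configured extra paths."""
    return interpreter_sys_path(python) + list(extra_paths)


def find_module_source(name: str, roots: List[str]) -> Optional[str]:
    """Return the location of the first source file that provides `name`.

    Directories are checked with file system calls; zip archives are
    traversed with the zipfile library instead.
    """
    rel = name.replace(".", "/")
    candidates = [f"{rel}/__init__.py", f"{rel}.py"]
    for root in roots:
        root_path = Path(root)
        if root_path.is_dir():
            for candidate in candidates:
                if (root_path / candidate).is_file():
                    return str(root_path / candidate)
        elif zipfile.is_zipfile(root_path):
            with zipfile.ZipFile(root_path) as archive:
                members = set(archive.namelist())
            for candidate in candidates:
                if candidate in members:
                    return f"{root}/{candidate}"
    return None
```

The point of the sketch is that everything downstream of `search_roots` is purely static: once the list of roots is known, no further code from the target environment has to run.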

Bernat Gabor

4:30 p.m.

Hello,

The biggest problem I see here is that import hooks are not just about adding another path to sys.path. Often they can be more complicated, such as loading a module from a DB or other dynamic operations (e.g. loading a C extension). How do Pylance and Pyright handle those cases, or how would they like to handle them?

Thanks 🙏

Eric Traut

5:05 p.m.

Static analysis tools cannot handle those cases automatically. These scenarios require manual configuration on the part of the user. Thankfully, such dynamic hooks are rarely used in practice, perhaps in part because they don't work well with development tools. Pylance has over 4M active users, and we've received only a couple of questions over the past three years involving dynamic import hooks. Looking through the mypy issue tracker (which goes back 9+ years), I see only a couple of references to import hooks, and those issues have received little or no attention. So I don't think there's a need to solve the general problem of dynamic import hooks.

-Eric

Bernat Gabor

5:54 p.m.

I'd like to disagree with the claim that the "dynamic import hook problem" doesn't need to be solved. I believe the reason import hooks are not popular today is that static check tools/IDEs don't support them well and therefore discourage users from adopting them. However, if we start adding dynamic import hooks for every editable installation of every project, the need for handling them correctly would increase significantly, and the number of comments on those issue trackers would jump.

If possible, I'd like to at least explore how we can solve the dynamic import hooks problem. We might conclude that it's not feasible or desirable. Nevertheless, we would then at least have good reasoning about what the obstacles to achieving this are.

Bernat

Paul Moore

6:34 p.m.

I agree with this position. We had the same problems with imports from zipfiles. For *years*, code was written that made unfortunate assumptions that packages would live in the filesystem, and that hindered development of things like zipapps, PyInstaller, etc. Setuptools even had an "is this package zip safe" flag, which attempted to mark what code could reliably be included in a zipfile. That doesn't actually solve the problem; it just makes it easier to spot when you have an issue. Now that importlib.resources is in the stdlib, it's possible to write code that *doesn't* make assumptions that packages live on the filesystem, and we can move on with some confidence.

It would be a shame to lose all those benefits just because a new set of tools are appearing which make the same limiting assumptions.

Also, I would point out that the fact that tools like pyright treat zipfiles like directories is exactly the same sort of special casing that tools had to do before importlib.resources was created. I'd much rather we had a solution that meant tools *didn't* all have to add that special casing, just to support one possible import hook that happens to be shipped with Python.

Paul

PS I should also say that I understand that in full generality this is a *hard* problem and there will need to be some compromises. One of which may well be that for harder cases, users will need to manually specify some things, or maybe even accept that tooltips/auto-completion may not work in some cases. In which case, the key will be for the tools to degrade gracefully, and we should make sure that's possible, rather than simply saying "don't do that".
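To illustrate the importlib.resources point in this message, here is a small sketch of reading a data file in a way that does not assume the package lives on the filesystem. It uses `importlib.resources.files`, available since Python 3.9; the helper name `read_packaged_text` is invented for this example.

```python
import importlib.resources


def read_packaged_text(package: str, resource: str) -> str:
    """Read a text file shipped inside `package`, wherever the package lives.

    The lookup goes through the import system rather than through
    `__file__`, so it works both for packages in a directory and for
    packages imported from a zip file.
    """
    return (
        importlib.resources.files(package)
        .joinpath(resource)
        .read_text(encoding="utf-8")
    )
```

By contrast, the older idiom of joining `os.path.dirname(__file__)` with a file name and calling `open()` on the result fails when the package was imported from a zip archive, which is the kind of filesystem assumption the message describes.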

Jia Chen

May 2022

3:22 p.m.

Hi Bernat and Paul,

Apologies for the late response. And thank you so much for bringing up this issue onto the typing-sig mailing list!

As one of the maintainers of the Pyre type checker, I agree with Eric's statement that import hooks are way too dynamic for static analyzers to handle.

I want to re-emphasize the point here that static checkers and similar language tooling cannot run arbitrary Python code for import resolution, regardless of whether said code lives in Python's stdlib or not. Many type checkers (such as Pyright and Pyre) are themselves not written in Python. And even type checkers that are written in Python are often used to type-check projects that need to be executed in a different Python venv/interpreter. In most cases, "invoke that import hook that happens to be shipped with Python" is just not an option for us. The only sensible way for static tooling to support a new dynamic import hook is to replicate the entire logic of that hook within the tool itself. This requires a lot of work and also implies a heavy maintenance burden on our side.

The alternative solution, i.e. emitting a static project layout where we explicitly specify how to get the source content for each and every module, would be much better for us than the dynamic import approach. But even supporting this approach would be tricky. In addition to the concern Eric raised before, the other issue I want to point out is how to implement efficient incremental checks: Static analyzers and IDE tooling typically do not re-analyze the entire project every time the users hit "save" in their editors -- what happens instead is that we rely on some sort of filesystem watcher to let us know which files have been added/changed, and only re-analyze the added/changed files while re-using the cached analysis results for unchanged files. If we were to allow additional modes to get the sources, we would need to think of a way to figure out, e.g., when a source file that comes from a database/network can change (periodically scanning for changes, as Bernat suggested, does not scale well for large projects). This is not impossible to do, but it also requires tons of engineering to get right, and I am skeptical about whether such engineering effort is worth it, given that the vast majority of Python users do not need to grab sources from anywhere other than the filesystem.

I hope my explanation above provides you with additional context on why additional flexibility in import resolution is problematic for us: The more flexible and dynamic the mechanism is, the more engineering burden we will get as authors of static analyzers. In an ideal world with unlimited time and resources, this might not be an issue. But the reality is that resources are scarce for us, and we do have other priorities such as maintaining and evolving the Python type system.

My guess is that solutions with pre-existing mechanisms like `pth` files would be the path forward with the least resistance for this community. Proposing some kind of static layout format that only allows filesystem imports would be a suboptimal but (barely) acceptable approach. Anything beyond that would be difficult to push for adoption.

- Jia
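The incremental-check idea in this message (re-analyze only what changed, reuse cached results otherwise) can be sketched as below. This is an illustration, not how Pyre or any other tool is implemented: real tools rely on a filesystem watcher to learn which files changed, whereas this sketch substitutes a content hash comparison so that it stays self-contained. The class name `IncrementalAnalyzer` and the `analyze` callback are invented for the example.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


class IncrementalAnalyzer:
    """Re-run `analyze` only for files whose content changed since last time."""

    def __init__(self, analyze: Callable[[str], object]) -> None:
        self._analyze = analyze
        # Maps each path to (content digest, cached analysis result).
        self._cache: Dict[Path, Tuple[str, object]] = {}

    def check(self, files: Iterable[Path]) -> Dict[Path, object]:
        results: Dict[Path, object] = {}
        for path in files:
            source = path.read_text(encoding="utf-8")
            digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
            cached = self._cache.get(path)
            if cached is None or cached[0] != digest:
                # New or changed file: analyze it and refresh the cache entry.
                cached = (digest, self._analyze(source))
                self._cache[path] = cached
            results[path] = cached[1]
        return results
```

The scheme depends on having a cheap and reliable "has this changed?" signal per source. The filesystem provides one; sources served from a database or the network generally do not, which is the scaling concern raised above.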

Paul Moore

4:16 p.m.

On Mon, 2 May 2022 at 16:23, Jia Chen via Typing-sig <typing-sig@python.org> wrote:

...

Hi Bernat and Paul,

Apologies for the late response. And thank you so much for bringing up this issue onto the typing-sig mailing list!

As one of the maintainers of the Pyre type checker, I agree with Eric's statement that import hooks are way too dynamic for static analyzers to handle.

Thanks. I agree - it seems self-evident to me that static checkers are unlikely to ever be able to handle import hooks. While I can't speak for Bernát, I personally never expected that - I was expecting that type checking simply wouldn't happen for editable installs (that used import hooks), and that would be fine. It may be that this isn't the case, though - to be honest, I have no visibility of how many people/projects are using import hook based editable installs, or how many of those people are inconvenienced by the current state of affairs. Ideally, we should survey the community to find out how much of a problem this really is, but I'm not sure there's a good way to do that, so we may have to rely on the traditional approach of guessing and expecting the worst ;-)

...

The alternative solution, i.e. emitting a static project layout where we explicitly specify how to get the source content for each and every module, would be much better for us than the dynamic import approach. But even supporting this approach would be tricky.

That's interesting to me, as I'd be disappointed if we ended up spending time implementing a solution that didn't actually help. The "static layout" proposal is more Bernát's idea; I don't really know what specifically he thinks is possible to include here. But we should certainly discuss the format of any such static layout file before making changes.

...

This is not impossible to do, but it also requires tons of engineering to get right, and I am skeptical on whether such engineering effort is worth it, given that the vast majority of Python users do not need to grab sources from anywhere other than the filesystem.

I would expect that any solution would restrict itself to pointing at other locations in the filesystem. For the editable install use case, I don't imagine anyone is contemplating having package sources anywhere other than in the filesystem (the whole point of an editable install, after all, is to make available the live project source, and for that to be editable it would need to be on the filesystem).

I *do* imagine that outside of editable installs, people might use import hooks to load modules from non-filesystem locations. The zipimport module is an obvious example, which I understand type checkers special-case. That seems like a reasonable "practicality beats purity" workaround, but doesn't alter the fact that people can do similar things which the type checkers *won't* have hard-coded for. As an occasional user of type checkers, I'd hope that they would degrade gracefully in that situation (treating names that can't be accessed as untyped), but that's all - as I said above, "static" is the key here. I certainly wouldn't expect them to find that source code. But maybe I'm more tolerant than other users ;-)

However, none of that matters for the case of editable installs, which can entirely reasonably be focused on sources in the filesystem (IMO).

...

I hope my explanation above could provide you with additional context on why the additional flexibilities on import resolutions is problematic for us: The more flexible&dynamic the mechanism is, the more engineering burden we will get as authors of static analyzers. In an ideal world with unlimited time&resources, this might not be an issue. But the reality is that resources are scarce for us, and we do have other priorities such as maintaining&evolving the Python type system.

Thanks, it's very helpful. I'd say none of it was a real surprise to me, and I understand exactly where you are coming from. For me, what was actually surprising was that *other* people were surprised by this limitation ;-)

...

My guess is that solutions with pre-existing mechanisms like `pth` files would be a path forward with least resistance for this community. Proposing some kind of static layout format that only allows filesystem imports would be a suboptimal but (barely) acceptable approach. Anything beyond that would be difficult to push for adoption.

I'd certainly hope that type checkers are aware of the implications of `.pth` files - they have been in common use for years now, not just for editable installs, but as a general mechanism for tailoring the import machinery. And indeed, the new editable install mechanism does allow tools to continue to use `.pth` files - so I don't think there's a significant change here on that front. People will still use `.pth` files, and (I imagine) would expect type checkers to be able to cope with them (or maybe they are happy with `.pth` files not being supported by type checkers - I'm not clear from your comment whether `.pth` support already exists in type checkers, or would need to be added).

The trigger for this discussion is really that the new editable install mechanisms allow tools to explore *alternative* ways of delivering editable installs. Such alternatives have trade-offs and at the moment, I think it's fair to say we're still discovering what those trade-offs are. Type checkers are simply one aspect of this.

From my point of view, given all that you said above, it would be interesting to know what a type checker might want to see from an installed package (that included some source code that's delivered by, let's say "nefarious means" :-)) to let it correctly type check the names it can't see from simply scanning the normal `sys.path` locations.

A package `foo` has its metadata stored in a directory in site-packages called something like `foo-1.0.dist-info/`. We could easily add a metadata file in that directory containing information for type checkers. Maybe something as simple as a list of pathnames for additional source code files to scan? Would something like that be useful?

Paul
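As a rough illustration of the dist-info idea floated in this message, the sketch below collects extra source paths from a per-package metadata file. No such file is standardized: the file name `static_paths.txt` and its one-path-per-line format are assumptions made up for this example. The code uses `importlib.metadata`, which is real, but a tool not written in Python would have to scan the `*.dist-info` directories itself.

```python
from importlib import metadata
from typing import List, Optional

# Hypothetical file name; nothing like it is standardized today.
EXTRA_PATHS_FILE = "static_paths.txt"


def collect_extra_paths(search_path: Optional[List[str]] = None) -> List[str]:
    """Gather additional source locations advertised by installed packages.

    `search_path` lists the directories to scan for *.dist-info metadata;
    when omitted, the current interpreter's sys.path is used.
    """
    if search_path is None:
        dists = metadata.distributions()
    else:
        dists = metadata.distributions(path=search_path)

    extra: List[str] = []
    for dist in dists:
        text = dist.read_text(EXTRA_PATHS_FILE)  # None when the file is absent
        if not text:
            continue
        for line in text.splitlines():
            line = line.strip()
            if line and line not in extra:
                extra.append(line)
    return extra
```

A checker could append the returned list to the roots it already derives from `sys.path`, much like a per-package equivalent of a manually configured extra-paths option.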

Jia Chen

5:03 a.m.

...

I was expecting that type checking simply wouldn't happen for editable installs.

Ah, good point. I am also curious how many folks really expect editable installs to be type checked, given that type checking already happens on the original source.

...

I'm not clear from your comment whether `.pth` support already exists in type checkers, or would need to be added

The support varies from type checker to type checker. According to Eric, `pyright` already supports it (https://github.com/microsoft/pylance-release/issues/2114#issuecomment-107520...). Pyre doesn't at the moment, but it should not be too difficult to add.

...

it would be interesting to know what a type checker might want to see from an installed package (that included some source code that's delivered by, let's say "nefarious means" :-)) to let it correctly type check the names it can't see from simply scanning the normal `sys.path` locations.

I think at the end of the day, if `sys.path` cannot include everything, then what we want to see is what other (statically-determinable) paths we need to search for imports.

But if I understand correctly, the point Eric brought up earlier was less about the technical difficulty of adding more places/mechanisms to look for imports -- as long as we agree on a standardized approach, that's always doable. His real concern was about the necessity and sustainability of such a proposal: why not just include "other paths we need to search for imports" into `sys.path` directly (e.g. with `pth` files)? Would it be nice if we could always maintain the assumption that `sys.path` is all-encompassing and all we need to resolve all imports, as opposed to introducing new tweaks every now and then, which forces type checkers to play this catch-up game every time the aforementioned assumption gets broken?

- Jia

Paul Moore

11:10 a.m.

On Wed, 4 May 2022 at 06:03, Jia Chen via Typing-sig <typing-sig@python.org> wrote:

...

I think at the end of the day, if `sys.path` cannot include everything, then what we want to see is what other (statically-determinable) paths we need to search for imports.

But if I understand correctly, the point Eric brought up earlier was less about the technical difficulty of adding more places/mechanisms to look for imports -- as long as we agree on a standardized approach, that's always doable. His real concern was about the necessity and sustainability of such a proposal: why not just include "other paths we need to search for imports" into `sys.path` directly (e.g. with `pth` files)? Would it be nice if we could always maintain the assumption that `sys.path` is all-encompassing and all we need to resolve all imports, as opposed to introducing new tweaks every now and then, which forces type checkers to play this catch-up game every time the aforementioned assumption gets broken?

Unfortunately, that simply isn't remotely true, unless we restrict what developers are allowed to do. (Assuming, as noted above, that those developers expect to get typing information - an assumption that I question, as I said). Import hooks are a feature of Python, and while I'm fine with saying they are too dynamic for static type checkers to support, ignoring their existence isn't reasonable. I'm not saying there's an issue with type checkers here, apart from being victims of their own success (they work so well in most cases that users forget that they can only act on static information).

The canonical example of "why an editable install can't just add a directory to sys.path" is the following:

    Project directory
        mylib.py
        setup.py

The setup.py file is the setuptools configuration, and states that the project consists of a single Python module, `mylib.py`. If we package this file and install it normally, it puts the single file `mylib.py` into site-packages, in a directory on `sys.path`.

If we want to do an "editable" install, the traditional approach adds the project directory to sys.path via a `.pth` file that gets installed to site-packages. Which does indeed make `import mylib` work, and gets the code for mylib from the user's project directory. But it also allows `import setup` to work (doing horrid things, as `setup.py` is not intended to be imported). This is traditionally seen as a problem, but one people just live with (for better or worse).

Part of the packaging community wants to provide "stricter" editable installs, which make *only* `mylib` importable, but not `setup`. That simply can't be done by saying "this directory gets scanned for importable files". I wrote a helper library that handles this case by using import hooks, and that's probably what triggered this discussion, as build backends are using that library (or at least the same approach).

Personally, my attitude is "don't use that mechanism if you care about type checker support, and type checkers can't handle it". This discussion is mostly about whether that approach can be made to be typechecker-friendly. But if you can't handle anything other than adding the complete contents of a directory to the list of "what is to be scanned", I think we're probably dead in the water (because of the setup.py use case).

There are other possible mechanisms (such as creating a staging directory with symlinks to just the expected files and exposing that directory on sys.path) which may be something the packaging community could explore. There are problems (symlinks aren't guaranteed available everywhere) but those aren't typing issues. I don't know whether anyone is looking at those options (I'm not, personally).

Paul
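The "stricter" editable install described in this message can be sketched with a small meta path finder that exposes only an explicit set of modules. This is a generic illustration of the import-hook technique, not the implementation of the helper library mentioned above; the names `ExplicitModuleFinder` and `install_finder` are invented for the example.

```python
import importlib.abc
import importlib.util
import sys
from typing import Dict


class ExplicitModuleFinder(importlib.abc.MetaPathFinder):
    """Make only the listed top-level modules importable from their files."""

    def __init__(self, modules: Dict[str, str]) -> None:
        # Maps a module name to the path of the .py file that provides it.
        self._modules = dict(modules)

    def find_spec(self, fullname, path=None, target=None):
        location = self._modules.get(fullname)
        if location is None:
            # Not one of ours: defer to the remaining finders.
            return None
        return importlib.util.spec_from_file_location(fullname, location)


def install_finder(modules: Dict[str, str]) -> ExplicitModuleFinder:
    """Register the finder; the project directory itself never joins sys.path."""
    finder = ExplicitModuleFinder(modules)
    sys.meta_path.append(finder)
    return finder
```

With a mapping such as `{"mylib": "/path/to/project/mylib.py"}`, `import mylib` works while `import setup` does not resolve to the project's `setup.py`. A tool that only reads `sys.path` sees neither module, which is the gap this thread is about.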

Bernat Gabor

7:34 p.m.

Hello everyone,

Sorry for the late reply here, but I've been down with COVID after the conference so couldn't reply before. I'll try to explain and address some of the points raised above.

...

Ah good point. I am also curious how many folks really expect editable installs to be type checked, given that type checking already happens on the original source.

The primary use case isn't necessarily the type checker. The primary use case here is to provide auto-complete (and type information) within IDEs. However, when we contacted the IDE maintainers, their primary feedback was that their index is built in much the same way as type checkers build theirs. Therefore, if we solve the problem for type checkers, IDEs can reuse that.

...

I'm not clear from your comment whether `.pth` support already exists in type checkers, or would need to be added

The support varies from type checker to type checker. While PTH files (or, more generally, mechanisms that amend sys.path) are a common way to achieve the editable effect, they do have some downsides (as Paul described in his detailed previous email). They have a few additional big downsides too:

- they only support file system paths (e.g. no support for loading code from a DB),
- they actually impact the runtime: once you alter sys.path you are no longer using import hooks, but instead fall back to a file path-based source loader.

The primary goal we want to achieve here, though, is to *keep using import hooks at runtime while still providing static type/layout/shape information for static checkers.* We established that static checkers (type, IDE index, etc.) cannot evaluate runtime logic by design. Therefore they *cannot directly use the dynamic import hooks*; instead, they'll need to operate on some *separately provisioned files/paths* that checkers can handle. Note that this is not a totally new concept for type checkers: e.g. mypy supports the MYPYPATH environment variable to extend the static checker's discovery locations. However, we don't have a standardized way across type (and other static) checkers. Furthermore, it's not possible today for each individual package to request indexing from additional paths/files. This is, in my opinion, the feature gap we need to address.

So here's a rough proposal that I think could work. We should extend the binary distribution format ( https://packaging.python.org/en/latest/specifications/binary-distribution-fo...) to allow the addition of a STATIC_EXTRA.json file. If it is present, IDE/type/static checkers are expected to handle it during indexing, on top of the current system. This JSON file would allow specifying the following operations as an ordered list:

- Add an entire folder to be indexed (this would essentially have the same impact as a PTH file, but without impacting the interpreter at runtime).
- Add a file to be indexed under a given module name (if the parent module is not already indexed the checker can refuse to load the file).

The way this would function:

- for editable installation, this metadata file can be (re)generated as part of the editable wheel build and the path(s) will link back directly to the source directory,
- for generated files during the build (e.g. generating python code from service schema files), these can be updated during the package build into a side folder and linked in from there,
- for import hooks that load code from a database, the user might explode the content of the database onto the disk during package installation and link it from there by using the `STATIC_EXTRA.json` file (while still loading from the DB directly during runtime).

For files directly available on the disk, the static checker would be able to keep the index up to date without needing to actually move the file to the python interpreter's site-packages. For files not available directly on disk (generated code or code loaded from the database), the index would not always reflect reality and would require the user to periodically update the index files. But we'd get as close as we can without needing the static checkers to start up the Python interpreter.

Let me know if anyone has any concerns with this approach; if not, I'll try to summarize it as a PEP proposal. I believe this would make import hooks more usable in the language.

Thanks,
Bernat

On Wed, May 4, 2022 at 5:11 AM Paul Moore <p.f.moore@gmail.com> wrote:

...

On Wed, 4 May 2022 at 06:03, Jia Chen via Typing-sig <typing-sig@python.org> wrote:

...
I think at the end of the day, if `sys.path` cannot include everything, then what we want to see is what other (statically-determinable) paths we need to search for imports.

But if I understand correctly, the point Eric brought up earlier was less about the technical difficulty of adding more places/mechanisms to look for imports -- as long as we agree on a standardized approach, that's always doable. His real concern was about the necessity and sustainability of such a proposal: why not just include "other paths we need to search for imports" into `sys.path` directly (e.g. with `pth` files)? Wouldn't it be nice if we could always maintain the assumption that `sys.path` is all-encompassing and all we need to resolve all imports, as opposed to introducing new tweaks every now and then, which forces type checkers to play this catch-up game every time the aforementioned assumption gets broken?

Unfortunately, that simply isn't remotely true, unless we restrict what developers are allowed to do. (Assuming, as noted above, that those developers expect to get typing information - an assumption that I question, as I said). Import hooks are a feature of Python, and while I'm fine with saying they are too dynamic for static type checkers to support, ignoring their existence isn't reasonable. I'm not saying there's an issue with type checkers here, apart from being victims of their own success (they work so well in most cases that users forget that they can only act on static information).

The canonical example of "why an editable install can't just add a directory to sys.path" is the following:

Project directory
    mylib.py
    setup.py

The setup.py file is the setuptools configuration, and states that the project consists of a single Python module, `mylib.py`. If we package this file and install it normally, it puts the single file `mylib.py` into site-packages, in a directory on `sys.path`.

If we want to do an "editable" install, the traditional approach adds the project directory to sys.path via a `.pth` file that gets installed to site-packages. Which does indeed make `import mylib` work, and gets the code for mylib from the user's project directory. But it also allows `import setup` to work (doing horrid things, as `setup.py` is not intended to be imported). This is traditionally seen as a problem, but one people just live with (for better or worse).
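The effect described above can be reproduced in throwaway directories. This is only a sketch: the temporary directories stand in for the project checkout and for site-packages, the `.pth` file name and the `pth_editable_demo` helper are made up for illustration, and `site.addsitedir()` is used because it processes `.pth` files the same way interpreter startup does.

```python
import importlib
import importlib.util
import site
import sys
import tempfile
from pathlib import Path


def pth_editable_demo() -> dict[str, bool]:
    """Simulate a .pth-based editable install and report what became importable."""
    root = Path(tempfile.mkdtemp())
    project = root / "project"          # stands in for the user's checkout
    site_dir = root / "site-packages"   # stands in for the real site-packages
    project.mkdir()
    site_dir.mkdir()
    (project / "mylib.py").write_text("VALUE = 42\n")
    # Placeholder content: a real setup.py would call setuptools.setup().
    (project / "setup.py").write_text("# not meant to be imported\n")

    # The "editable install": a single line naming the project directory.
    (site_dir / "mylib_editable.pth").write_text(f"{project}\n")

    saved_path = list(sys.path)
    try:
        # Appends site_dir and every existing directory listed in its .pth files.
        site.addsitedir(str(site_dir))
        importlib.invalidate_caches()
        # find_spec() locates top-level modules without executing them.
        return {
            name: importlib.util.find_spec(name) is not None
            for name in ("mylib", "setup")
        }
    finally:
        sys.path[:] = saved_path  # undo the demo's changes to sys.path


if __name__ == "__main__":
    print(pth_editable_demo())
```

Both names are reported as importable, because a `.pth` entry can only expose a whole directory.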

Part of the packaging community wants to provide "stricter" editable installs, which make *only* `mylib` importable, but not `setup`. That simply can't be done by saying "this directory gets scanned for importable files". I wrote a helper library that handles this case by using import hooks, and that's probably what triggered this discussion, as build backends are using that library (or at least the same approach). Personally, my attitude is "don't use that mechanism if you care about type checker support, and type checkers can't handle it". This discussion is mostly about whether that approach can be made to be typechecker-friendly. But if you can't handle anything other than adding the complete contents of a directory to the list of "what is to be scanned", I think we're probably dead in the water (because of the setup.py use case).
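A minimal sketch of the "strict" variant follows. It is not the helper library mentioned above, only an illustration of the same idea: the `SingleModuleFinder` class and the `strict_editable_demo` function are hypothetical names, and the hook is driven by an explicit mapping from module name to file.

```python
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class SingleModuleFinder(importlib.abc.MetaPathFinder):
    """Expose an explicit {module name: file path} mapping and nothing else."""

    def __init__(self, mapping: dict[str, Path]) -> None:
        self._mapping = dict(mapping)

    def find_spec(self, fullname, path=None, target=None):
        location = self._mapping.get(fullname)
        if location is None:
            return None  # not ours: let the remaining finders try
        return importlib.util.spec_from_file_location(fullname, str(location))


def strict_editable_demo() -> tuple[int, bool]:
    project = Path(tempfile.mkdtemp())
    (project / "mylib.py").write_text("VALUE = 42\n")
    (project / "setup.py").write_text("raise RuntimeError('do not import me')\n")

    finder = SingleModuleFinder({"mylib": project / "mylib.py"})
    sys.meta_path.insert(0, finder)
    try:
        import mylib  # resolved by the hook, straight from the project directory

        value = mylib.VALUE
        # The project directory never lands on sys.path, and the hook only
        # answers for the names it was given, so setup.py is not exposed by it.
        setup_exposed = finder.find_spec("setup") is not None
    finally:
        sys.meta_path.remove(finder)
        sys.modules.pop("mylib", None)
    return value, setup_exposed


if __name__ == "__main__":
    print(strict_editable_demo())
```

The mapping lives in running Python code, which is exactly what a static checker cannot see without some separately provided description of it.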

There are other possible mechanisms (such as creating a staging directory with symlinks to just the expected files and exposing that directory on sys.path) which may be something the packaging community could explore. There are problems (symlinks aren't guaranteed available everywhere) but those aren't typing issues. I don't know whether anyone is looking at those options (I'm not, personally).

Paul _______________________________________________ Typing-sig mailing list -- typing-sig@python.org To unsubscribe send an email to typing-sig-leave@python.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: gaborjbernat@gmail.com
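To make the STATIC_EXTRA.json idea from this message concrete, here is a rough sketch of how a static checker might consume such a file. The proposal does not define a schema, so the key names (`op`, `path`, `module`), the operation names (`add_folder`, `add_file`), the first-match-wins rule and the `resolve_module` helper are all assumptions made purely for illustration. The optional "refuse the file if the parent module is not indexed" check is left out.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def resolve_module(module: str, operations: list[dict]) -> Path | None:
    """Return the file a checker would index for `module`, or None."""
    for op in operations:  # ordered list; assumed rule: first match wins
        if op["op"] == "add_file" and op["module"] == module:
            return Path(op["path"])
        if op["op"] == "add_folder":
            base = Path(op["path"]).joinpath(*module.split("."))
            # A module is either <name>.py or a package with __init__.py.
            for candidate in (base.with_suffix(".py"), base / "__init__.py"):
                if candidate.is_file():
                    return candidate
    return None


def demo() -> dict[str, str | None]:
    root = Path(tempfile.mkdtemp())
    src = root / "src"
    generated = root / "generated"
    (src / "mylib").mkdir(parents=True)
    generated.mkdir()
    (src / "mylib" / "__init__.py").write_text("")
    (generated / "schema.py").write_text("FIELDS = []\n")
    (root / "setup.py").write_text("")  # outside src, so never listed

    # Hypothetical file content: one folder plus one file mapped to a module name.
    manifest = root / "STATIC_EXTRA.json"
    manifest.write_text(json.dumps([
        {"op": "add_folder", "path": str(src)},
        {"op": "add_file", "module": "mylib.schema",
         "path": str(generated / "schema.py")},
    ]))
    operations = json.loads(manifest.read_text())

    results: dict[str, str | None] = {}
    for name in ("mylib", "mylib.schema", "setup"):
        found = resolve_module(name, operations)
        results[name] = None if found is None else found.relative_to(root).as_posix()
    return results


if __name__ == "__main__":
    print(demo())
```

Nothing here touches `sys.path` or the import system, which is the property the proposal is after: the checker learns the layout while the interpreter keeps using its import hooks.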

Eric Traut

11:13 p.m.

Bernat, sorry to hear you were down with Covid. I hope you are feeling better now.

As you point out, this issue affects more than static type checkers. It affects all tooling that relies on static analysis. That includes language servers (like pylance), smart editors (like PyCharm), code reformatters (like black and isort), scanners for detecting code vulnerabilities and IP (code provenance) issues, etc. In other words, it affects a wide variety of tools that are collectively used by the majority of Python developers in their day-to-day work.

Supporting scenarios where code is not located in the file system is difficult for any static analysis tool. I don’t think this will ever work well. Thankfully, this is used very rarely today. The few developers who are relying on loading code from alternative locations like DBs presumably have very specific needs and understand the tradeoffs of the approach. My recommendation is that we not optimize for this case because it greatly complicates things. And even the best of solutions will fall apart at the edges. Let’s explore solutions for code that lives in the file system, and once we’re happy with that solution, we can look for ways that might partially accommodate the harder case.

I think your proposal for “STATIC_EXTRA.json” (or something similar) could work technically, but it will be a long time before all tools implement it. As you point out, adoption of support for “.pth” files is still incomplete many years after they were introduced.

The import mechanism in Python is already extremely complex for static analysis tools to support. We need to handle all of the idiosyncrasies of traditional and namespace packages, stub libraries, native binaries, zip-based eggs, and the complex prioritization rules dictated by PEP 561. Each one of these by itself is not that complex, but some of them interact in extremely complex ways. This is a continual source of bugs and inconsistencies across tools and between tools and the runtime. Adding yet more complexity by introducing something like “STATIC_EXTRA.json” should be done only if we think that it provides sufficient new value. I’m unconvinced in this case. You’ve indicated that “.pth” files have some shortcomings that this new mechanism would address, but the shortcomings that you’ve described don’t seem very significant or problematic.

You mentioned that `STATIC_EXTRA.json` could be generated at install or package build time. The same thing is true of a “.pth” file. The way this would function:

- for editable installation, the “.pth” can be (re)generated
- for generated files during the build, a “.pth” file can be generated or updated to point to the directory where they are generated
- for import hooks that load code from a DB, the content of the DB can be exploded to disk and a “.pth” file can point to them

I understand your point about the presence of “.pth” files changing the runtime behavior for import hooks that access code that lives outside of the file system. The small number of developers who want to store their code in DBs can decide whether the tradeoff is worth it.

Most static analysis tools already support “.pth” files. Jia mentioned that pyre doesn’t yet, but it should be straightforward for them to add this. My recommendation is that we rally behind “.pth” files. I think this approach is more pragmatic than introducing yet another mechanism that needs to be implemented by all of the tools in the ecosystem.

-Eric

-- Eric Traut Contributor to Pyright & Pylance Microsoft

Bernat Gabor

11:25 p.m.

Hello,

I think you severely underestimate how bad it is for code meant only for the static analysis tools to be part of the runtime interpreter. Your proposal effectively renders import hooks totally unusable, as they would be shadowed by whatever the pth file injects. Furthermore, you also ignore selective inclusion and exclusion of python files from within a folder (another problem PTH files cannot handle). Additionally, I'm aware that many core developers would prefer removing support for PTH files, so that also discourages me from using them. Also, pth files are not usable to inject generated code as part of another package (PTH files can only inject paths at the root of purelib/platlib, not under a given package). For all those reasons I don't consider your proposal a viable path ahead.

Thanks,
Bernat

On Sat, 7 May 2022, 17:14 Eric Traut, <eric@traut.com> wrote:

...


Guido van Rossum

12:10 a.m.

On Sat, May 7, 2022 at 4:14 PM Eric Traut <eric@traut.com> wrote:

...

Most static analysis tools already support “.pth” files. Jia mentioned that pyre doesn’t yet, but it should be straightforward for them to add this. My recommendation is that we rally behind “.pth” files. I think this approach is more pragmatic than introducing yet another mechanism that needs to be implemented by all of the tools in the ecosystem.

Hm. You're the first person I've met in a long time who regards PTH files as something we should do more of -- IIUC they are almost universally considered a necessary evil. Especially the fact that they can be used to do other stuff than add entries to sys.path. But also that -- a long sys.path tends to slow down module search, and it seems just wrong that adding N packages would append N sys.path entries: if we wanted each package to have its own entry we should have designed it as a mapping, not a search path.

TBH, if all we're trying to solve is editable installs, I am assuming that those would be used by a vanishingly small proportion of users (otherwise we're doing something else wrong with package installs), so I would personally be okay if an editable install wasn't visible to my IDE or type checker, or if those tools were to fall back to typeshed or a PEP 561 stub package.

Regarding your claim that almost nobody loads code from a database, apparently that's actually very popular in the banking sector ( https://calpaterson.com/bank-python.html) -- and those people have a *lot* of Python code.

I believe that VS Code is using (or going to use, or may use?) a Virtual Filesystem to access user code. Would it really be so terrible if there were some corner of the VFS namespace to map to the Python module (spec) namespace? It could be served from a Python process that can give you the source of a module given its full name (without executing the module, of course -- the import loader machinery should be able to guarantee that already).

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
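The "source of a module given its full name" part of this idea can be sketched with the standard import machinery. This is only an illustration, not a design for the VFS bridge: `source_for` is a hypothetical helper, loaders are not required to provide `get_source()`, and for dotted names `importlib.util.find_spec()` imports the parent packages, so only top-level lookups are guaranteed to run no module code.

```python
from __future__ import annotations

import importlib.util


def source_for(fullname: str) -> str | None:
    """Return a module's source text without executing the module itself."""
    spec = importlib.util.find_spec(fullname)
    if spec is None or spec.loader is None:
        return None  # unknown module, or nothing that could supply source
    get_source = getattr(spec.loader, "get_source", None)
    if get_source is None:
        return None  # this loader does not offer source access
    # Built-in and extension modules report None here: there is no Python source.
    return get_source(fullname)


if __name__ == "__main__":
    text = source_for("colorsys")
    print(text.splitlines()[0] if text else "no source available")
```

Because the lookup goes through whatever finders are installed on `sys.meta_path`, a hook that loads code from a database would be consulted as well, provided its loader implements `get_source()`.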

Bernat Gabor

12:29 a.m.

...

TBH, if all we're trying to solve is editable installs, I am assuming that those would be used by a vanishingly small proportion of users (otherwise we're doing something else wrong with package installs), so I would personally be okay if an editable install wasn't visible to my IDE or type checker, or if those tools were to fall back to typeshed or a PEP 561 stub package.

Generally, there are two types of people when developing libraries:

- those who put everything in the current working directory and rely on the fact that the current working directory is added to sys.path automatically; these people do not need editable installations, as the lib appears to be installed without requiring installation,
- those who prefer to put the library into a src folder and use editable installations to manifest the content of the src folder to the import system (either via pth files, symlinks or import hooks).

In my experience, there's roughly a 60 to 40 divide in favour of the first. More experienced/advanced users especially tend to prefer the src layout over the inline approach. Based on this, I don't think it's accurate to say that it's used by a vanishingly small proportion of users. I personally use it for all my projects. For why the src layout can be better than the inline source, see Paul Ganssle's blog post (https://blog.ganssle.io/tag/tox.html), which explains some of the problems (what's in your source directory is not necessarily what'll be put in your purelib/platlib during installation).

Also, I do not think that what we want to solve is editable installations per se. What I'd like to solve here is supporting import hooks for static checkers. Editable installations are just a good example of where import hooks can be used.

Sadly, while the db/dynamic code stub files can be used to work around the problem, for files that live on the disk just not under the site-packages folder, they're less useful.

All the best,
Bernat

On Sat, May 7, 2022 at 6:11 PM Guido van Rossum <guido@python.org> wrote:

...


Guido van Rossum

1:02 a.m.

On Sat, May 7, 2022 at 17:30 Bernat Gabor <gaborjbernat@gmail.com> wrote:

...

TBH, if all we're trying to solve is editable installs, I am assuming that those would be used by a vanishingly small proportion of users (otherwise we're doing something else wrong with package installs), so I would personally be okay if an editable install wasn't visible to my IDE or type checker, or if those tools were to fall back to typeshed or a PEP 561 stub package.

Generally, there are two types of people when developing libraries:

- those who put everything in the current working directory and rely on the fact that the current working directory is added to sys.path automatically; these people do not need editable installations, as the lib appears to be installed without requiring installation,
- those who prefer to put the library into a src folder and use editable installations to manifest the content of the src folder to the import system (either via pth files, symlinks or import hooks).

In my experience, there's roughly a 60 to 40 divide in favour of the first. More experienced/advanced users especially tend to prefer the src layout over the inline approach. Based on this, I don't think it's accurate to say that it's used by a vanishingly small proportion of users. I personally use it for all my projects. For why the src layout can be better than the inline source, see Paul Ganssle's blog post (https://blog.ganssle.io/tag/tox.html), which explains some of the problems (what's in your source directory is not necessarily what'll be put in your purelib/platlib during installation).

Ah, that’s not what I meant. I meant very few users are developing packages. E.g. there are millions of pandas users but few of them are modifying pandas. (I hope. :-)

Also, I do not think that what we want to solve is editable installations per se. What I'd like to solve here is supporting import hooks for static checkers. Editable installations are just a good example of where import hooks can be used.

Agreed.

Sadly, while the db/dynamic code stub files can be used to work around the problem, for files that live on the disk just not under the site-packages folder, they're less useful.

Not sure I follow.

--Guido

All the best,
Bernat


--

--Guido (mobile)

Bernat Gabor

1:26 a.m.

Sadly, while the db/dynamic code stub files can be used to work around the problem, for files that live on the disk just not under the site-packages folder, they're less useful.

What I mean is that if I have a SRC folder on the disk and I'm using import hooks to expose its files to the python interpreter as if they were installed into the purelib/platlib locations, I cannot use the stub mechanism to expose these files to the static checkers. At best I could take all python files and copy them as pyi files into a stub package. I'd like to get to a world where editable installations (even when implemented via import hooks) mostly work in IDEs. This will make the life of these library developers significantly easier. And while I agree users of a library are likely two orders of magnitude more numerous than its maintainers, I think we can also agree that for a sustainable ecosystem the life of maintainers should also be as easy as possible. Maintaining a library is already hard; let's not make it harder by taking away the convenience of autocomplete in IDEs.

On Sat, 7 May 2022, 19:02 Guido van Rossum, <guido@python.org> wrote:

...


Bernat Gabor

1:41 a.m.

...

If there are serious discussions about deprecating ".pth" files, have there been any proposals for a replacement mechanism? Bernat, do you think that the "STATIC_EXTRA.json" mechanism you have in mind could be that replacement?

No, that's not my goal here. I think the primary reason people stay away from import hooks is that IDEs don't support them. I know many people in my work team spoke up against adopting PEP-660 when they found out it would lose auto-imports.

Very few people make the effort to work on improving the existing systems. The majority just work around the problem by using the old solutions (e.g. PTH files) with all their problems. Every so often they run into an issue caused by it, but then just write off the few wasted hours to not having better solutions available. I think the same is true of loading python modules from a database. Bigger companies might have a proprietary custom plugin that works around the issue, and the rest just live with what they have; they are well aware that if company Y wants to load code from DB X, the likelihood of pylance adding support for it is close to zero. After all, it's a custom DB and a custom checker.

Finally, for me personally as a packaging tools maintainer, it would be very important that what we come up with here does not impact the interpreter at runtime (that should keep using the dynamic import hook). However, for ease of development of libraries, I'd like libraries to have a mechanism to feed their snapshot of dynamic code to the static checker. Otherwise extending and writing a library would become even harder, and we're already struggling to find maintainers for projects. I'm OK with this better solution taking longer than our current workarounds (after all, until we get there we can keep using the current workarounds).

Bernat

On Sat, May 7, 2022 at 7:26 PM Bernat Gabor <gaborjbernat@gmail.com> wrote:

...

Sadly, while the db/dynamic code stub files can be used to work around the problem, they're less useful for files that live on the disk, just not under the site-packages folder.

What I mean is that if I have a SRC folder on the disk and I'm using import hooks to expose its files to the python interpreter as if they were installed into the purelib/platlib locations, I cannot use the stub mechanism to expose these files to the static checkers. At best I could take all python files and copy them as pyi files into a stub package. I'd like to get to a world where editable installations (even when implemented via import hooks) mostly work in IDEs. This will make the life of these library developers significantly easier. And while I agree that users of a library likely outnumber its maintainers by two orders of magnitude, I think we can also agree that for a sustainable ecosystem the life of maintainers should also be as easy as possible. Maintaining a library is already hard; let's not make it harder by taking away the convenience of autocomplete in IDEs.
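To make the scenario above concrete, here is a minimal sketch of an import hook that exposes files from an arbitrary location through an explicit module-to-path mapping. The class name `MappingFinder` and the mapping contents are hypothetical and are not taken from PEP 660 or any packaging tool; the sketch handles only single-file modules, not packages.

```python
import importlib.abc
import importlib.util
import sys


class MappingFinder(importlib.abc.MetaPathFinder):
    """Resolve imports from an explicit {module name: file path} mapping.

    Hypothetical illustration only: real editable-install hooks are
    more involved (packages, namespace packages, exclusions, ...).
    """

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def find_spec(self, fullname, path=None, target=None):
        location = self.mapping.get(fullname)
        if location is None:
            # Not ours: let the remaining finders on sys.meta_path try.
            return None
        # Build a spec whose loader reads the source from `location`.
        return importlib.util.spec_from_file_location(fullname, location)


def install(mapping):
    """Put a MappingFinder in front of the default finders and return it."""
    finder = MappingFinder(mapping)
    sys.meta_path.insert(0, finder)
    return finder
```

At runtime the interpreter consults the finder on every import, while a static checker never runs it. Because the mapping here is plain data, the same dictionary could in principle be written out for a checker to read, which is the kind of snapshot discussed in this thread.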

On Sat, 7 May 2022, 19:02 Guido van Rossum, <guido@python.org> wrote:

...
On Sat, May 7, 2022 at 17:30 Bernat Gabor <gaborjbernat@gmail.com> wrote:

...
TBH, if all we're trying to solve is editable installs, I am assuming that those would be used by a vanishingly small proportion of users (otherwise we're doing something else wrong with package installs), so I would personally be okay if an editable install wasn't visible to my IDE or type checker, or if those tools were to fall back to typeshed or a PEP 561 stub package.

Generally, there are two types of people when developing libraries:

- those who put everything in the current working directory and rely on the fact that the current working directory is added to sys.path automatically; these people do not need editable installations, as the lib appears to be installed without requiring installation,
- those who prefer to put the library into a src folder and use editable installations to manifest the content of the src folder to the import system (either via pth files, symlinks or import hooks).

In my experience, there's roughly a 60 to 40 divide in favour of the first. More experienced/advanced users especially tend to prefer the src layout over the inline approach. Based on this, I don't think it's accurate to say that it's used by a vanishingly small proportion of users. I personally use it for all my projects; for why the src layout can be better than the inline layout, see Paul Ganssle's blog post at https://blog.ganssle.io/tag/tox.html, which explains some of the problems (what's in your source directory is not necessarily what'll be put in your purelib/platlib during installation).

Ah, that’s not what I meant. I meant very few users are developing packages. E.g. there are millions of pandas users but few of them are modifying pandas. (I hope. :-)

Also, I do not think that what we want to solve is editable installations per se. What I'd like to solve here is supporting import hooks for static checkers. Editable installations are just a good example of where import hooks can be used.

Agreed.

Sadly, while the db/dynamic code stub files can be used to work around the problem; for files that live on the disk just not under the site-packages folder, they're less useful.

Not sure I follow.

—Guido

All the best,

Bernat

On Sat, May 7, 2022 at 6:11 PM Guido van Rossum <guido@python.org> wrote:

...
On Sat, May 7, 2022 at 4:14 PM Eric Traut <eric@traut.com> wrote:

...
Most static analysis tools already support “.pth” files. Jia mentioned that pyre doesn’t yet, but it should be straightforward for them to add this. My recommendation is that we rally behind “.pth” files. I think this approach is more pragmatic than introducing yet another mechanism that needs to be implemented by all of the tools in the ecosystem.

Hm. You're the first person I've met in a long time who regards PTH files as something we should do more of -- IIUC they are almost universally considered a necessary evil. Especially the fact that they can be used to do other stuff than add entries to sys.path. But also that -- a long sys.path tends to slow down module search, and it seems just wrong that adding N packages would append N sys.path entries: if we wanted each package to have its own entry we should have designed it as a mapping, not a search path.

TBH, if all we're trying to solve is editable installs, I am assuming that those would be used by a vanishingly small proportion of users (otherwise we're doing something else wrong with package installs), so I would personally be okay if an editable install wasn't visible to my IDE or type checker, or if those tools were to fall back to typeshed or a PEP 561 stub package.

Regarding your claim that almost nobody loads code from a database, apparently that's actually very popular in the banking sector ( https://calpaterson.com/bank-python.html) -- and those people have a *lot* of Python code.

I believe that VS Code is using (or going to use, or may use?) a Virtual Filesystem to access user code. Would it really be so terrible if there were some corner of the VFS namespace to map to the Python module (spec) namespace? It could be served from a Python process that can give you the source of a module given its full name (without executing the module, of course -- the import loader machinery should be able to guarantee that already).
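As an illustration of the last point, the standard import machinery can already return a module's source text without executing that module. The following is a minimal sketch, assuming the module's loader implements the optional `get_source` method; the helper name `get_module_source` is made up for this example. Note that `importlib.util.find_spec` imports parent packages when given a dotted name, so only the target module itself is guaranteed not to run.

```python
import importlib.util


def get_module_source(fullname):
    """Return the source text of `fullname` without executing it, or None.

    Hypothetical helper: it asks the import system for the module's spec
    and then asks the spec's loader for the source.
    """
    try:
        spec = importlib.util.find_spec(fullname)
    except (ImportError, ValueError):
        # A missing parent package, or a module without a usable spec.
        return None
    if spec is None or spec.loader is None:
        return None
    # get_source is optional; built-in and extension modules have no source.
    get_source = getattr(spec.loader, "get_source", None)
    if get_source is None:
        return None
    try:
        return get_source(fullname)
    except ImportError:
        return None
```

A process serving a virtual filesystem could call such a helper for each requested module name, and this would also cover modules supplied by custom import hooks, provided their loaders expose `get_source`.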

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>


--

--Guido (mobile)

Mikhail Golubev

3:45 p.m.

Hi everyone! Sorry for joining the conversation *that* late.

In short, PyCharm does support the current means of editable installs (.pth files) and does it very similarly to what Eric described for Pyright. Namely, we have an interpreter "introspection" phase when we dynamically evaluate the content of sys.path with a helper script, as well as generate stubs for binary modules, retrieve the list of installed packages, etc. Because extra roots registered with .pth files automatically become visible in sys.path, we recognize them without any special handling.

Although I'm not a huge fan of the idea, it is possible to add an extra step in this phase for checking either some physical file with metadata, such as STATIC_EXTRA.json, suggested by Bernat, or even another runtime value in addition to sys.path.

Two questions, though. When should this process be repeated? For .pth files, we rely on regular filesystem events from under site-packages. Am I right that STATIC_EXTRA.json should appear under the "{package}-{version}.dist-info" directory, so we can expect that if it ever changes it will trigger another filesystem event in a known location?

...

who prefer to put the library into a src folder and use editable installations to manifest the content of the src folder to the import system (either via pth files, symlinks or import hooks).

This is a bit confusing to me. Aren't editable installs, as they are, too complicated a mechanism just for exposing the content of the "src" directory in sys.path? I believe that existing tooling offers plenty of options for altering the content of sys.path before running an executable script or tests, e.g. in PyCharm we automatically add the directories marked as "source roots" to PYTHONPATH before launching project run configurations, in tox.ini it's possible to prepend an extra directory to PYTHONPATH before launching tests, etc.

There must be something else that editable installs are necessary for, and that "flat" projects, without the "src" subfolder, cannot achieve automatically. Is it extra steps in a package's setup.py, or working with multi-package projects where a few libraries are being developed simultaneously and share the same virtual environment? Could someone give me more context, please?

On Sun, May 8, 2022 at 4:42 AM Bernat Gabor <gaborjbernat@gmail.com> wrote:

...


-- Mikhail Golubev Software Developer JetBrains http://www.jetbrains.com The drive to develop
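The introspection phase Mikhail describes above (evaluate sys.path in the target interpreter, then optionally check a metadata file) could be sketched roughly as follows. This is an assumption-laden illustration, not PyCharm's actual helper script: the function name `collect_roots` is made up, and since no format for STATIC_EXTRA.json was agreed in this thread, the sketch assumes a flat JSON object mapping module names to file paths, stored under each `*.dist-info` directory.

```python
import json
import sys
from pathlib import Path


def collect_roots(search_paths=None):
    """Report import roots plus any hypothetical STATIC_EXTRA.json mappings.

    Assumed file format (not standardized anywhere): a JSON object of
    {"module.name": "/path/to/file.py"} inside "<name>-<version>.dist-info".
    """
    search_paths = list(sys.path if search_paths is None else search_paths)
    static_extra = {}
    for entry in search_paths:
        root = Path(entry)
        if not root.is_dir():
            continue  # zip files and missing entries are skipped in this sketch
        for meta in sorted(root.glob("*.dist-info/STATIC_EXTRA.json")):
            try:
                data = json.loads(meta.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                continue  # unreadable or malformed file: ignore it
            if isinstance(data, dict):
                static_extra.update(data)
    return {"sys_path": search_paths, "static_extra": static_extra}
```

An IDE could run something like `print(json.dumps(collect_roots()))` inside the project's interpreter and parse the output. Keeping the file in a known location under site-packages is what would let the existing filesystem watchers trigger a refresh.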

Eric Traut

1:20 a.m.

To be clear, I'm not a big fan of ".pth" files either, but it is a mechanism that already exists today and has broad (although admittedly not universal) support.

I wasn't aware of any discussions about deprecating support for ".pth" files. If that's the case, then I agree it's not a good idea to expand their use.

If there are serious discussions about deprecating ".pth" files, have there been any proposals for a replacement mechanism? Bernat, do you think that the "STATIC_EXTRA.json" mechanism you have in mind could be that replacement? And do you think it would be embraced by the broader community, including core developers and the packaging community? I'd prefer to avoid the situation where we need to support ".pth" files and "STATIC_EXTRA.json" files and yet some other new mechanism that eventually replaces ".pth" files.

I want to make sure everyone understands that truly dynamic behaviors implemented in an import hook will not work well with static analysis tools. We need file locations that are described in a declarative manner, whether in a ".pth" file, a "STATIC_EXTRA.json" file, in the "sys.path" list, or some other static form.

I get nervous when the stated goal is to support import hooks more broadly and expand their usage. I understand the appeal of the flexibility provided by import hooks, but that flexibility comes at a cost. Rather than stating that our goal is to support import hooks generally, we might be better served by looking at specific use cases and solving for those.

...

Regarding your claim that almost nobody loads code from a database, apparently that's actually very popular in the banking sector.

I'm basing that assertion on the fact that we haven't been contacted by any pylance or pyright users who have requested that we support this. I can find only one request that is related to import hooks (but doesn't talk about loading code from DBs) in the mypy issue tracker. I conclude that this usage is either really uncommon, there is little or no intersection between developers who load code from DBs and people who use type checkers and language servers, or developers who use dynamic loading mechanisms already understand the tradeoffs.

...

I believe that VS Code is using (or going to use, or may use?) a Virtual Filesystem to access user code.

My point is that all static tool developers will also need to implement this new mechanism. If it provides enough value over existing mechanisms, then it might be worth it. If the incremental value is low or it addresses only rare or niche use cases, then it's probably better to stick with existing mechanisms.

-Eric

Jelle Zijlstra

1:25 a.m.

On Sat, 7 May 2022 at 18:21, Eric Traut (<eric@traut.com>) wrote:

...

To be clear, I'm not a big fan of ".pth" files either, but it is a mechanism that already exists today and has broad (although admittedly not universal) support.

I wasn't aware of any discussions about deprecating support for ".pth" files. If that's the case, then I agree it's not a good idea to expand their use.

I think the behavior that makes people uneasy about .pth files is the fact that lines starting with "import " are exec-ed (as documented at https://docs.python.org/3/library/site.html). That behavior has some clear security and predictability issues and is hard for static type checkers to follow. However, I don't think the .pth file behavior that simply adds directories to sys.path is as controversial.
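The distinction Jelle draws can be shown with a small sketch of how a static tool might read a .pth file, following the rules in the `site` documentation: blank lines and lines starting with `#` are skipped, lines starting with `import` followed by a space or tab are executed by the interpreter, and every other line is a path relative to the directory containing the .pth file. The function name `classify_pth_lines` is hypothetical. Unlike `site`, this sketch does not check that a path exists and never executes anything.

```python
from pathlib import Path


def classify_pth_lines(text, base_dir):
    """Split .pth content into declarative paths and executable lines.

    Returns (paths, executable): `paths` are the entries a static tool
    can safely honor; `executable` are the "import ..." lines that the
    interpreter would exec and that a static tool cannot follow.
    """
    paths, executable = [], []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        if line.startswith(("import ", "import\t")):
            executable.append(line)  # recorded, never executed here
            continue
        # Relative entries are resolved against the .pth file's directory;
        # absolute entries are kept as they are.
        paths.append(str(Path(base_dir, line)))
    return paths, executable
```

A checker that honors only the first list gets the uncontroversial "add a directory" behavior while ignoring the exec-ed lines.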

Ofek Lev

August 2022

1:51 p.m.

I think whatever solution we decide on should use a new standard directory in the shared location https://peps.python.org/pep-0427/#the-data-directory


22 comments

8 participants

participants (8)

Bernat Gabor
Eric Traut
Guido van Rossum
Jelle Zijlstra
Jia Chen
Mikhail Golubev
Ofek Lev
Paul Moore
