Add -P command line option to not add sys.path[0]
Hi,

There are 4 main ways to run Python:

(1) python -m module [...]
(2) python script.py [...]
(3) python -c code [...]
(4) python [...]

(1) and (2) insert the directory of the module/script at sys.path[0]. (3) and (4) insert an empty string at sys.path[0].

This behavior is convenient and is maybe part of Python's usability success: importing a module in the current directory is as easy as "import other_module" (load other_module.py). But it's also a threat to security: an attacker can override a stdlib module by creating a Python script with the same name as a stdlib module, like os.py or shutil.py.

People learning Python commonly create a file with the same name as a stdlib module (ex: random.py) and are then clueless when faced with an ImportError exception.

Changing the default behavior was discussed multiple times. No consensus was reached, maybe because most users like the current default behavior and are not affected by corner cases (see below).

I propose adding a -P option to Python command line interface to "not add sys.path[0]": https://github.com/python/cpython/pull/31542

See the documentation in the PR for the exact behavior of this option. I prefer to add an environment variable, only pass the option explicitly on the command line.

Since Python 3.4, there is already the -I ("isolated mode") option: https://docs.python.org/dev/using/cmdline.html#cmdoption-I

The -I option has other effects like disabling user site directories, so this option doesn't fit the use cases of the -P option.

One annoying issue of the Python default behavior is that running a script in /usr/bin/ as root can create or override .pyc files in the /usr directory, even in the /usr/bin/ directory. Example of this surprising and annoying issue: https://github.com/benjaminp/six/issues/359#issuecomment-996159668

The -P option can be used in a #!/usr/bin/python shebang to avoid this issue.

--

An alternative would be to change the default behavior to not add sys.path[0], and add an option to opt in to the Python 3.10 behavior. Here are my notes about it: https://github.com/vstinner/misc/blob/main/cpython/pep_path0.rst

What do you think?

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
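The default behavior described in this message can be observed from a parent process. The following is only a minimal sketch: the helper names and the throwaway script name (show_path0.py) are made up for illustration, and PYTHONSAFEPATH is stripped from the child environment so that a newer interpreter shows its default behavior.

```python
import os
import subprocess
import sys
import tempfile


def _default_env():
    # Drop PYTHONSAFEPATH so the child shows the default behavior (assumption:
    # no other site customization changes sys.path[0]).
    return {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}


def path0_with_dash_c():
    """Return repr(sys.path[0]) as seen by 'python -c code'."""
    proc = subprocess.run(
        [sys.executable, "-c", "import sys; print(repr(sys.path[0]))"],
        env=_default_env(), capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip()


def path0_with_script():
    """Return (script directory, sys.path[0]) as seen by 'python script.py'."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "show_path0.py")  # hypothetical file name
        with open(script, "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")
        proc = subprocess.run(
            [sys.executable, script],
            env=_default_env(), capture_output=True, text=True, check=True,
        )
        # Compare resolved paths, since the temp directory may sit behind a symlink.
        return os.path.realpath(tmp), os.path.realpath(proc.stdout.strip())


if __name__ == "__main__":
    print("python -c   :", path0_with_dash_c())
    print("python file :", path0_with_script())
```

With default settings, the first call should report an empty string and the second should report the directory that contains the script.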
On 4/26/22, Victor Stinner <vstinner@python.org> wrote:
> There are 4 main ways to run Python:
>
> (1) python -m module [...]
> (2) python script.py [...]
> (3) python -c code [...]
> (4) python [...]
>
> (1) and (2) insert the directory of the module/script at sys.path[0].

Running a module with -m inserts the current working directory (the path, not an empty string) at sys.path[0], followed by the module directory at sys.path[1]. Only one entry is added if they're the same directory.
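The first part of this correction (the working directory at sys.path[0] under -m) can be checked with a short sketch. The function name and the throwaway module name (show_path0) are assumptions made for the example, and only sys.path[0] is inspected, not sys.path[1].

```python
import os
import subprocess
import sys
import tempfile


def path0_for_dash_m():
    """Run a throwaway module with -m; return (cwd, sys.path[0]) seen by the child."""
    # Drop PYTHONSAFEPATH so the child shows the default behavior.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module that reports its own view of cwd and sys.path[0].
        with open(os.path.join(tmp, "show_path0.py"), "w") as f:
            f.write("import os, sys\nprint(os.getcwd())\nprint(sys.path[0])\n")
        proc = subprocess.run(
            [sys.executable, "-m", "show_path0"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    cwd, path0 = proc.stdout.splitlines()[:2]
    return cwd, path0


if __name__ == "__main__":
    cwd, path0 = path0_for_dash_m()
    print("cwd        :", cwd)
    print("sys.path[0]:", path0)
```

The child prints its own os.getcwd() next to sys.path[0], so the two values can be compared without worrying about how the temporary directory path is spelled in the parent.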
On Tue, Apr 26, 2022 at 2:50 AM Victor Stinner <vstinner@python.org> wrote:
> There are 4 main ways to run Python:
>
> (1) python -m module [...]
> (2) python script.py [...]
> (3) python -c code [...]
> (4) python [...]
>
> (1) and (2) insert the directory of the module/script at sys.path[0]. (3) and (4) insert an empty string at sys.path[0].
>
> [...]
>
> I propose adding a -P option to Python command line interface to "not add sys.path[0]": https://github.com/python/cpython/pull/31542

We would use this in the Python extension for VS Code for case (1) as we have had issues with (1) when running tools on people's behalf. People will accidentally shadow the stdlib and then do something unexpected as an import side-effect in their shadowing module like delete files. Not something you want happening when you're just wanting to run Pylint. 😅

-Brett

> [...]
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-leave@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IU5Q2AXA...
> Code of Conduct: http://python.org/psf/codeofconduct/
On 4/26/2022 10:46 AM, Victor Stinner wrote:
> I propose adding a -P option to Python command line interface to "not add sys.path[0]": https://github.com/python/cpython/pull/31542
>
> See the documentation in the PR for the exact behavior of this option. I prefer to add an environment variable, only pass the option explicitly on the command line.

Another viable option might be to add an option to imply "import site", which would work together with -I to:

* ignore environment variables (-E/-I)
* omit implicit CWD imports (-I)
* still process .pth files (-?)
* still include site-packages and user site-packages in sys.path (-?)

It seems likely that the proposed -P would almost always be used with -E, since if you can't control CWD then you presumably can't control environment variables either.

The existing ._pth functionality starts by implying -I, and allows "import site" in the file to explicitly include site. A command-line option matching this behaviour would be consistent. There's also already configuration in our structures for import site, so there'd be no need to add new fields to public APIs for the option.

The biggest issue I see is that the obvious command line options for "import site" are already used to imply "do not import site". But then, -P isn't obvious either. Maybe an -X option would suffice?

Cheers,
Steve
The only purpose of the proposed -P option is to "not add sys.path[0]". There are use cases which only need that.

Victor
The use case for -P still uses environment variables like PYTHONWARNINGS or PYTHONUTF8. That's why -I (isolated) cannot be used.

If there is a use case for a ._pth file importing the site module, maybe a different option can be added, no? Adding -P doesn't prevent that. But it seems like the use cases are different enough to justify two options, no?

Victor
Ah, I see that I didn't explain something well. The issue has two sides: one side is fixing a security vulnerability, the second side is more about Python *usability*.

The Python usability issue is that running "math.py" overrides the Python stdlib module called "math". "math.py" is just an example; Python 3.11 contains 305 modules: a name conflict is likely, especially since users learning Python fall into this trap (create a file called "random.py" to play with the "random" module).

The issue is so common that IPython added a "launcher" to work around it by removing sys.path[0]:
https://github.com/ipython/ipykernel/commit/3f7a03d356eee3500261acf9eec6bad2...

Another example is the pytest project: running "pytest [...]" is different from "python -m pytest [...]". The second command adds the current working directory, which can change the test behavior. It's a documented issue:
https://docs.pytest.org/en/latest/how-to/usage.html#calling-pytest-through-p...

For the pytest use case, you still want to add the user site directory to sys.path and you still want to accept PYTHON environment variables like PYTHONWARNINGS. The only thing that you don't want is to add the current working directory to sys.path.

Read also this discussion by Miro Hrončok about this usability issue in Fedora:
https://discuss.python.org/t/python-flag-envvar-to-not-put-current-directory...

Victor
On Tue, Apr 26, 2022 at 8:37 PM Steve Dower <steve.dower@python.org> wrote:
> The biggest issue I see is that the obvious command line options for "import site" are already used to imply "do not import site". But then, -P isn't obvious either. Maybe an -X option would suffice?

I propose the short option "-P" rather than a long option like -X dont_add_path0 to be able to use the option in a Unix shebang. Sadly, Unix shebangs don't support long options (it's only possible to use a single long option per shebang, which is not convenient).

Victor
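To illustrate the shebang constraint, here is a rough model of how Linux is generally understood to build the argument list from a "#!" line: the interpreter path, then everything else on the line as one single argument, then the script path. This is an assumption-laden sketch, not a description of every Unix (other systems split the line differently), and linux_shebang_argv is a made-up helper name.

```python
def linux_shebang_argv(shebang, script_path):
    """Model the argv Linux passes to the interpreter for a '#!' line.

    Assumption: the rest of the line after the interpreter is passed as ONE
    argument, with no further splitting on whitespace.
    """
    if not shebang.startswith("#!"):
        raise ValueError("not a shebang line")
    parts = shebang[2:].strip().split(None, 1)
    if not parts:
        raise ValueError("shebang has no interpreter")
    argv = [parts[0]]
    if len(parts) == 2:
        argv.append(parts[1])  # the whole remainder, kept as a single argument
    argv.append(script_path)
    return argv


if __name__ == "__main__":
    # A short option survives intact.
    print(linux_shebang_argv("#!/usr/bin/python3 -P", "/usr/bin/tool"))
    # Several options collapse into one argument containing spaces.
    print(linux_shebang_argv("#!/usr/bin/python3 -X dont_add_path0 -E", "/usr/bin/tool"))
```

Under this model, short options can be bundled into one argument, while a long option combined with anything else arrives as a single string containing spaces.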
On Tue, Apr 26, 2022 at 11:46 AM Victor Stinner <vstinner@python.org> wrote:
> I propose adding a -P option to Python command line interface to "not add sys.path[0]": https://github.com/python/cpython/pull/31542

My plan is to merge this change on 2022-05-05, the day before the Python 3.11 feature freeze, unless someone has a good reason to not add this option or if you consider that we need more time to think about this issue.

This issue has been discussed for 11 years, see for example:

* https://bugs.python.org/issue13475
* https://discuss.python.org/t/python-flag-envvar-to-not-put-current-directory...

Victor
On Wed, 27 Apr 2022 at 15:32, Victor Stinner <vstinner@python.org> wrote:
> On Tue, Apr 26, 2022 at 11:46 AM Victor Stinner <vstinner@python.org> wrote:
> > I propose adding a -P option to Python command line interface to "not add sys.path[0]": https://github.com/python/cpython/pull/31542
>
> My plan is to merge this change at 2022-05-05, the day before the Python 3.11 feature freeze,

Why leave it until the last minute? That just makes it harder to revert if someone immediately finds a problem with it.

> unless someone has a good reason to not add this option or if you consider that we need more time to think about this issue.
>
> This issue is being discussed for 11 years, see for example:

It seems very rushed to propose this and implement it days before 3.11 freeze. If it's been an issue for 11 years, then (a) why didn't anyone propose this solution months ago, and (b) surely it can wait another year?

I don't have any particular objection to the feature, but whether you mean it to or not, the short timescale gives the impression that you're trying to rush something in without giving people time to discuss or consider alternatives. People have other things to do, and can't simply produce a response in a matter of hours. Apart from anything else, aren't a lot of people who might be interested going to be occupied with PyCon right now?

Steve Dower mentioned some (IMO) reasonable points, and you pretty much dismissed them without any discussion. That doesn't seem like the right way to handle this. In particular, I think the question of how this flag interacts with all the other flags and settings that affect sys.path and how "isolated" Python is, is an important thing to consider[^1].

Paul

[^1]: We've had multiple attempts to get locale and UTF8 handling right, and now have a mess of flags, environment variables, and options that frankly only the experts can understand. I fear that we could end up with the same issue for "Python initialisation flags" if we don't take a less rushed approach.
Since I didn't get any serious review on my pull request since February, I created this thread on python-dev to get more people looking into this issue.

On Wed, Apr 27, 2022 at 5:30 PM Paul Moore <p.f.moore@gmail.com> wrote:
> On Wed, 27 Apr 2022 at 15:32, Victor Stinner <vstinner@python.org> wrote:
> > On Tue, Apr 26, 2022 at 11:46 AM Victor Stinner <vstinner@python.org> wrote:
> > > I propose adding a -P option to Python command line interface to "not add sys.path[0]": https://github.com/python/cpython/pull/31542
> >
> > My plan is to merge this change at 2022-05-05, the day before the Python 3.11 feature freeze,
>
> Why leave it until the last minute? That just makes it harder to revert if someone immediately finds a problem with it.

I wrote my PR in February. If it goes wrong, we will have until October to revert it. The idea is to merge it before beta1 to have 6 months to play with it and check for corner cases.

> It seems very rushed to propose this and implement it days before 3.11 freeze. If it's been an issue for 11 years, then (a) why didn't anyone propose this solution months ago, and (b) surely it can wait another year?

Different solutions were proposed over the last 11 years. See for example: https://bugs.python.org/issue13475

Sadly, no solution was merged into Python, only discussed.

> Steve Dower mentioned some (IMO) reasonable points, and you pretty much dismissed them without any discussion. That doesn't seem like the right way to handle this. In particular, I think the question of how this flag interacts with all the other flags and settings that affect sys.path and how "isolated" Python is, is an important thing to consider[^1].

See the init_config.rst documentation of my PR: isolated=1 implies add_path0=0 (no behavior change)
https://github.com/python/cpython/pull/31542/files

Running Python with a ._pth file implies isolated=1 and so add_path0=0 (no behavior change).

It seems like Steve's proposal is orthogonal, but I don't think that it's exclusive. We can add a second option, no?

> [^1]: We've had multiple attempts to get locale and UTF8 handling right, and now have a mess of flags, environment variables, and options that frankly only the experts can understand. I fear that we could end up with the same issue for "Python initialisation flags" if we don't take a less rushed approach.

The locale encoding, the Python filesystem encoding and the Python UTF-8 Mode are way more complicated problems. I spent years fixing issues about these, so I'm well aware of these issues. By the way, I also designed the PEP 587 PyConfig API and I implemented it.

Here the -P option effect is restricted to a single function: pymain_run_python(). My pull request can be summarized as:

    - else if (!config->isolated) {
    + else if (config->add_path0) {

Do you think that we should pay attention to something in specific? Right now, I propose to not add an environment variable and -P is unrelated to -E (ignore env vars).

Victor
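The one-line change summarized in this message can be modelled in Python to show why it is described as having no behavior change for isolated mode. This is a sketch only: the Config class and both function names are invented for illustration, and it assumes just the two facts stated in the thread (the condition moves from "not isolated" to "add_path0", and isolated=1 implies add_path0=0). It is not the CPython implementation.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Toy stand-in for the two configuration fields named in the thread."""
    isolated: bool = False
    add_path0: bool = True

    def __post_init__(self):
        # Per the thread: isolated=1 implies add_path0=0.
        if self.isolated:
            self.add_path0 = False


def prepends_path0_before(config):
    """Old condition: 'else if (!config->isolated)'."""
    return not config.isolated


def prepends_path0_after(config):
    """New condition: 'else if (config->add_path0)'."""
    return config.add_path0


if __name__ == "__main__":
    for cfg in (Config(), Config(isolated=True), Config(add_path0=False)):
        print(cfg, prepends_path0_before(cfg), prepends_path0_after(cfg))
```

In this model the old and new conditions agree for the default and isolated configurations, and differ only when add_path0 is switched off on its own, which is the case the proposed -P option covers.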
On Wed, 27 Apr 2022 at 16:50, Victor Stinner <vstinner@python.org> wrote:
> Since I didn't get any serious review on my pull request since February, I created this thread on python-dev to get more people looking into this issue.

Pull requests don't get much visibility from the wider community - I know I can't keep up with all the PRs submitted. Creating a thread on python-dev seems reasonable. Doing so with about a week until feature freeze seems less so.

I'm not going to comment further on this specific proposal, except maybe to respond to other people's comments. I don't have enough time in the coming week to think through the implications and possibilities myself.

Paul
Oh sorry, I mean that I prefer to *not* add an environment variable, but I'm not strongly against it.

How would the environment variable be used? A command line option is not enough?

Victor

On Wed, Apr 27, 2022 at 4:44 PM Antoine Pitrou <antoine@python.org> wrote:
> On Tue, 26 Apr 2022 11:46:41 +0200 Victor Stinner <vstinner@python.org> wrote:
> > I prefer to add an environment variable, only pass the option explicitly on the command line.
>
> I don't really understand this sentence, can you rephrase?
>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-leave@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/L4KBLOSE...
> Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, 27 Apr 2022 17:37:20 +0200 Victor Stinner <vstinner@python.org> wrote:
> Oh sorry, I mean that I prefer to *not* add an environment variable, but I'm not strongly against it.
>
> How would the environment variable be used? A command line option is not enough?

An environment variable is an easy way to influence a program or system whose inner workings you don't control (for example a system that spawns child Python processes). And it sounds like a good idea to allow that given that it improves security?

Regards

Antoine.
On Wed, Apr 27, 2022 at 5:56 PM Antoine Pitrou <antoine@python.org> wrote:
> An environment variable is an easy way to influence a program or system whose inner workings you don't control (for example a system that spawns child Python processes). And it sounds like a good idea to allow that given that it improves security?

Ok, you changed my mind and I added PYTHONDONTADDPATH0=1 env var. Example:

$ ./python -c 'import sys, pprint; pprint.pprint(sys.path)'
['',
 '/usr/local/lib/python311.zip',
 '/home/vstinner/python/main/Lib',
 '/home/vstinner/python/main/build/lib.linux-x86_64-3.11-pydebug',
 '/home/vstinner/.local/lib/python3.11/site-packages']

$ PYTHONDONTADDPATH0=1 ./python -c 'import sys, pprint; pprint.pprint(sys.path)'
['/usr/local/lib/python311.zip',
 '/home/vstinner/python/main/Lib',
 '/home/vstinner/python/main/build/lib.linux-x86_64-3.11-pydebug',
 '/home/vstinner/.local/lib/python3.11/site-packages']

Victor
On 27. 04. 22 20:45, Barry wrote:
> On 27 Apr 2022, at 17:22, Victor Stinner <vstinner@python.org> wrote:
> > Ok, you changed my mind and I added PYTHONDONTADDPATH0=1 env var. Example:
>
> Maybe the env var say what it is not adding rather than where it adds it. PYTHONDONTADDPWD=1

But it is not "just" the PWD. In the case of shebangs, it's actually the script's directory. E.g. a script in /usr/bin/ normally has /usr/bin/ in sys.path (which is not desired, hence we (Fedora) would probably add the -P flag to default shebangs for programs in /usr/bin/).

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
On 27 Apr 2022, at 20:21, Miro Hrončok <mhroncok@redhat.com> wrote:
> On 27. 04. 22 20:45, Barry wrote:
> > On 27 Apr 2022, at 17:22, Victor Stinner <vstinner@python.org> wrote:
> > > Ok, you changed my mind and I added PYTHONDONTADDPATH0=1 env var. Example:
> >
> > Maybe the env var say what it is not adding rather than where it adds it. PYTHONDONTADDPWD=1
>
> But it is not "just" the PWD. In the case of shebangs, it's actually the script's directory. E.g. a script in /usr/bin/ normally has /usr/bin/ in sys.path (which is not desired, hence we (Fedora) would probably add the -P flag to default shebangs for programs in /usr/bin/).

Naming is so hard... maybe "don't add implicit dirs"?

PYTHONDONTADDIMPLICTDIRS=1

(I wish there were _ to show the word boundaries, but that ship has sailed)

Barry
Hi,

I updated my PR https://github.com/python/cpython/pull/31542 and I plan to merge it soon. It seems like most people need and like this feature.

About the feature name, nobody liked the "add_path0" name, which is misleading: "path0" is not easy to get, and the path is prepended, not added.

I renamed "add_path0" to "safe_path" (opposite meaning) and I renamed the "PYTHONDONTADDPATH0" env var to "PYTHONSAFEPATH": shorter, easy to write. In terms of usability, IMO "safe_path=1" is easier to understand than "add_path0=0".

For the exact meaning of this option, well, I wrote documentation. In the documentation, I replaced "don't add sys.path[0]" with "don't prepend an unsafe path to sys.path: (...)" with an explanation of which "unsafe path" is prepended.

Adding this -P command line option makes the Python command line even more complicated, with existing -I and -E options, the "._pth" file, etc. But well, not all users want the same thing ;-)

Victor
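The renamed option can be exercised from a parent process. This is a minimal sketch that assumes an interpreter with the feature (Python 3.11 or later, where sys.flags.safe_path reflects the setting); the helper name child_safe_path and its parameters are made up for the example.

```python
import os
import subprocess
import sys


def child_safe_path(extra_args=(), extra_env=None):
    """Start a child interpreter and return its sys.flags.safe_path as a bool.

    Assumes Python 3.11+; older interpreters reject -P and lack the flag.
    """
    # Start from the parent environment without PYTHONSAFEPATH, then apply overrides.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    env.update(extra_env or {})
    code = "import sys; print(int(sys.flags.safe_path))"
    proc = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip() == "1"


if __name__ == "__main__":
    if sys.version_info >= (3, 11):
        print("default        :", child_safe_path())
        print("-P             :", child_safe_path(extra_args=["-P"]))
        print("PYTHONSAFEPATH :", child_safe_path(extra_env={"PYTHONSAFEPATH": "1"}))
    else:
        print("This sketch needs Python 3.11 or later.")
```

Reading the flag in the child avoids guessing at the contents of sys.path, which also depends on PYTHONPATH and site configuration.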
participants (9)

- Antoine Pitrou
- Barry
- Barry Scott
- Brett Cannon
- Eryk Sun
- Miro Hrončok
- Paul Moore
- Steve Dower
- Victor Stinner