Fwd: Re: Use of "python" shebang an installation error?
(oops, had to resend, forgot to change the destination to <distutils-sig@python.org>)
On Mon, Jul 20, 2020 at 12:38 PM John Thorvald Wodder II <jwodder@gmail.com> wrote:
On 2020 Jul 20, at 15:25, David Mathog <dmathog@gmail.com> wrote:
Lately I have been working on a CentOS 8 machine, and it has "python2" and "python3", but no "python". Many packages install scripts with a shebang like:
#!/usr/bin/env python
and those do not work on this OS. Seems like rather a large missing dependency which goes by without triggering a fatal error.
How exactly are these packages getting installed? Last time I checked, both pip and setuptools automatically set the shebang in scripts (both console_script entry points and scripts declared with the "scripts" argument to `setup()`) to use the path of the running Python interpreter. Are these packages installed using your system package manager? If so, you should take the problem up with its maintainers.
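As a rough illustration of the behaviour described above, an installer can derive the shebang from the interpreter that is running it. This is a simplified sketch, not the actual code of pip or setuptools; the function name `shebang_for_running_interpreter` is hypothetical, and real installers also have to cope with cases such as very long paths or paths containing spaces.

```python
import sys


def shebang_for_running_interpreter():
    """Return a shebang line naming the interpreter running this code."""
    # sys.executable is the absolute path of the current interpreter,
    # for example the venv's bin/python when run from inside a venv.
    return "#!" + sys.executable + "\n"


if __name__ == "__main__":
    print(shebang_for_running_interpreter(), end="")
```

Run from inside a venv, this prints a shebang pointing at that venv's interpreter, which matches the bin/ shebangs shown later in the thread.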
Good point, I have been installing so many packages I get confused about which installer was used for which package. It turned out that many (but not all) of the files which contained
#!/usr/bin/env python
shebangs were installed using standard OS level tools (cmake, configure, make and the like). Example package, hisat2. I guess there isn't much choice for those but to scan the directories for python scripts and fix the shebangs.
Installs that are initially into venvs and used pip3 are still an issue. Example:
python3 -m venv johnnydep
cd johnnydep
grep -r '/usr/bin/env python$' .
#finds:
./lib/python3.6/site-packages/pip/_vendor/appdirs.py:#!/usr/bin/env python
./lib/python3.6/site-packages/pip/_vendor/chardet/cli/chardetect.py:#!/usr/bin/env python
./lib/python3.6/site-packages/pip/_vendor/requests/certs.py:#!/usr/bin/env python
./lib/python3.6/site-packages/pkg_resources/_vendor/appdirs.py:#!/usr/bin/env python
./lib/python3.6/site-packages/johnnydep/pipper.py:#!/usr/bin/env python
cd bin
ls -1 | grep python
lrwxrwxrwx. 1 modules modules 7 Jul 20 14:09 python -> python3
lrwxrwxrwx. 1 modules modules 16 Jul 20 14:09 python3 -> /usr/bin/python3
source activate
pip3 install johnnydep
head -1 johnnydep
#!/home/common/lib/python3.6/Envs/johnnydep/bin/python
#same for "tabulate" and all other shebangs in bin.
cd ..
grep -r '/usr/bin/env python$' .
#same as before
grep -r '/home/common/lib/python3.6/Envs/johnnydep/bin/python' .
#just the files in the bin directory.
It looks like none of the "#!/usr/bin/env python" shebangs within the venv are going to be used after the install, so perhaps those are harmless.
The shebangs like
#!/home/common/lib/python3.6/Envs/johnnydep/bin/python
are OK within the venv, but once they are "devirtualized" they become a problem. That was a known problem though - my devirtualizer code already patches all of the ones in the bin directory. I have not seen any elsewhere (yet) within the venv, but there is probably no rule that keeps them from appearing in "share" or elsewhere.
The "python" in use in the venv is just a symbolic link to "python3" which is itself a symbolic link to the actual program "/usr/bin/python3". It is constructed that way based on "python -m venv" which uses pieces which come from the CentOS 8 python3-libs-3.6.8-23.el8.x86_64 RPM. Is there some requirement that a venv have a "python"? Odd that RedHat (and so CentOS) provide a "python" there, but not in the OS itself.
Thanks,
David Mathog
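The symlink chain described above (python -> python3 -> /usr/bin/python3) can also be inspected from Python. The following is a minimal sketch; the helper names `in_virtual_environment` and `symlink_chain` are hypothetical and are not part of venv or of any tool mentioned in this thread.

```python
import os
import sys


def in_virtual_environment():
    """True when running from a venv, where sys.prefix differs from sys.base_prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def symlink_chain(path):
    """Follow symlinks one hop at a time, returning every path visited."""
    chain = [path]
    seen = {path}
    while os.path.islink(path):
        target = os.readlink(path)
        # A relative link target is relative to the link's own directory.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        if path in seen:  # guard against symlink loops
            break
        seen.add(path)
        chain.append(path)
    return chain


if __name__ == "__main__":
    print("in venv:", in_virtual_environment())
    print(" -> ".join(symlink_chain(sys.executable)))
```

Calling `symlink_chain` on a venv's bin/python should list each link in turn, ending at the real interpreter, assuming the venv was created with symlinks rather than copies.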
On 2020-07-21 22:50, David Mathog wrote:
(oops, had to resend, forgot to change the destination to <distutils-sig@python.org>)
On Mon, Jul 20, 2020 at 12:38 PM John Thorvald Wodder II <jwodder@gmail.com> wrote:
On 2020 Jul 20, at 15:25, David Mathog <dmathog@gmail.com> wrote:
Lately I have been working on a CentOS 8 machine, and it has "python2" and "python3", but no "python". Many packages install scripts with a shebang like:
#!/usr/bin/env python
and those do not work on this OS. Seems like rather a large missing dependency which goes by without triggering a fatal error.
How exactly are these packages getting installed? Last time I checked, both pip and setuptools automatically set the shebang in scripts (both console_script entry points and scripts declared with the "scripts" argument to `setup()`) to use the path of the running Python interpreter. Are these packages installed using your system package manager? If so, you should take the problem up with its maintainers.
Good point, I have been installing so many packages I get confused about which installer was used for which package. It turned out that many (but not all) of the files which contained
#!/usr/bin/env python
shebangs were installed using standard OS level tools (cmake, configure, make and the like). Example package, hisat2. I guess there isn't much choice for those but to scan the directories for python scripts and fix the shebangs.
Hi, For many of these tools, you can pass in something like PYTHON=/usr/bin/python3 at build or install time to select the interpreter.
Installs that are initially into venvs and used pip3 are still an issue. Example:
python3 -m venv johnnydep
cd johnnydep
grep -r '/usr/bin/env python$' .
#finds:
./lib/python3.6/site-packages/pip/_vendor/appdirs.py:#!/usr/bin/env python
./lib/python3.6/site-packages/pip/_vendor/chardet/cli/chardetect.py:#!/usr/bin/env python
./lib/python3.6/site-packages/pip/_vendor/requests/certs.py:#!/usr/bin/env python
./lib/python3.6/site-packages/pkg_resources/_vendor/appdirs.py:#!/usr/bin/env python
./lib/python3.6/site-packages/johnnydep/pipper.py:#!/usr/bin/env python
cd bin
ls -1 | grep python
lrwxrwxrwx. 1 modules modules 7 Jul 20 14:09 python -> python3
lrwxrwxrwx. 1 modules modules 16 Jul 20 14:09 python3 -> /usr/bin/python3
source activate
pip3 install johnnydep
head -1 johnnydep
#!/home/common/lib/python3.6/Envs/johnnydep/bin/python
#same for "tabulate" and all other shebangs in bin.
cd ..
grep -r '/usr/bin/env python$' .
#same as before
grep -r '/home/common/lib/python3.6/Envs/johnnydep/bin/python' .
#just the files in the bin directory.
It looks like none of the "#!/usr/bin/env python" shebangs within the venv are going to be used after the install, so perhaps those are harmless.
The shebangs like
#!/home/common/lib/python3.6/Envs/johnnydep/bin/python
are OK within the venv, but once they are "devirtualized" they become a problem. That was a known problem though - my devirtualizer code already patches all of the ones in the bin directory. I have not seen any elsewhere (yet) within the venv, but there is probably no rule that keeps them from appearing in "share" or elsewhere.
The "python" in use in the venv is just a symbolic link to "python3" which is itself a symbolic link to the actual program "/usr/bin/python3". It is constructed that way based on "python -m venv" which uses pieces which come from the CentOS 8 python3-libs-3.6.8-23.el8.x86_64 RPM. Is there some requirement that a venv have a "python"? Odd that RedHat (and so CentOS) provide a "python" there, but not in the OS itself.
Yes, a venv will always have a "python".
For the OS itself, that was a deliberate (but tough) decision. It's not easy to find out what "python" should mean in a particular script (as you found out), and the errors you get when using the wrong one are not always enlightening. For an OS released before end of life of Python 2, but supported well after it, we* "refuse to guess" and don't provide "python" by default.
In a venv, though, all is clear: you get the Python you created the venv with.
There's some more info here:
- across Linux distros: https://www.python.org/dev/peps/pep-0394/
- for EL8: https://developers.redhat.com/blog/2018/11/14/python-in-rhel-8/
From elsewhere in the thread:
The best I can do now is run
pdvctrl reshebang $TARGET_DIR
or
pdvctrl reshebang $ROOT_DIR...
and fix them up after the fact. (pdvctrl from python_devirtualizer here: https://sourceforge.net/projects/python-devirtualizer/). Even then it usually has to guess that "python" means "python3" and not "python2", and sometimes it guesses wrong. My recommendation is to try not to guess: either provide an option, or look at the Python in the venv you're devirtualizing.
Also note that a "reshebang" tool is included in Python source code[0], but it's not part of the standard library, so it doesn't always come with Python. (On Fedora/RHEL/CentOS it's in the python3-devel package as /usr/bin/pathfix.py). We* use it quite often, and found some edge cases involving flags in shebangs. You might want to look into it if you run into those for devirtualizer's reshebang.
* here, "we" == Red Hat's Python maintenance team
[0]: https://github.com/python/cpython/blob/master/Tools/scripts/pathfix.py
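In the spirit of "try not to guess", a reshebang step can take the target interpreter as an explicit argument and carry over any flags found in the old shebang. This is a minimal sketch and not the code of pdvctrl or pathfix.py; the function name `rewrite_shebang` is hypothetical, and it only understands the simple forms `#!interpreter [flags]` and `#!/usr/bin/env interpreter [flags]`.

```python
def rewrite_shebang(first_line, interpreter):
    """Return first_line with its Python interpreter replaced by `interpreter`.

    Lines that are not Python shebangs are returned unchanged.
    """
    if not first_line.startswith("#!"):
        return first_line
    parts = first_line[2:].split()
    if not parts:
        return first_line
    # Skip a leading "env" launcher, as in "#!/usr/bin/env python -u".
    if parts[0].rsplit("/", 1)[-1] == "env":
        parts = parts[1:]
    if not parts or not parts[0].rsplit("/", 1)[-1].startswith("python"):
        return first_line  # not a Python shebang; leave it alone
    flags = parts[1:]  # keep interpreter flags such as -u or -E
    return " ".join(["#!" + interpreter] + flags) + "\n"


if __name__ == "__main__":
    print(rewrite_shebang("#!/usr/bin/env python -u\n", "/usr/bin/python3"), end="")
```

Forms such as `#!/usr/bin/env -S python -u` are deliberately left untouched here, since the sketch cannot tell what they mean; a real tool would need an explicit policy for them.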
On Tue, 21 Jul 2020, at 21:50, David Mathog wrote:
./lib/python3.6/site-packages/pip/_vendor/appdirs.py:#!/usr/bin/env python
Python packaging tools like pip generally differentiate between *scripts*, which are installed to be run from the command line, and *modules*, which are imported from other Python code. Files under site-packages are modules. Any special handling for shebangs, execute bits, or Windows .exe wrappers is usually done only for scripts.
It's not unusual to see a shebang in modules - I think some editors put it in whenever you create a new Python file. But it doesn't usually do anything. If you want to run a module directly, the normal way now is with "python -m", which doesn't use the shebang.
Thomas
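To illustrate the "python -m" route mentioned above: the interpreter is named explicitly on the command line, so whatever shebang the module file carries is never consulted. This is a small sketch; the helper name `run_module` is hypothetical, and json.tool is used only because it is a standard-library module that can be run this way.

```python
import subprocess
import sys


def run_module(module, stdin_text):
    """Run `python -m module` with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-m", module],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # json.tool pretty-prints the JSON document it reads from stdin.
    print(run_module("json.tool", '{"shebang": "ignored"}'))
```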
On Wed, Jul 22, 2020 at 3:41 AM Thomas Kluyver <thomas@kluyver.me.uk> wrote:
On Tue, 21 Jul 2020, at 21:50, David Mathog wrote:
./lib/python3.6/site-packages/pip/_vendor/appdirs.py:#!/usr/bin/env python
Python packaging tools like pip generally differentiate between *scripts*, which are installed to be run from the command line, and *modules*, which are imported from other Python code. Files under site-packages are modules. Any special handling for shebangs, execute bits, or Windows .exe wrappers is usually done only for scripts.
It's not unusual to see a shebang in modules - I think some editors put it in whenever you create a new Python file. But it doesn't usually do anything. If you want to run a module directly, the normal way now is with "python -m", which doesn't use the shebang.
So in summary:
1. Invalid shebangs for modules in site-packages "should" be harmless - ignore them and hope for the best.
2. Shebangs for scripts "should" be correct. (They are while still inside a venv, but that shebang has to be corrected when the installation is moved to a normal environment, which my code is doing now.)
Scripts usually end up in a "bin" directory on linux. Is that part of the installation standard or could a package put them in an arbitrary path (other than under "site-packages") under the venv's root, for instance in a directory named "scripts"? Fixing the shebangs by processing only "bin" is easy, traversing the entire tree is a bit messier. It would be good not to have to do so if that will never find an invalid shebang.
Thanks,
David Mathog
Thanks,
Thomas
--
Distutils-SIG mailing list -- distutils-sig@python.org
To unsubscribe send an email to distutils-sig-leave@python.org
https://mail.python.org/mailman3/lists/distutils-sig.python.org/
Message archived at https://mail.python.org/archives/list/distutils-sig@python.org/message/HPTRB...
On 2020 Jul 22, at 14:30, David Mathog <dmathog@gmail.com> wrote:
Scripts usually end up in a "bin" directory on linux. Is that part of the installation standard or could a package put them in an arbitrary path (other than under "site-packages") under the venv's root, for instance in a directory named "scripts"?
Pip always puts commands — both those declared with console_scripts entry points and those declared as "scripts" — in a bin/ directory (or whatever the equivalent is on Windows). The individual packages get no say in this. -- John Wodder
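The scripts directory for a given interpreter can be looked up with the standard library's sysconfig module, which makes it straightforward to limit a shebang check to that one directory. The sketch below assumes the default installation scheme; the helper names `scripts_directory` and `shebangs_in` are hypothetical.

```python
import os
import sysconfig


def scripts_directory():
    """Directory where this interpreter's installation scheme puts scripts."""
    return sysconfig.get_path("scripts")


def shebangs_in(directory):
    """Map each regular file in `directory` to its shebang line, if it has one."""
    found = {}
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as handle:
            first = handle.readline()
        if first.startswith(b"#!"):
            found[entry] = first.decode("utf-8", "replace").rstrip("\r\n")
    return found


if __name__ == "__main__":
    directory = scripts_directory()
    print(directory)
    for name, shebang in shebangs_in(directory).items():
        print(name, shebang)
```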
On Wed, 22 Jul 2020 at 19:31, David Mathog <dmathog@gmail.com> wrote:
but that shebang has to be corrected when the installation is moved to a normal environment, which my code is doing now.)
Moving files that are installed by Python packaging tools isn't supported. It might work, and you can probably make it work with some effort, but it's very much a case of "don't do it unless you know what you're doing". Correcting shebang lines is definitely something you will need to do. Paul
On Wed, Jul 22, 2020 at 1:27 PM Paul Moore <p.f.moore@gmail.com> wrote:
On Wed, 22 Jul 2020 at 19:31, David Mathog <dmathog@gmail.com> wrote:
but that shebang has to be corrected when the installation is moved to a normal environment, which my code is doing now.)
Moving files that are installed by Python packaging tools isn't supported. It might work, and you can probably make it work with some effort, but it's very much a case of "don't do it unless you know what you're doing". Correcting shebang lines is definitely something you will need to do.
I understand that moving files is iffy. However, given that I want only 1 copy of each installed python package on the system and I need to be able to install different versions of the same package (to resolve module version number conflicts between packages), moving the files around and replacing most copies with links to the single copy seemed like the only way to go.
Here:
https://www.python.org/dev/peps/pep-0394/#recommendation
It says:
When packaging third party Python scripts, distributors are encouraged to change less specific shebangs to more specific ones. This ensures software is used with the latest version of Python available, and it can remove a dependency on Python 2. The details on what specifics to set are left to the distributors; though. Example specifics could include:
Changing python shebangs to python3 when Python 3.x is supported.
Changing python shebangs to python2 when Python 3.x is not yet supported.
Changing python3 shebangs to python3.8 if the software is built with Python 3.8.
and then immediately after it says:
When a virtual environment (created by the PEP 405 venv package or a similar tool such as virtualenv or conda) is active, the python command should refer to the virtual environment's interpreter and should always be available. The python3 or python2 command (according to the environment's interpreter version) should also be available.
Which seems to be exactly the opposite of the preceding stanza. Ie,
"always be as specific as possible"
then
"be general, and also provide specific"
Personally I think the generic use of "python" both in shebangs and when invoking scripts as "python script" should be deprecated, with warnings from the installers to force developers to strip it out. It only works now by chance. Sure, there is a high probability it will work, but if one is on the wrong system it fails. If python4 (whenever it arrives) is not fully backwards compatible with python3 the generic use of "python" is going to cause untold grief. Whereas in that scenario all the code which uses "python3" should continue to function normally.
Regards,
David Mathog
On 23/7, 2020, at 06:51, David Mathog <dmathog@gmail.com> wrote:
On Wed, Jul 22, 2020 at 1:27 PM Paul Moore <p.f.moore@gmail.com> wrote:
On Wed, 22 Jul 2020 at 19:31, David Mathog <dmathog@gmail.com> wrote:
but that shebang has to be corrected when the installation is moved to a normal environment, which my code is doing now.)
Moving files that are installed by Python packaging tools isn't supported. It might work, and you can probably make it work with some effort, but it's very much a case of "don't do it unless you know what you're doing". Correcting shebang lines is definitely something you will need to do.
I understand that moving files is iffy. However, given that I want only 1 copy of each installed python package on the system and I need to be able to install different versions of the same package (to resolve module version number conflicts between packages), moving the files around and replacing most copies with links to the single copy seemed like the only way to go.
Here:
https://www.python.org/dev/peps/pep-0394/#recommendation
It says:
When packaging third party Python scripts, distributors are encouraged to change less specific shebangs to more specific ones. This ensures software is used with the latest version of Python available, and it can remove a dependency on Python 2. The details on what specifics to set are left to the distributors; though. Example specifics could include:
Changing python shebangs to python3 when Python 3.x is supported. Changing python shebangs to python2 when Python 3.x is not yet supported. Changing python3 shebangs to python3.8 if the software is built with Python 3.8.
and then immediately after it says:
When a virtual environment (created by the PEP 405 venv package or a similar tool such as virtualenv or conda) is active, the python command should refer to the virtual environment's interpreter and should always be available. The python3 or python2 command (according to the environment's interpreter version) should also be available.
Which seems to be exactly the opposite of the preceding stanza. Ie,
"always be as specific as possible"
then
"be general, and also provide specific"
The first paragraph is saying it is recommended to rewrite shebangs such as #!python3 to the actual Python interpreter the script is installed against, e.g. the interpreter in a virtual environment. The second paragraph is describing “which” command the installer should choose to refer to an interpreter. For CPython 3.8, for example, up to three commands may be available in a given virtual environment:
{prefix}/bin/python
{prefix}/bin/python3
{prefix}/bin/python3.8
and the installer should choose the most generic one, i.e. {prefix}/bin/python, because this avoids dealing with interpreter-specific naming conventions, e.g. the Python version (3 or 3.8), implementation (pypy or jython).
Personally I think the generic use of "python" both in shebangs and when invoking scripts as "python script" should be deprecated, with warnings from the installers to force developers to strip it out. It only works now by chance. Sure, there is a high probability it will work, but if one is on the wrong system it fails. If python4 (whenever it arrives) is not fully backwards compatible with python3 the generic use of "python" is going to cause untold grief. Whereas in that scenario all the code which uses "python3" should continue to function normally.
You are assuming “python3” or “python4” is a reliable command to refer to, which is an understandable misconception coming from a Linux background, but a misconception nonetheless. The wheel specification chose “python” for one reason: it is the only name that’s guaranteed to exist across operating systems and interpreter implementations.
Also, by your logic (“python” would break when Python 4 comes out), wheels really should break when a minor Python version is released, since a script written on Python 3.4 does not always work on 3.6 (as an example). So the installer really should use “python3.4” instead, should it not? But it does not, and nothing breaks right now, because wheels have other means to declare compatibility (wheel tags). An installer should be able to put a script under a given interpreter version only if the wheel tags allow it. If the shebang needs to care about compatibility, something is already going very wrong.
TP
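For readers unfamiliar with wheel tags: they are encoded in the wheel's file name, so compatibility is declared before any script or shebang is written. The sketch below splits a wheel file name according to the naming convention in the wheel specification (PEP 427). It is only an illustration; the function name `wheel_tags` and the file name used are made up, and real installers rely on a much fuller compatibility check (for example the one in the `packaging` project).

```python
def wheel_tags(filename):
    """Split a wheel file name into its python, abi and platform tag lists.

    Expected shape (PEP 427):
    {distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of fields in %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    # Each field may hold several dot-separated tags, e.g. "py2.py3".
    return {
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platform": platform_tag.split("."),
    }


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    print(wheel_tags("example_pkg-1.0-py2.py3-none-any.whl"))
```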
Regards,
David Mathog -- Distutils-SIG mailing list -- distutils-sig@python.org To unsubscribe send an email to distutils-sig-leave@python.org https://mail.python.org/mailman3/lists/distutils-sig.python.org/ Message archived at https://mail.python.org/archives/list/distutils-sig@python.org/message/HAZUE...
On Wed, Jul 22, 2020 at 4:34 PM Tzu-ping Chung <uranusjr@gmail.com> wrote:
If the shebang needs to care about compatibility, something is already going very wrong.
We agree there, and it has. That python3 was not completely backwards compatible with python2 meant that it broke a lot of code. The EOL of python2 and the apparent intent of the major distros to drop it means that unmaintained python code will become unusable code. Neither of these outcomes is common for a major computer language. For instance, old K&R style C or F77 code from the 1990s will still compile with modern compilers (albeit with a blizzard of warning messages and possibly 32 bit to 64 bit issues). This matters quite a bit in scientific circles because published computational work becomes unreproducible if the tools break even when the input data is still available.
When these issues are encountered I notify the program's author, assuming that there is still somebody maintaining the code. The most recent instance of this was "lastz"
http://www.bx.psu.edu/~rsharris/lastz/
which in addition to the lastz program itself contains a bunch of python scripts. The shebangs used "python", they were Python2 code, and so they didn't work. The author in this case agreed that was a problem and is currently working on upgrading those scripts.
I think the intent of the first quoted section was to say that if a script used a feature in Python 3.N that was absent in 3.(N-1) and below then 3.N should be used. That is perfectly reasonable. What isn't reasonable is the assumption that using just "python" is not a problem in a language which demonstrably does not maintain backwards compatibility between major versions (see above).
Perhaps this circle could be squared if python had a "-r" (single letter for standard) command line parameter, then this:
#!/usr/bin/env python -r N.M
could conceivably be handled gracefully by the single "python", even if only to throw an error and state that version "N.M" is not supported. That would be far better than responding to version incompatibility with a slew of syntax errors, which is what happens now. It would handle both "2.7 is too old" and "3.9 required but this is a 3.8 installation".
Regards,
David Mathog
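No "-r" option exists, but a script can approximate the proposed behaviour today by checking sys.version_info before doing anything else. The sketch below is one way to do that; the function name `require_python` is hypothetical. It only helps when the file itself still parses on the older interpreter, because a SyntaxError is raised at compile time, before the check can run. For that reason the guard avoids f-strings and annotations.

```python
import sys


def require_python(minimum, current=None):
    """Exit with a clear message if the interpreter is older than `minimum`.

    `minimum` is a (major, minor) tuple. `current` defaults to the running
    interpreter's version and exists mainly so the check can be tested.
    """
    if current is None:
        current = tuple(sys.version_info[:2])
    if current < minimum:
        raise SystemExit(
            "This script requires Python %d.%d or newer; running %d.%d"
            % (minimum + current)
        )


if __name__ == "__main__":
    require_python((3, 6))
    print("interpreter is new enough")
```

This covers the "2.7 is too old" and "3.9 required but this is a 3.8 installation" cases for scripts whose syntax the older interpreter can still read.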
participants (6)
- David Mathog
- John Thorvald Wodder II
- Paul Moore
- Petr Viktorin
- Thomas Kluyver
- Tzu-ping Chung