Linux-sig
July 2021
Draft PEP: Graceful cooperation between external and Python package managers
by Geoffrey Thomas 25 Aug '21
Hi Linux-SIG,
At the recent PyCon US "Linux in Distros" sprint, we wrote a draft PEP
about making external package managers like apt/dnf/etc. and
Python-specific package managers like pip play more nicely together. This
includes the "sudo pip install" problem, but it's a little more general
than that.
The short version is it has two recommendations:
1. When a distro indicates it's managing a Python installation, tools like
pip should only install into a virtualenv, by default (with a way to
override it), and show an error message that the distro can customize.
2. Distros should have two site-packages directories, one for
distro-packaged files and one for local-sysadmin-installed files (e.g.,
/usr/lib/python3.x/site-packages vs.
/usr/local/lib/python3.x/site-packages), and tools like pip should only
create, delete, or modify files in the latter directory.
The draft PEP is on my GitHub at https://github.com/geofft/peps branch
"pip-only-in-virtualenv" in the file pep-9999.rst.
I've uploaded a rendered copy to https://kerberos.club/tmp/pep-9999.html
and included the text of the PEP below.
Since Linux-SIG has a bunch of packagers and users of distro Python, we
figured we'd run this by all of you to make sure this idea is broadly
reasonable before formally bringing it to discuss.python.org.
Feedback is welcome in any form - either replies here, or comments on
GitHub (there's a pull request at https://github.com/geofft/peps/pull/3 if
you prefer that interface). At this stage I'm mostly interested in
feedback on the general idea and the high-level changes to tool semantics
- we've tried to handle a good number of use cases in hopefully-reasonable
ways but we definitely might have missed something.
Thanks!
PEP: 9999
Title: Graceful cooperation between external and Python package managers
Author: Geoffrey Thomas <geofft(a)ldpreload.com>,
Matthias Klose <doko(a)ubuntu.com>,
Filipe Laíns <lains(a)riseup.net>,
Donald Stufft <donald(a)python.org>,
Tzu-Ping Chung <uranusjr(a)gmail.com>,
Stefano Rivera <stefanor(a)debian.org>,
Elana Hashman <ehashman(a)debian.org>,
Pradyun Gedam <mail(a)pradyunsg.me>
Discussions-To: TODO discourse
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 2021-05-18
Post-History:
Abstract
========
A long-standing practical problem for Python users has been
conflicts between OS package managers and Python-specific
package management tools like pip. These conflicts include
both Python-level API incompatibilities and conflicts over
file ownership.
Historically, Python-specific package management tools have
defaulted to installing packages into an implicit global
context. With the standardization and popularity of virtual
environments, a better solution for most (but not all) use
cases is to use Python-specific package management tools
only within a virtual environment.
This PEP proposes a mechanism for a Python installation to
communicate to tools like pip that its global package
installation context is managed by some means external to
Python, such as an OS package manager. It specifies that
Python-specific package management tools should, by default,
neither install packages into nor remove packages from the
interpreter's global context, and should instead guide the
end user towards using a virtual environment.
It also standardizes an interpretation of the ``sysconfig``
schemes so that, if a Python-specific package manager is
about to install a package in an interpreter-wide context,
it can do so in a manner that will avoid conflicting with
the external package manager and reduce the risk of
breaking software shipped by the external package manager.
Terminology
===========
A few terms used in this PEP have multiple meanings in the
contexts that it spans. For clarity, this PEP uses the
following terms in specific ways:
distribution
A collection of various sorts of software, ideally
designed to work properly together, including (in
contexts relevant to this document) the Python
interpreter itself, software written in Python, and
software written in other languages. That is, this is
"distribution" in the sense of "Linux distro" or
"Berkeley Software Distribution."
A distribution can be an operating system (OS) of its
own, such as Debian, Fedora, or FreeBSD. It can also be
an overlay distribution that installs on top of an
existing OS, such as Homebrew or MacPorts.
To avoid confusion, this document does not use
"distribution" in the sense of a source or binary
distribution package of a single piece of Python
language software, that is, in the sense of
``setuptools.dist.Distribution`` or "sdist".
The provider of a distribution - the team or company
that collects and publishes the software and makes any
needed modifications - is its **distributor**.
package
A unit of software that can be installed and used within
Python. That is, this refers to what Python-specific
packaging tools tend to call a "distribution package" or
simply a "distribution"; the colloquial abbreviation
"package" is used in the sense of the Python Package
Index.
This document does not use "package" in the sense of an
importable name that contains Python modules, though in
many cases, a distribution package consists of a single
importable package of the same name.
This document generally does not use the term "package"
to refer to units of installation by a distribution's
package manager (such as ``.deb`` or ``.rpm`` files).
When needed, it uses phrasing such as "a distribution's
package." (Again, in many cases, a Python package is
shipped inside a distribution's package named something
like ``python-`` plus the Python package name.)
Python-specific package manager
A tool for installing, upgrading, and/or removing Python
packages in a manner that conforms to Python packaging
standards (such as PEP 376 [#PEP-376]_ and PEP 427
[#PEP-427]_). The most popular Python-specific package
manager is pip [#pip]_; other examples include the old
Easy Install command [#easy-install]_ as well as direct
usage of a ``setup.py`` command.
(Conda [#conda]_ is a bit of a special case, as the
``conda`` command can install much more than just Python
packages, making it more like a distribution package
manager in some senses. Since the ``conda`` command
generally only operates on Conda-created environments,
most of the concerns in this document do not apply to
``conda`` when acting as a Python-specific package
manager.)
distribution package manager
A tool for installing, upgrading, and/or removing a
distribution's packages in an installed instance of that
distribution, which is capable of installing Python
packages as well as non-Python packages, and therefore
generally has its own database of installed software
unrelated to PEP 376 [#PEP-376]_. Examples include ``apt``,
``dpkg``, ``dnf``, ``rpm``, ``pacman``, and ``brew``.
The salient feature is that if a package was installed
by a distribution package manager, removing or upgrading
it in a way that would satisfy a Python-specific package
manager will generally leave a distribution package
manager in an inconsistent state.
This document also uses phrases like "external package
manager" or "system's package manager" to refer to a
distribution package manager in certain contexts.
shadow
To shadow an installed Python package is to cause some
other package to be preferred for imports without
removing any files from the shadowed package. This
requires multiple entries on ``sys.path``: if package A
2.0 installs module ``a.py`` in one ``sys.path`` entry,
and package A 1.0 installs module ``a.py`` in a later
``sys.path`` entry, then ``import a`` returns the module
from the former, and we say that A 2.0 shadows A 1.0.
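As a minimal sketch of shadowing (not part of the PEP text), the following creates two temporary directories that stand in for a local and a distribution ``site-packages`` directory. The directory names, the module name ``shadow_demo_a``, and the helper ``demo_shadowing`` are all hypothetical and chosen only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing():
    """Import a module that exists in two sys.path entries.

    Returns the version that was imported and whether the shadowed
    copy is still present on disk.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for the two site-packages directories.
        local_dir = Path(tmp, "local-site-packages")
        distro_dir = Path(tmp, "distro-site-packages")
        local_dir.mkdir()
        distro_dir.mkdir()
        (local_dir / "shadow_demo_a.py").write_text('VERSION = "2.0"\n')
        (distro_dir / "shadow_demo_a.py").write_text('VERSION = "1.0"\n')

        saved_path = list(sys.path)
        # The local directory comes first, so it wins the import.
        sys.path[:0] = [str(local_dir), str(distro_dir)]
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("shadow_demo_a")
            shadowed_still_exists = (distro_dir / "shadow_demo_a.py").exists()
            return module.VERSION, shadowed_still_exists
        finally:
            # Restore interpreter state so the demo has no side effects.
            sys.path[:] = saved_path
            sys.modules.pop("shadow_demo_a", None)
```

Calling ``demo_shadowing()`` imports the 2.0 copy while the 1.0 file remains untouched on disk, which is the situation this PEP calls shadowing.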
Motivation
==========
Thanks to Python's immense popularity, software
distributions (by which we mean Linux and other OS
distributions as well as overlay distributions like Homebrew
and MacPorts) generally ship Python for two purposes: as a
software package to be used in its own right by end users,
and as a language dependency for other software in the
distribution.
For example, Fedora and Debian (and their downstream
distributions, as well as many others) ship a
``/usr/bin/python3`` binary which provides the ``python3``
command available to end users as well as the
``#!/usr/bin/python3`` shebang for Python-language software
included in the distribution. Because there are no official
binary releases of Python for Linux/UNIX, almost all Python
end users on these OSes use the Python interpreter built and
shipped with their distribution.
The ``python3`` executable available to the users of the distribution
and the ``python3`` executable available as a dependency for other
software in the distribution are typically the same binary. This means that if an
end user installs a Python package using a tool like ``pip``
outside the context of a virtual environment, that
package is visible to Python-language software shipped by
the distribution. If the newly-installed package (or one of its
dependencies) is a newer, backwards-incompatible version of
a package that was installed through the distribution, it
may break software shipped by the distribution.
This may pose a critical problem for the integrity of distributions,
which often have package-management tools that are
themselves written in Python. For example, it's possible to
unintentionally break Fedora's ``dnf`` command with a ``pip
install`` command, making it hard to recover.
This applies both to system-wide installs (``sudo pip
install``) and to user home directory installs (``pip
install --user``), since packages in either location show up
on the ``sys.path`` of ``/usr/bin/python3``.
There is a worse problem with system-wide installs: if you
attempt to recover from this situation with ``sudo pip
uninstall``, you may end up removing packages that are
shipped by the system's package manager. In fact, this can
even happen if you simply upgrade a package - pip will try
to remove the old version of the package, as shipped by the
OS. At this point it may not be possible to recover the
system to a consistent state using just the software
remaining on the system.
Over the years, a consensus has emerged that the
best way to install Python libraries or applications (when
not using a distribution's package) is to use a virtual
environment. This approach was popularized by the PyPA
`virtualenv`_ project, and a simple version of that approach
is now available in the Python standard library as ``venv``.
Installing a Python package into a virtualenv prevents it
from being visible to the unqualified ``/usr/bin/python3``
interpreter and prevents breaking system software.
.. _virtualenv: https://virtualenv.pypa.io/en/latest/
In some cases, however, it's useful and intentional to
install a Python package from outside of the distribution
that influences the behavior of distribution-shipped
commands. This is common in the case of software like Sphinx
or Ansible which have a mechanism for writing
Python-language extensions. A user may want to use their
distribution's version of the base software (for reasons of
paid support or security updates) but install a small
extension from PyPI, and they'd want that extension to be
importable by the software in their base system.
While this continues to carry the risk of installing a newer
version of a dependency than the operating system expects or
otherwise negatively affecting the behavior of an
application, it does not need to carry the risk of removing
files from the operating system. A tool like pip should be
able to install packages in some directory on the default
``sys.path``, if specifically requested, without deleting
files owned by the system's package manager.
Therefore, this PEP proposes two things.
First, it proposes **a way for distributors of a Python
interpreter to mark that interpreter as having its packages
managed by means external to Python**, such that
Python-specific tools like pip should not change the
installed packages in the interpreter's global ``sys.path``
in any way (add, upgrade/downgrade, or remove) unless
specifically overridden. It also provides a means for the
distributor to indicate how to use a virtual environment as
an alternative.
This is an opt-in mechanism: by default, the Python
interpreter compiled from upstream sources will not be so
marked, and so running ``pip install`` with a self-compiled
interpreter, or with a distribution that has not explicitly
marked its interpreter, will work as it always has worked.
Second, it sets the rule that when installing packages to an
interpreter's global context (either to an unmarked
interpreter, or if overriding the marking),
**Python-specific package managers should modify or delete
files only within the directories of the sysconfig
scheme in which they would create files**. This permits a
distributor of a Python interpreter to set up two
directories, one for its own managed packages, and one for
unmanaged packages installed by the end user, and ensure
that installing unmanaged packages will not delete (or
overwrite) files owned by the external package manager.
Rationale
=========
As described in detail in the next section, the first
behavior change involves creating a marker file named
``EXTERNALLY-MANAGED``, whose presence indicates that
non-virtual-environment package installations are managed by
some means external to Python, such as a distribution's
package manager. This file is specified
to live in the ``stdlib`` directory in the default
``sysconfig`` scheme, which marks the interpreter /
installation as a whole, not a particular location on
``sys.path``. The reason for this is that, as identified
above, there are two related problems that risk breaking an
externally-managed Python: you can install an incompatible
new version of a package system-wide (e.g., with ``sudo pip
install``), and you can install one in your user account
alone, but in a location that is on the standard Python
command's ``sys.path``
(e.g., with ``pip install --user``). If the marker file were
in the system-wide ``site-packages`` directory, it would not
clearly apply to the second case. The `Alternatives`_
section has further discussion of possible locations.
The second behavior change takes advantage of the existing
``sysconfig`` setup in distributions that have already
encountered this class of problem, and specifically
addresses the problem of a Python-specific package manager
deleting or overwriting files that are owned by an external
package manager.
Use cases
---------
The changed behavior in this PEP is intended to "do the
right thing" for as many use cases as possible. In this
section, we consider the changes specified by this PEP for
several representative use cases / contexts. Specifically, we
ask about the two behaviors that could be changed by this
PEP:
1. Will a Python-specific installer tool like ``pip
install`` permit installations by default, after
implementation of this PEP?
2. If you do run such a tool, should it be willing to delete
packages shipped by the external (non-Python-specific)
package manager for that context, such as a distribution
package manager?
(For simplicity, this section discusses pip as the
Python-specific installer tool, though the analysis should
apply equally to any other Python-specific package
management tool.)
This table summarizes the use cases discussed in detail
below:
==== ================================= =========================== ===================================================
Case Description                       ``pip install`` permitted   Deleting externally-installed packages permitted
==== ================================= =========================== ===================================================
1    Unpatched CPython                 Currently yes; stays yes    Currently yes; stays yes
2    Distribution ``/usr/bin/python3`` Currently yes; becomes no   Currently yes (except on Debian); becomes no
                                       (assuming the distribution
                                       adds a marker file)
3    Distribution Python in venv       Currently yes; stays yes    There are no externally-installed packages
4    Distribution Python in venv       Currently yes; stays yes    Currently no; stays no
     with ``--system-site-packages``
5    Distribution Python in Docker     Currently yes; stays yes    Currently yes; becomes no
                                       (assuming the Docker image
                                       removes the marker file)
6    Conda environment                 Currently yes; stays yes    Currently yes; stays yes
7    Dev-facing distribution           Currently yes; becomes no   Currently often yes; becomes no
                                       (assuming they add a        (assuming they configure ``sysconfig`` as needed)
                                       marker file)
8    Distribution building packages    Currently yes; can stay yes Currently yes; becomes no
9    ``PYTHONHOME`` copied from        Currently yes; becomes no   Currently yes; becomes no
     a distribution Python stdlib
10   ``PYTHONHOME`` copied from        Currently yes; stays yes    Currently yes; stays yes
     upstream Python stdlib
==== ================================= =========================== ===================================================
In more detail, the use cases above are:
1. A standard unpatched CPython, without any special
configuration of or patches to ``sysconfig`` and without
a marker file. This PEP does not change its behavior.
Such a CPython should (regardless of this PEP) not be
installed in a way that overlaps any
distribution-installed Python on the same system. For
instance, on an OS that ships Python in ``/usr/bin``, you
should not install a custom CPython built with
``./configure --prefix=/usr``, or it will overwrite some
files from the distribution and the distribution will
eventually overwrite some files from your installation.
Instead, your installation should be in a separate
directory (perhaps ``/usr/local``, ``/opt``, or your home
directory).
Therefore, we can assume that such a CPython has its own
``stdlib`` directory and its own ``sysconfig`` schemes
that do not overlap any distribution-installed Python. So
any OS-installed packages are not visible or relevant
here.
If there is a concept of "externally-installed" packages
in this case, it's something outside the OS and generally
managed by whoever built and installed this CPython.
Because the installer chose not to add a marker file or
modify ``sysconfig`` schemes, they're choosing the
current behavior, and ``pip install`` can remove any
packages available in this CPython.
2. A distribution's ``/usr/bin/python3``, either when
running ``pip install`` as root or ``pip install
--user``, following our `Recommendations for
distributions`_.
These recommendations include shipping a marker file in
the ``stdlib`` directory, to prevent ``pip install`` by
default, and placing distribution-shipped packages in a
location other than the default ``sysconfig`` scheme, so
that ``pip`` as root does not write to that location.
Many distributions (including Debian, Fedora, and their
derivatives) are already doing the latter.
On Debian and derivatives, ``pip install`` does not
currently delete distribution-installed packages, because
Debian carries a `patch to pip to prevent this`__. So,
for those distributions, this PEP is not a behavior
change; it simply standardizes that behavior in a way
that is no longer Debian-specific and can be included
into upstream pip.
.. __: https://sources.debian.org/src/python-pip/20.3.4-2/debian/patches/hands-off…
(We have seen user reports of externally-installed
packages being deleted on Debian or a derivative. We
suspect this is because the user has previously run
``sudo pip install --upgrade pip`` and therefore now has
a version of ``/usr/bin/pip`` without the Debian patch;
standardizing this behavior in upstream package
installers would address this problem.)
3. A distribution Python when used inside a virtual
environment (either from ``venv`` or ``virtualenv``).
Inside a virtual environment, all packages are owned by
that environment. Even when ``pip``, ``setuptools``,
etc. are installed into the environment, they are and
should be managed by tools specific to that environment;
they are not system-managed.
4. A distribution Python when used inside a virtual
environment with ``--system-site-packages``. This is like
the previous case, but worth calling out explicitly,
because anything on the global ``sys.path`` is visible.
Currently, the answer to "Will ``pip`` delete
externally-installed packages?" is no, because pip has a
special case for running in a virtual environment and
attempting to delete packages outside it. After this PEP,
the answer remains no, but the reasoning becomes more
general: system site packages will be outside any of the
``sysconfig`` schemes used for package management in the
environment.
5. A distribution Python when used in a single-application
container image (e.g., a Docker container). In this use
case, the risk of breaking system software is lower,
since generally only a single application runs in the
container, and the impact is lower, since you can rebuild
the container and you don't have to struggle to recover a
running machine. There are also a large number of
existing Dockerfiles with an unqualified ``RUN pip
install ...`` statement, etc., and it would be good not
to break those. So, builders of base container images
may want to ensure that the marker file is not present,
even if the underlying OS ships one by default.
There is a small behavior change: currently, ``pip`` run
as root will delete externally-installed packages, but
after this PEP it will not. We don't propose a way to
override this. However, since the base image is generally
minimal, there shouldn't be much of a use case for simply
uninstalling packages (especially without using the
distribution's own tools). The common case is when pip
wants to upgrade a package, which previously would have
deleted the old version (except on Debian). After this
change, the old version will still be on disk, but pip
will still *shadow* externally-installed packages, and we
believe this to be sufficient for this not to be a
breaking change in practice - a Python ``import``
statement will still get you the newly-installed package.
If it becomes necessary to have a way to do this, we
suggest that the distribution should document a way for
the installer tool to access the ``sysconfig`` scheme
used by the distribution itself. See the
`Recommendations for distributions`_ section for more
discussion.
It is the view of the authors of this PEP that it's still
a good idea to use virtual environments with
distribution-installed Python interpreters, even in
single-application container images. Even though they run
a single *application*, that application may run commands
from the OS that are implemented in Python, and if you've
installed or upgraded the distribution-shipped Python
packages using Python-specific tools, those commands may
break.
6. Conda specifically supports the use of non-``conda``
tools like pip to install software not available in the
Conda repositories. In this context, Conda acts as the
external package manager / distribution and pip as the
Python-specific one.
In some sense, this is similar to the first case, since
Conda provides its own installation of the Python
interpreter.
We don't believe this PEP requires any changes to Conda,
and versions of pip that have implemented the changes in
this PEP will continue to behave as they currently do
inside Conda environments. (That said, it may be worth
considering whether to use separate ``sysconfig`` schemes
for pip-installed and Conda-installed software, for the
same reasons it's a good idea for other distributions.)
7. By a "developer-facing distribution," we mean a specific
type of distribution where direct users of Python or
other languages in the distribution are expected or
encouraged to make changes to the distribution itself if
they wish to add libraries. Common examples include
private "monorepos" at software development companies,
where a single repository builds both third-party and
in-house software, and the direct users of the
distribution's Python interpreter are generally software
developers writing said in-house software. User-level
package managers like Nixpkgs_ may also count,
because they encourage users of Nix who are Python
developers to `package their software for Nix`__.
In these cases, the distribution may want to respond to
an attempted ``pip install`` with guidance encouraging
use of the distribution's own facilities for adding new
packages, along with a link to documentation.
If the distribution supports/encourages creating a
virtual environment from the distribution's Python
interpreter, there may also be custom instructions for
how to properly set up a virtual environment (as for
example Nixpkgs does).
.. _Nixpkgs: https://github.com/NixOS/nixpkgs
.. __: https://nixos.wiki/wiki/Python
8. When building distribution Python packages for a
distribution Python (case 2), it may be useful to have
``pip install`` be usable as part of the distribution's
package build process. (Consider, for instance, building a
``python-xyz`` RPM by using ``pip install .`` inside an
sdist / source tarball for ``xyz``.) The distribution may
also want to use a more targeted but still
Python-specific installation tool such as installer_.
.. _installer: https://installer.rtfd.io/
For this case, the build process will need to find some
way to suppress the marker file to allow ``pip install``
to work, and will probably need to point the
Python-specific tool at the distribution's ``sysconfig``
scheme instead of the shipped default. See the
`Recommendations for distributions`_ section for more
discussion on how to implement this.
As a result of this PEP, pip will no longer be able to
remove packages already on the system. However, this
behavior change is fine because a package build process
should not (and generally cannot) include instructions to
delete some other files on the system; it can only
package up its own files.
9. A distribution Python used with ``PYTHONHOME`` to set up
an alternative Python environment (as opposed to a
virtual environment), where ``PYTHONHOME`` is set to some
directory copied directly from the distribution Python
(e.g., ``cp -a /usr/lib/python3.x pyhome/lib``).
Assuming there are no modifications, then the behavior is
just like the underlying distribution Python (case 2).
So there are behavior changes - you can no longer ``pip
install`` by default, and if you override it, it will no
longer delete externally-installed packages (i.e.,
Python packages that were copied from the OS and live in
the OS-managed ``sys.path`` entry).
This behavior change seems to be defensible, in that if
your ``PYTHONHOME`` is a straight copy of the
distribution's Python, it should behave like the
distribution's Python.
10. A distribution Python (or any Python interpreter) used
with a ``PYTHONHOME`` taken from a compatible unmodified
upstream Python.
Because the behavior changes in this PEP are keyed off
of files in the standard library (the marker file in
``stdlib`` and the behavior of the ``sysconfig``
module), the behavior is just like an unmodified
upstream CPython (case 1).
Specification
=============
Marking an interpreter as using an external package manager
-----------------------------------------------------------
Before a Python-specific package installer (that is, a tool such as
pip - not an external tool such as apt) installs a package
into a certain Python context, it should make the following
checks by default:
1. Is it running outside of a virtual environment? It can
determine this by whether ``sys.prefix ==
sys.base_prefix`` (but see `Backwards Compatibility`_).
2. Is there an ``EXTERNALLY-MANAGED`` file in the directory
identified by ``sysconfig.get_path("stdlib",
sysconfig.get_default_scheme())``?
If both of these conditions are true, the installer should
exit with an error message indicating that package
installation into this Python interpreter's directory is
disabled outside of a virtual environment.
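The two checks above can be sketched roughly as follows. This is an illustration under stated assumptions, not pip's implementation: the function names and the ``override`` parameter (standing in for a flag such as ``--break-system-packages``) are hypothetical. ``sysconfig.get_default_scheme()`` is only public from Python 3.10 onwards, so the sketch falls back to ``sysconfig.get_path("stdlib")``, which uses the default scheme on older versions.

```python
import os
import sys
import sysconfig

MARKER_NAME = "EXTERNALLY-MANAGED"


def running_in_virtualenv():
    """Check 1: a virtual environment has sys.prefix != sys.base_prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def default_stdlib_dir():
    """Return the stdlib directory of the default sysconfig scheme."""
    if hasattr(sysconfig, "get_default_scheme"):  # public since Python 3.10
        return sysconfig.get_path("stdlib", sysconfig.get_default_scheme())
    # Older versions: get_path() without a scheme uses the default scheme.
    return sysconfig.get_path("stdlib")


def should_refuse_install(stdlib_dir=None, in_virtualenv=None, override=False):
    """Return True if an installer should stop with an error.

    ``override`` is a hypothetical stand-in for an explicit user
    opt-out. The other parameters exist so the logic can be exercised
    without touching the real interpreter.
    """
    if override:
        return False
    if in_virtualenv is None:
        in_virtualenv = running_in_virtualenv()
    if in_virtualenv:
        return False
    if stdlib_dir is None:
        stdlib_dir = default_stdlib_dir()
    # Check 2: is the marker file present in the stdlib directory?
    return os.path.isfile(os.path.join(stdlib_dir, MARKER_NAME))
```

Called with no arguments, ``should_refuse_install()`` inspects the running interpreter. Passing ``stdlib_dir`` and ``in_virtualenv`` explicitly lets the same logic be tested against a scratch directory.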
The installer should have a way for the user to override
these rules, such as a command-line flag
``--break-system-packages``. This option should not be
enabled by default and should carry some connotation that
its use is risky.
The ``EXTERNALLY-MANAGED`` file is a metadata file in the
`packaging core metadata format`_, which is an
email-message-like format with headers and a body. (At the
time of writing, that format is defined as exactly what the
standard library ``email.parser`` module can parse using
``policy=email.policy.compat32``.) If the file can be parsed
as a core metadata file, then the installer should output an
error message from that file as part of its error. If
``locale.getlocale(locale.LC_MESSAGES)`` returns
non-``None`` and the first element is a string of the form
``xx_YY``, and the file contains a header variable
``Error-xx_YY`` or failing that ``Error-xx``, then the
installer should use the value of that header as the error.
Otherwise, it should use the body of the message as an
error.
.. _`packaging core metadata format`: https://packaging.python.org/specifications/core-metadata/
If the file does not parse as a core metadata file, then the
installer should ignore the parse failure and instead just
use a pre-defined error message of its own, which should
suggest that the user create a virtual environment to
install packages.
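The message selection described above could look roughly like this. It is a sketch, not a reference implementation: the function name, the ``DEFAULT_ERROR`` text, the regular expression used to recognise ``xx_YY``, and the choice to fall back to the default message when the body is empty are all assumptions made for illustration. The locale name is passed in as a parameter, since ``locale.LC_MESSAGES`` is not available on every platform.

```python
import email.parser
import email.policy
import re

# Hypothetical fallback text; a real installer would choose its own wording.
DEFAULT_ERROR = (
    "This Python installation is externally managed. "
    "Create a virtual environment to install packages."
)


def pick_error_message(marker_text, locale_name=None):
    """Choose the error text from the contents of an EXTERNALLY-MANAGED file.

    ``locale_name`` is the first element of
    ``locale.getlocale(locale.LC_MESSAGES)``, or None.
    """
    try:
        parser = email.parser.Parser(policy=email.policy.compat32)
        message = parser.parsestr(marker_text)
    except Exception:
        # Treat any parse failure as "use the installer's own message".
        return DEFAULT_ERROR

    # Assumed reading of "a string of the form xx_YY".
    if locale_name and re.fullmatch(r"[a-z]{2}_[A-Z]{2}", locale_name):
        language = locale_name[:2]
        for header in (f"Error-{locale_name}", f"Error-{language}"):
            value = message.get(header)
            if value is not None:
                return value

    body = message.get_payload()
    if isinstance(body, str) and body.strip():
        return body.strip()
    return DEFAULT_ERROR
```

For a file with ``Error-de_DE`` and ``Error-en`` headers, a ``de_DE`` locale selects the first header, an ``en_GB`` locale falls back to ``Error-en``, and any other locale gets the message body.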
Software distributors who have a non-Python-specific package
manager that manages libraries in the ``sys.path`` of their
Python package should, in general, ship an
``EXTERNALLY-MANAGED`` file in their standard library
directory. For instance, Debian may ship a file in
``/usr/lib/python3.9/EXTERNALLY-MANAGED`` consisting of
something like
::

    To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.

    If you wish to install a non-Debian-packaged Python
    package, create a virtual environment using python3 -m
    venv path/to/venv. Then use path/to/venv/bin/python and
    path/to/venv/bin/pip. Make sure you have python3-full
    installed.

    If you wish to install a non-Debian-packaged Python
    application, it may be easiest to use pipx install xyz,
    which will manage a virtual environment for you. Make
    sure you have pipx installed.

    See /usr/share/doc/python3.9/README.venv for more
    information.

which provides useful and distribution-relevant information
to a user trying to install a package.
In certain contexts, such as single-application container
images that aren't updated after creation, a distributor may
choose not to ship an ``EXTERNALLY-MANAGED`` file, so that
users can install whatever they like (as they can today)
without having to manually override this rule.
Writing to only the target ``sysconfig`` scheme
-----------------------------------------------
Usually, a Python package installer installs to directories
in a scheme returned by the ``sysconfig`` standard library
package. Ordinarily, this is the scheme returned by
``sysconfig.get_default_scheme()``, but based on
configuration (e.g. ``pip install --user``), it may use a
different scheme.
Whenever the installer is installing to a ``sysconfig``
scheme, this PEP specifies that the installer should never
modify or delete files outside of that scheme. For instance,
if it's upgrading a package, and the package is already
installed in a directory outside that scheme (perhaps in a
directory from another scheme), it should leave the existing
files alone.
If the installer does end up shadowing an existing
installation during an upgrade, we recommend that it
produces a warning at the end of its run.
If the installer is installing to a location outside of a
``sysconfig`` scheme (e.g., ``pip install --target``), then
this subsection does not apply.
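One way an installer might enforce this rule is to check, before modifying or deleting an existing file, whether that file lives under one of the target scheme's directories. The sketch below is only an illustration: the helper names are invented here, and restricting the check to ``purelib`` and ``platlib`` is a simplifying assumption (a real installer would also need to consider scripts, headers, and data files).

```python
import sysconfig
from pathlib import Path


def module_directories(scheme=None):
    """Return the resolved purelib/platlib directories of a sysconfig scheme."""
    paths = sysconfig.get_paths(scheme) if scheme else sysconfig.get_paths()
    return {Path(paths[key]).resolve() for key in ("purelib", "platlib")}


def is_within(path, directories):
    """True if ``path`` is one of ``directories`` or lies underneath one."""
    resolved = Path(path).resolve()
    return any(d == resolved or d in resolved.parents for d in directories)


def may_modify(existing_file, scheme=None):
    """An installer targeting ``scheme`` should leave files elsewhere alone."""
    return is_within(existing_file, module_directories(scheme))
```

When ``may_modify`` returns ``False`` for an already-installed copy of a package, the installer would leave that copy in place and, as recommended above, warn that the new installation shadows it.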
Recommendations for distributions
=================================
This section is non-normative. It provides best practices we
believe distributions should follow unless they have a
specific reason otherwise.
Mark the installation as externally managed
-------------------------------------------
Distributions should create an ``EXTERNALLY-MANAGED`` file
in their ``stdlib`` directory.
Guide users towards virtual environments
----------------------------------------
The file should contain a useful and distribution-relevant
error message indicating both how to install system-wide
packages via the distribution's package manager and how to
set up a virtual environment. If your distribution is often
used by users in a state where the ``python3`` command is
available (and especially where ``pip`` or ``get-pip`` is
available) but ``python3 -m venv`` does not work, the
message should indicate clearly how to make ``python3 -m
venv`` work properly.
Consider packaging pipx_, a tool for installing
Python-language applications, and suggesting it in the
error. pipx automatically creates a virtual environment for
that application alone, which is a much better default for
end users who want to install some Python-language software
(which isn't available in the distribution) but are not
themselves Python users. Packaging pipx in the distribution
avoids the irony of instructing users to ``pip install
--user --break-system-packages pipx`` to *avoid* breaking
system packages. Consider arranging things so your
distribution's package / environment for Python for end
users (e.g., ``python3`` on Fedora or ``python3-full`` on
Debian) depends on pipx.
.. _pipx: https://github.com/pypa/pipx
Remove the marker file in container images
------------------------------------------
Distributions that produce official images for
single-application containers (e.g., Docker container
images) should remove the ``EXTERNALLY-MANAGED`` file,
preferably in a way that makes it not come back if a user
of that image installs package updates inside their image
(think ``RUN apt-get dist-upgrade``). On dpkg-based
systems, using ``dpkg-divert --local`` to persistently
rename the file would work. On other systems, there may
need to be some configuration flag available to a
post-install script to re-remove the
``EXTERNALLY-MANAGED`` file.
Create separate distribution and local directories
--------------------------------------------------
Distributions should place two separate paths on the system
interpreter's ``sys.path``, one for distribution-installed
packages and one for packages installed by the local system
administrator, and configure
``sysconfig.get_default_scheme()`` to point at the latter
path. This ensures that tools like pip will not modify
distribution-installed packages. The path for the local
system administrator should come before the distribution
path on ``sys.path`` so that local installs take precedence
over distribution packages.
For example, Fedora and Debian (and their derivatives) both
implement this split by using ``/usr/local`` for
locally-installed packages and ``/usr`` for
distribution-installed packages. Fedora uses
``/usr/local/lib/python3.x/site-packages`` vs.
``/usr/lib/python3.x/site-packages``. (Debian uses
``/usr/local/lib/python3/dist-packages`` vs.
``/usr/lib/python3/dist-packages`` as an additional layer of
separation from a locally-compiled Python interpreter: if
you build and install upstream CPython in
``/usr/local/bin``, it will look at
``/usr/local/lib/python3/site-packages``, and Debian wishes
to make sure that packages installed via the locally-built
interpreter don't show up on ``sys.path`` for the
distribution interpreter.)
Note that the ``/usr/local`` vs. ``/usr`` split is analogous
to how the ``PATH`` environment variable typically includes
``/usr/local/bin:/usr/bin`` and non-distribution software
installs to ``/usr/local`` by default. This split is
`recommended by the Filesystem Hierarchy Standard`__.
.. __: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s09.html
There are two ways you could do this. One is, if you are
building and packaging Python libraries directly (e.g., your
packaging helpers unpack a PEP 517-built wheel or call
``setup.py install``), arrange for those tools to use a
directory that is not in a ``sysconfig`` scheme but is still
on ``sys.path``.
The other is to arrange for the default ``sysconfig`` scheme
to change when running inside a package build versus when
running on an installed system. The ``sysconfig``
customization hooks from bpo-43976_ should make this easy:
make your packaging tool set an environment variable or some
other detectable configuration, and define a
``get_preferred_schemes`` function to return a different
scheme when called from inside a package build. Then you can
use ``pip install`` as part of your distribution packaging.
.. _bpo-43976: https://bugs.python.org/issue43976
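A minimal sketch of such a hook might look like the following. It is an illustration under stated assumptions, not the distribution's or CPython's actual code: the environment variable name and the ``posix_local`` scheme name are hypothetical, ``posix_distro`` simply mirrors the example scheme name used in this document, and the dictionary keys follow those accepted by ``sysconfig.get_preferred_scheme()``.

```python
import os

# Hypothetical names: neither the environment variable nor the "posix_local"
# scheme is defined by this PEP; "posix_distro" mirrors the example scheme
# name used elsewhere in this document.
BUILD_MARKER_VARIABLE = "DISTRO_PACKAGE_BUILD"


def get_preferred_schemes(environ=None):
    """Return preferred scheme names for the "prefix", "home" and "user" keys."""
    environ = os.environ if environ is None else environ
    if environ.get(BUILD_MARKER_VARIABLE) == "1":
        # Inside a package build: install into the distribution's directories.
        prefix_scheme = "posix_distro"
    else:
        # On an installed system: install into the local administrator's
        # directories instead.
        prefix_scheme = "posix_local"
    return {"prefix": prefix_scheme, "home": "posix_home", "user": "posix_user"}
```

Passing the environment as a parameter is a choice made here to keep the sketch easy to test; a real hook would most likely consult ``os.environ`` or some other build-time configuration directly.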
We propose adding a ``--scheme=...`` option to instruct pip
to run against a specific scheme. (See `Implementation
Notes`_ below for how pip currently determines schemes.)
Once that's available, for local testing and possibly for
actual packaging, you would be able to run something like
``pip install --scheme=posix_distro`` to explicitly install
a package into your distribution's location (bypassing
``get_preferred_schemes``). One could also, if absolutely
needed, use ``pip uninstall --scheme=posix_distro`` to use
pip to remove packages from the system-managed directory,
which addresses the (hopefully theoretical) regression in
use case 5 in Rationale_.
To install packages with pip, you would also need to either
suppress the ``EXTERNALLY-MANAGED`` marker file to allow pip
to run or to override it on the command line. You may want
to use the same means for suppressing the marker file in
build chroots as you do in container images.
The advantage of setting these up to be automatic
(suppressing the marker file in your build environment and
having ``get_preferred_schemes`` automatically return your
distribution's scheme) is that an unadorned ``pip install``
will work inside a package build, which generally means that
an unmodified upstream build script that happens to
internally call ``pip install`` will do the right thing.
You can, of course, just ensure that your packaging process
always calls ``pip install --scheme=posix_distro
--break-system-packages``, which would work too.
The best approach here depends a lot on your distribution's
conventions and mechanisms for packaging.
Similarly, the ``sysconfig`` paths that are not for
importable Python code - that is, ``include``,
``platinclude``, ``scripts``, and ``data`` - should also
have two variants, one for use by distribution-packaged
software and one for use for locally-installed software, and
the distribution should be set up such that both are usable.
For instance, a typical FHS-compliant distribution will use
``/usr/local/include`` for the default scheme's ``include``
and ``/usr/include`` for distribution-packaged headers and
place both on the compiler's search path, and it will use
``/usr/local/bin`` for the default scheme's ``scripts`` and
``/usr/bin`` for distribution-packaged entry points and
place both on ``$PATH``.
Backwards Compatibility
=======================
All of these mechanisms are proposed for new distribution
releases and new versions of tools like pip only.
In particular, we strongly recommend that distributions with
a concept of major versions only add the marker file or
change ``sysconfig`` schemes in a new major version;
otherwise there is a risk that, on an existing system,
software installed via a Python-specific package manager now
becomes unmanageable (without an override option). For a
rolling-release distribution, if possible, only add the
marker file or change ``sysconfig`` schemes in a new Python
minor version.
One particular backwards-compatibility difficulty for
package installation tools is likely to be managing
environments created by old versions of ``virtualenv`` which
have the latest version of the tool installed. A "virtual
environment" now has a fairly precise definition: it uses
the ``pyvenv.cfg`` mechanism, which causes ``sys.base_prefix
!= sys.prefix``. It is possible, however, that a user may
have an old virtual environment created by an older version
of ``virtualenv``; as of this writing, pip supports Python
3.6 onwards, which is in turn supported by ``virtualenv``
15.1.0 onwards, so this scenario is possible. In older
versions of ``virtualenv``, the mechanism is instead to set
a new attribute, ``sys.real_prefix``, and it does not use
the standard library support for virtual environments,
so ``sys.base_prefix`` is the same as ``sys.prefix``. So the
logic for robustly detecting a virtual environment is
something like::
    import sys
    def is_virtual_environment():
        # venv sets sys.base_prefix; legacy virtualenv sets sys.real_prefix.
        return sys.base_prefix != sys.prefix or hasattr(sys, "real_prefix")
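Combined with the marker file, an installer's pre-flight check might look roughly like the sketch below. The names ``is_externally_managed`` and ``should_refuse_install`` and the ``override`` parameter are illustrative assumptions; the marker location follows the text (an ``EXTERNALLY-MANAGED`` file in the directory given by ``sysconfig.get_path("stdlib")``).

```python
import os
import sys
import sysconfig

MARKER_NAME = "EXTERNALLY-MANAGED"


def is_virtual_environment():
    """Detect both standard venvs and legacy virtualenv environments."""
    return sys.base_prefix != sys.prefix or hasattr(sys, "real_prefix")


def is_externally_managed(stdlib_dir=None):
    """True if the marker file exists in the given (or current) stdlib directory."""
    if stdlib_dir is None:
        stdlib_dir = sysconfig.get_path("stdlib")
    return os.path.isfile(os.path.join(stdlib_dir, MARKER_NAME))


def should_refuse_install(override=False, stdlib_dir=None):
    """Refuse only outside a virtual environment, with a marker, and no override."""
    if override or is_virtual_environment():
        return False
    return is_externally_managed(stdlib_dir)
```

The ``stdlib_dir`` parameter exists only so the check can be exercised against a temporary directory; an installer would normally use the default.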
Security Implications
=====================
The purpose of this feature is not to implement a security
boundary; it is to discourage well-intended changes from
unexpectedly breaking a user's environment. That is to say,
the reason this PEP restricts ``pip install`` outside a
virtual environment is not that it's a security risk to be
able to do so; it's that "There should be one-- and
preferably only one --obvious way to do it," and that way
should be using a virtual environment. ``pip install``
outside a virtual environment is rather too obvious for what
is almost always the wrong way to do it.
If there is a case where a user should not be able to ``sudo
pip install`` or ``pip install --user`` and add files to
``sys.path`` *for security reasons*, that needs to be
implemented either via access control rules on what files
the user can write to or an explicitly secured ``sys.path``
for the program in question. Neither of the mechanisms in
this PEP should be interpreted as a way to address such a
scenario.
For those reasons, an attempted install with a marker file
present is not a security incident, and there is no need to
raise an auditing event for it. If the calling user
legitimately has access to ``sudo pip install`` or ``pip
install --user``, they can accomplish the same installation
entirely outside of Python; if they do not legitimately have
such access, that's a problem outside the scope of this PEP.
The marker file itself is located in the standard library
directory, which is a trusted location (i.e., anyone who can
write to the marker file used by a particular installer
could, presumably, run arbitrary code inside the installer).
Therefore, there is generally no need to filter out terminal
escape sequences or other potentially-malicious content in
the error message.
Alternatives
============
There are a number of similar proposals we considered that this
PEP rejects or defers, largely to preserve the behavior in
the case-by-case analysis in Rationale_.
Marker file
-----------
Should the marker file be in ``sys.path``, marking a
particular directory as not to be written to by a Python-specific
package manager? This would help with the second problem
addressed by this PEP (not overwriting or deleting
distribution-owned files) but not the first (incompatible
installs). A directory-specific marker in
``/usr/lib/python3.x/site-packages`` would not discourage
installations into either
``/usr/local/lib/python3.x/site-packages`` or
``~/.local/lib/python3.x/site-packages``, both of which are
on ``sys.path`` for ``/usr/bin/python3``. In other words,
the marker file should not be interpreted as marking a
single *directory* as externally managed (even though it
happens to be in a directory on ``sys.path``); it marks the
entire *Python installation* as externally managed.
Another variant of the above: should the marker file be in
``sys.path``, where if it can be found in any directory in
``sys.path``, it marks the installation as externally
managed? An apparent advantage of this approach is that it
automatically disables itself in virtual environments.
Unfortunately, this has the wrong behavior with a
``--system-site-packages`` virtual environment, where the
system-wide ``sys.path`` is visible but package
installations are allowed. (It could work if the rule of
exempting virtual environments is preserved, but that seems
to have no advantage over the current scheme.)
Should the marker just be a new attribute of a ``sysconfig``
scheme? There is some conceptual cleanliness to this,
except that it's hard to override. We want to make it easy
for container images, package build environments, etc. to
suppress the marker file. A file that you can remove is
easy; code in ``sysconfig`` is much harder to modify.
Should the file be in ``/etc``? No, because again, it refers
to a specific Python installation. A user who installs their
own Python may well want to install packages within the
global context of that interpreter.
Should the configuration setting be in ``pip.conf`` or
``distutils.cfg``? Apart from the above objections about
marking an installation, this mechanism isn't specific to
either of those tools. (It seems reasonable for pip to
*also* implement a configuration flag for users to prevent
themselves from performing accidental
non-virtual-environment installs in any Python installation,
but that is outside the scope of this PEP.)
Should the file be TOML? TOML is gaining popularity for
packaging (see e.g. PEP 517) but does not yet have an
implementation in the standard library. Strictly speaking,
this isn't a blocker - distributions need only write the
file, not read it, so they don't need a TOML library (the
file will probably be written by hand, regardless of
format), and packaging tools likely have a TOML reader
already. However, the ``email.message`` format is currently
used for various other forms of packaging metadata, meets
our needs, and is parseable by the standard library, and the
pip maintainers expressed a preference to avoid using TOML
for this yet.
Should the marker file be executable Python code that
evaluates whether installation should be allowed or not?
Apart from the concerns above about having the file in
``sys.path``, we have a concern that making it executable is
committing to too powerful an API and risks making
behavior harder to understand. (Note that the
``get_default_scheme`` hook of bpo-43976_ is in fact
executable, but that code needs to be supplied when the
interpreter builds; it isn't intended to be supplied
post-build.)
When overriding the marker, should a Python-specific package manager
be disallowed from shadowing a package installed by the
external package manager (i.e., installing modules of the
same name)? This would minimize the risk of breaking system
software, but it's not clear it's worth the additional user
experience complexity. There are legitimate use cases for
shadowing system packages, and an additional command-line
option to permit it would be more confusing. Meanwhile, not
passing that option wouldn't eliminate the risk of breaking
system software, which may be relying on a ``try: import xyz``
failing, finding a limited set of entry points, etc.
Communicating this distinction seems difficult. We think
it's a good idea for Python-specific package managers to print a
warning if they shadow a package, but we think it's not
worth disabling it by default.
Why not use the ``INSTALLER`` file from PEP 376 to determine
who installed a package and whether it can be removed?
First, it's specific to a particular package (it's in the
package's ``dist-info`` directory), so like some of the
alternatives above, it doesn't provide information on an
entire environment and whether package installations are
permissible. PEP 627 also updates PEP 376 to prevent
programmatic use of ``INSTALLER``, specifying that the file
is "to be used for informational purposes only. [...] Our
goal is supporting interoperating tools, and basing any
action on which tool happened to install a package runs
counter to that goal." Finally, as PEP 627 envisions, there
are legitimate use cases for one tool knowing how to handle
packages installed by another tool; for instance, ``conda``
can safely remove a package installed by ``pip`` into a
Conda environment.
Why does the specification give no means for disabling
package installations inside a virtual environment? We can't
see a particularly strong use case for it (at least not one
related to the purposes of this PEP). If you need it, it's
simple enough to ``pip uninstall pip`` inside that
environment, which should discourage at least unintentional
changes to the environment (and this specification makes no
provision to disable *intentional* changes, since after all
the marker file can be easily removed).
System Python
-------------
Shouldn't distribution software just run with the
distribution ``site-packages`` directory alone on
``sys.path`` and ignore the local system administrator's
``site-packages`` as well as the user-specific one? This is
a worthwhile idea, and various versions of it have been
circulating for a while under the name of "system Python" or
"platform Python" (with a separate "user Python" for end
users writing Python or installing Python software separate
from the system). However, it's a much more involved
change. First, it would be a backwards-incompatible change.
As mentioned in the Motivation_ section, there are valid use
cases for running distribution-installed Python applications
like Sphinx or Ansible with locally-installed Python
libraries available on their ``sys.path``. A wholesale
switch to ignoring local packages would break these use
cases, and a distribution would have to make a case-by-case
analysis of whether an application ought to see
locally-installed libraries or not.
Furthermore, `Fedora attempted this change and reverted
it`_, finding, ironically, that their implementation of the
change `broke their package manager`_. Given that
experience, there are clearly details to be worked out
before distributions can reliably implement that approach,
and a PEP recommending it would be premature.
.. _`Fedora attempted this change and reverted it`: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org…
.. _`broke their package manager`: https://bugzilla.redhat.com/show_bug.cgi?id=1483342
This PEP is intended to be a complete and self-contained
change that is independent of a distributor's decision for
or against "system Python" or similar proposals. It is not
incompatible with a distribution implementing "system
Python" in the future, and even though both proposals
address the same class of problems, there are still
arguments in favor of implementing something like "system
Python" even after implementing this PEP. At the same time,
though, this PEP specifically tries to make a more targeted
and minimal change, such that it can be implemented by
distributors who don't expect to adopt "system Python" (or
don't expect to implement it immediately). The changes in
this PEP stand on their own merits and are not an
intermediate step for some future proposal. This PEP reduces
(but does not eliminate) the risk of breaking system
software while minimizing (but not completely avoiding)
breaking changes, so it should be much easier to
implement than the full "system Python" idea, which comes
with the downsides mentioned above.
We expect that the guidance in this PEP - that users should
use virtual environments whenever possible and that
distributions should have separate ``sys.path`` directories
for distribution-managed and locally-managed modules -
should make further experiments easier in the future. These
may include distributing wholly separate "system" and "user"
Python interpreters, running system software out of a
distribution-owned virtual environment or ``PYTHONHOME``
(but shipping a single interpreter), or modifying the entry
points for certain software (such as the distribution's
package manager) to use a ``sys.path`` that only sees
distribution-managed directories. Those ideas themselves,
however, remain outside the scope of this PEP.
Implementation Notes
====================
This section is non-normative and contains notes relevant to
both the specification and potential implementations.
Currently, pip does not directly expose a way to choose a
target ``sysconfig`` scheme, but it has three ways of
looking up schemes when installing:
``pip install``
Calls ``sysconfig.get_default_scheme()``, which is
usually (in upstream CPython and most current
distributions) the same as
``get_preferred_scheme('prefix')``.
``pip install --prefix=/some/path``
Calls ``sysconfig.get_preferred_scheme('prefix')``.
``pip install --user``
Calls ``sysconfig.get_preferred_scheme('user')``.
Finally, ``pip install --target=/some/path`` writes directly
to ``/some/path`` without looking up any schemes.
Debian currently carries a `patch to change the default
install location inside a virtual environment`__, using a
few heuristics (including checking for the ``VIRTUAL_ENV``
environment variable), largely so that the directory used in
a virtual environment remains ``site-packages`` and not
``dist-packages``. This does not particularly affect this
proposal, because the implementation of that patch does not
actually change the default ``sysconfig`` scheme, and
notably does not change the result of
``sysconfig.get_path("stdlib")``.
.. __: https://sources.debian.org/src/python3.7/3.7.3-2+deb10u3/debian/patches/dis…
Fedora currently carries a `patch to change the default
install location when not running inside rpmbuild`__, which
they use to implement the two-system-wide-directories
approach. This is conceptually the sort of hook envisioned
by bpo-43976_, except implemented as a code patch to
``distutils`` instead of as a changed ``sysconfig`` scheme.
.. __: https://src.fedoraproject.org/rpms/python3.9/blob/f34/f/00251-change-user-i…
The implementation of ``is_virtual_environment`` above, as
well as the logic to load the ``EXTERNALLY-MANAGED`` file
and find the error message from it, may as well get added to
the standard library (``sys`` and ``sysconfig``,
respectively), to centralize their implementations, but they
don't need to be added yet.
References
==========
For additional background on these problems and previous
attempts to solve them, see `Debian bug 771794`_ "pip
silently removes/updates system provided python packages"
from 2014, Fedora's 2018 article `Making sudo pip safe`_
about pointing ``sudo pip`` at ``/usr/local`` (which
acknowledges that the changes still do not make ``sudo pip``
completely safe), pip issues 5605_ ("Disable upgrades to
existing python modules which were not installed via pip")
and 5722_ ("pip should respect /usr/local") from 2018, and
the post-PyCon US 2019 discussion thread `Playing nice with
external package managers`_.
.. _`Debian bug 771794`: https://bugs.debian.org/771794
.. _`Making sudo pip safe`: https://fedoraproject.org/wiki/Changes/Making_sudo_pip_safe
.. _5605: https://github.com/pypa/pip/issues/5605
.. _5722: https://github.com/pypa/pip/issues/5722
.. _`Playing nice with external package managers`: https://discuss.python.org/t/playing-nice-with-external-package-managers/19…
TODO: We can open these before the PEP is accepted and should link to these:
* PR to pip for EXTERNALLY-MANAGED + ``--break-system-packages``
* PR to pip for hands-off-system-packages.patch v2
* MR to Debian Python to create the EXTERNALLY-MANAGED file
* PR to upstream Python for ``is_virtual_env``/``is_externally_managed`` maybe?
.. [#PEP-376] PEP 376, Database of Installed Python Distributions, Ziadé
(http://www.python.org/dev/peps/pep-0376)
.. [#PEP-427] PEP 427, The Wheel Binary Package Format 1.0, Holth
(http://www.python.org/dev/peps/pep-0427)
.. [#pip] https://pip.pypa.io/en/stable/
.. [#easy-install] https://setuptools.readthedocs.io/en/latest/deprecated/easy_install.html
(Note that the ``easy_install`` command was removed in
setuptools version 52, released 23 January 2021.)
.. [#Conda] https://conda.io
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
--
Geoffrey Thomas
https://ldpreload.com
geofft(a)ldpreload.com