PEP 453 Round 2 - Explicit bootstrapping of pip in Python installations
So there've been a number of updates to PEP453, so I'm posting it here again for more discussion:

Viewable online at: http://www.python.org/dev/peps/pep-0453/

Abstract
========

This PEP proposes the inclusion of a method for explicitly bootstrapping `pip`_ as the default package manager for Python. It also proposes that the distributions of Python available via Python.org automatically run this explicit bootstrapping method, and recommends that third party redistributors of Python also provide pip by default (in a way reasonable for their distributions).

Proposal
========

This PEP proposes the inclusion of a ``getpip`` bootstrapping module in Python 3.4, as well as in the next maintenance releases of Python 3.3 and 2.7.

This PEP does *not* propose making pip (or any dependencies) part of the standard library. Instead, pip will be a bundled application provided along with CPython for the convenience of Python users, but subject to its own development life cycle and able to be upgraded independently of the core interpreter and standard library.

Rationale
=========

Installing a third party package into a freshly installed Python requires first installing the package manager. This requires users to know ahead of time what the package manager is, where to get it from, and how to install it. The effect of this is that external projects are required to either blindly assume the user already has the package manager installed, duplicate the instructions and tell their users how to install the package manager, or completely forgo the use of dependencies to ease installation concerns for their users.

All of the available options have their own drawbacks.

If a project simply assumes a user already has the tooling then users who lack it get a confusing error message when the installation command doesn't work. Some operating systems may ease this pain by providing a global hook that looks for commands that don't exist and suggests an OS package they can install to make the command work.

If a project chooses to duplicate the installation instructions and tell their users how to install the package manager before telling them how to install their own project, then whenever these instructions need updating, every project that has duplicated them must update its own copy. This will inevitably not happen in every case, leaving many different sets of instructions on how to install it, many of them broken or less than optimal. These additional instructions might also confuse users who try to install the package manager a second time thinking that it's part of the instructions for installing the project.

The problem of stale instructions can be alleviated by referencing `pip's own bootstrapping instructions <http://www.pip-installer.org/en/latest/installing.html>`__, but the user experience involved still isn't good (especially on Windows, where downloading and running a Python script with the default OS configuration is significantly more painful than downloading and running a binary executable or installer). The situation becomes even more complicated when multiple Python versions are involved (for example, parallel installations of Python 2 and Python 3).

The projects that have decided to forgo dependencies altogether are forced to either duplicate the efforts of other projects by inventing their own solutions to problems or simply include the other projects in their own source trees. Both of these options present their own problems, either in duplicating maintenance work across the ecosystem or in potentially leaving users vulnerable to security issues because the included code or duplicated efforts are not automatically updated when upstream releases a new version.

By providing the package manager by default it will be easier for users trying to install these third party packages as well as easier for the people distributing them as they no longer need to pick the lesser evil. This will become more important in the future as the Wheel_ package format does not have a built in "installer" in the form of ``setup.py`` so users wishing to install a Wheel package will need an installer even in the simple case.

Reducing the burden of actually installing a third party package should also decrease the pressure to add every useful module to the standard library. This will allow additions to the standard library to focus more on why Python should have a particular tool out of the box instead of needing to use the difficulty in installing a package as justification for inclusion.

Providing a standard installation system also helps with bootstrapping alternate build and installer systems, such as ``setuptools``, ``zc.buildout`` and the ``hashdist``/``conda`` combination that is aimed specifically at the scientific community. So long as ``pip install <tool>`` works, a standard Python-specific installer provides a reasonably secure, cross platform mechanism to get access to these utilities.

Explicit Bootstrapping
======================

An additional module called ``getpip`` will be added to the standard library whose purpose is to install pip and any of its dependencies into the appropriate location (most commonly site-packages). It will expose a single callable named ``bootstrap()`` as well as offer direct execution via ``python -m getpip``. Options for installing it such as index server, installation location (``--user``, ``--root``, etc) will also be available to enable different installation schemes.

It is believed that users will want the most recent versions available to be installed so that they can take advantage of the new advances in packaging. Since any particular version of Python has a much longer staying power than a version of pip, in order to satisfy a user's desire to have the most recent version the bootstrap will contact PyPI, find the latest version, download it, and then install it. This process is security sensitive, difficult to get right, and evolves along with the rest of packaging.

Instead of attempting to maintain a "mini pip" for the sole purpose of installing pip, the ``getpip`` module will, as an implementation detail, include a private copy of pip and its dependencies which will be used to discover and install pip from PyPI. It is important to stress that this private copy of pip is *only* an implementation detail and it should *not* be relied on or assumed to exist.

Not all users will have network access to PyPI whenever they run the bootstrap. In order to ensure that these users will still be able to bootstrap pip, the bootstrap will fall back to simply installing the included copy of pip. The pip ``--no-download`` command line option will be supported to force installation of the bundled version, without even attempting to contact PyPI.

This presents a balance between giving users the latest version of pip, saving them from needing to immediately upgrade pip after bootstrapping it, and allowing the bootstrap to work offline in situations where users might already have packages downloaded that they wish to install.

Proposed CLI
------------

The proposed CLI is based on a subset of the existing ``pip install`` options::

    Usage:
      python -m getpip [options]

    Download Options:
      --no-download           Install the bundled version, don't attempt to download
      -i, --index-url <url>   Base URL of Python Package Index (default https://pypi.python.org/simple/).
      --proxy <proxy>         Specify a proxy in the form [user:passwd@]proxy.server:port.
      --timeout <sec>         Set the socket timeout (default 15 seconds).
      --cert <path>           Path to alternate CA bundle.

    Installation Options:
      -U, --upgrade           Upgrade pip and dependencies, even if already installed
      --user                  Install using the user scheme.
      --root <dir>            Install everything relative to this alternate root directory.

Additional options (such as verbosity and logging options) may also be supported.

Automatic installation of setuptools
------------------------------------

``pip`` currently depends on ``setuptools`` to handle metadata generation during the build process, along with some other features. While work is ongoing to reduce or eliminate this dependency, it is not clear if that work will be complete for pip 1.5 (which is the version likely to be bundled with Python 3.4.0).

This PEP proposes that, if pip still requires it, ``setuptools`` will be bundled along with pip itself, and thus installed when running ``python -m getpip``.

However, this behaviour will be officially declared an implementation detail. Other projects which explicitly require setuptools should still provide an appropriate dependency declaration, rather than assuming ``setuptools`` will always be installed alongside ``pip``.

Updating the bundled pip
------------------------

In order to keep up with evolutions in packaging, as well as to provide users of the offline installation method with as recent a version as possible, the ``getpip`` module should be updated to the latest versions of everything it bootstraps.

After each new pip release, and again during the preparation for any release of Python, a script, provided as part of this PEP, should be run to ensure the bundled packages have been updated to the latest versions.

This means that maintenance releases of the CPython installers will include an updated version of the ``getpip`` bootstrap module.

Feature Addition in Maintenance Releases
========================================

Adding a new module to the standard library in Python 2.7 and 3.3 maintenance releases breaks the usual policy of "no new features in maintenance releases". It is being proposed in this case as the bootstrapping problem greatly affects the experience of new users, especially on Python 2 where many Python 3 standard library improvements are available as backports on PyPI, but are not included in the Python 2 standard library.

By updating Python 2.7, 3.3 and 3.4 to easily bootstrap the PyPI ecosystem, this should aid the vast majority of Python users, rather than only those with the freedom to adopt Python 3.4 as soon as it is released.

This is also a matter of starting as we mean to continue: similar to IDLE (see PEP 434), ``getpip`` will be permanently exempted from the "no new features in maintenance releases" restriction, as it will include (and rely on) upgraded versions of ``pip`` even in maintenance releases.

Pre-installation
================

During the installation of Python from Python.org ``python -m getpip`` should be executed, leaving people using the Windows or OSX installers with a working copy of pip once the installation has completed. The exact method of this is left up to the maintainers of the installers, however if the bootstrapping is optional it should be opt-out rather than opt-in.

The Windows and OSX installers distributed by Python.org will automatically attempt to run ``python -m getpip`` by default, however the ``make install`` and ``make altinstall`` commands of the source distribution will not. Note that ``getpip`` itself will still be installed normally (as it is a regular part of the standard library); only the installation of pip and its dependencies will be skipped.

Keeping the pip bootstrapping as a separate step for make based installations should minimize the changes CPython redistributors need to make to their build processes. Avoiding the layer of indirection through ``make`` for the getpip invocation also ensures those installing from a custom source build can easily force an offline installation of pip, install it from a private index server, or skip installing pip entirely.
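
As a rough illustration of the command line proposed in this PEP, the option list could be expressed with ``argparse`` along the following lines. This is only a sketch under stated assumptions, not the reference implementation: ``build_parser`` is a hypothetical helper name, the destinations and defaults are simply read off the proposed usage text, and the private index URL is a placeholder.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the option list proposed for getpip."""
    parser = argparse.ArgumentParser(prog="python -m getpip")

    download = parser.add_argument_group("Download Options")
    download.add_argument("--no-download", action="store_true",
                          help="Install the bundled version, don't attempt to download")
    download.add_argument("-i", "--index-url", metavar="<url>",
                          default="https://pypi.python.org/simple/",
                          help="Base URL of Python Package Index")
    download.add_argument("--proxy", metavar="<proxy>",
                          help="Proxy in the form [user:passwd@]proxy.server:port")
    download.add_argument("--timeout", metavar="<sec>", type=float, default=15.0,
                          help="Socket timeout in seconds")
    download.add_argument("--cert", metavar="<path>",
                          help="Path to alternate CA bundle")

    install = parser.add_argument_group("Installation Options")
    install.add_argument("-U", "--upgrade", action="store_true",
                         help="Upgrade pip and dependencies, even if already installed")
    install.add_argument("--user", action="store_true",
                         help="Install using the user scheme")
    install.add_argument("--root", metavar="<dir>",
                         help="Install everything relative to this alternate root directory")
    return parser


# Forcing an offline installation, e.g. after a custom source build.
offline = build_parser().parse_args(["--no-download"])

# Installing from a private index server (placeholder URL) into the user scheme.
private = build_parser().parse_args(
    ["-i", "https://pypi.example.invalid/simple/", "--user"]
)
```

Only the parsing is shown here; what the bootstrap actually does with these values is left to the real module.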

Open Question: Uninstallation
=============================

No changes are currently proposed to the uninstallation process. The bootstrapped pip will be installed the same way as any other pip installed packages, and will be handled in the same way as any other post-install additions to the Python environment.

At least on Windows, that means the bootstrapped files will be left behind after uninstallation, since those files won't be associated with the Python MSI installer.

.. note::

   Perhaps the installer needs to be updated to clobber everything in site-packages and the Scripts directory, but I would prefer not to make this PEP conditional on that change.

Open Question: Script Execution on Windows
==========================================

While the Windows installer was updated in Python 3.3 to make ``python`` available on the PATH, no such change was made to include the scripts directory. This PEP proposes that this be changed to also add the scripts directory.

Without this change, the most reliable way to invoke pip on Windows (without tinkering with paths) is actually ``py -m pip`` (or ``py -3 -m pip`` if both Python 2 and 3 are installed) rather than simply calling ``pip``.

Adding the scripts directory to the system PATH would mean that ``pip`` works reliably in the "only one Python installation" case, with ``py -m pip`` needed only for the parallel installation case.

Python Virtual Environments
===========================

Python 3.3 included a standard library approach to virtual Python environments through the ``venv`` module. Since its release it has become clear that very few users have been willing to use this feature, in part due to the lack of an installer present by default inside of the virtual environment. They have instead opted to continue using the ``virtualenv`` package which *does* include pip installed by default.

To make the ``venv`` module more useful to users it will be modified to issue the pip bootstrap by default inside of the new environment while creating it. This will allow people the same convenience inside of the virtual environment as this PEP provides outside of it, as well as bringing the ``venv`` module closer to feature parity with the external ``virtualenv`` package, making it a more suitable replacement. To handle cases where a user does not wish to have pip bootstrapped into their virtual environment a ``--without-pip`` option will be added. The ``--no-download`` option will also be supported, to force the use of the bundled ``pip`` rather than retrieving the latest version from PyPI.

Bundling CA Certificates with CPython
=====================================

The reference ``getpip`` implementation includes the ``pip`` CA bundle along with the rest of pip. This means CPython effectively includes a CA bundle that is used solely for ``getpip``.

This is considered desirable, as it ensures that ``pip`` will behave the same across all supported versions of Python, even those prior to Python 3.4 that cannot access the system certificate store on Windows.

Recommendations for Downstream Distributors
===========================================

A common source of Python installations is downstream distributors such as the various Linux Distributions [#ubuntu]_ [#debian]_ [#fedora]_, OSX package managers [#homebrew]_, or python specific tools [#conda]_. In order to provide a consistent, user friendly experience to all users of Python regardless of how they attained Python, this PEP recommends and asks that downstream distributors:

* Ensure that whenever Python is installed pip is also installed.

  * This may take the form of separate packages with dependencies on each other so that installing the Python package installs the pip package and installing the pip package installs the Python package.

* Do not remove the bundled copy of pip.

  * This is required for offline installation of pip into a virtual environment.
  * This is similar to the existing ``virtualenv`` package for which many downstream distributors have already made exception to the common "debundling" policy.
  * This does mean that if ``pip`` needs to be updated due to a security issue, so does the bundled version in the ``getpip`` bootstrap module.
  * However, altering the bundled version of pip to remove the embedded CA certificate bundle and rely on the system CA bundle instead is a reasonable change.

* Migrate build systems to utilize `pip`_ and `Wheel`_ instead of directly using ``setup.py``.

  * This will ensure that downstream packages can more easily utilize the new metadata formats which may not have a ``setup.py``.

* Ensure that all features of this PEP continue to work with any modifications made.

  * Online installation of the latest version of pip into a global or virtual python environment using ``python -m getpip``.
  * Offline installation of the bundled version of pip into a global or virtual python environment using ``python -m getpip``.
  * ``pip install --upgrade pip`` in a global installation should not affect any already created virtual environments.
  * ``pip install --upgrade pip`` in a virtual environment should not affect the global installation.

Policies & Governance
=====================

The maintainers of the bundled software and the CPython core team will work together in order to address the needs of both. The bundled software will still remain external to CPython and this PEP does not include CPython subsuming the responsibilities or decisions of the bundled software. This PEP aims to decrease the burden on end users wanting to use third party packages and the decisions inside it are pragmatic ones that represent the trust that the Python community has placed in the authors and maintainers of the bundled software.

Backwards Compatibility
-----------------------

The public API of the ``getpip`` module itself will fall under the typical backwards compatibility policy of Python for its standard library. The externally developed software that this PEP bundles does not.

Most importantly, this means that the bundled version of pip may gain new features in CPython maintenance releases, and pip continues to operate on its own 6 month release cycle rather than CPython's 18-24 month cycle.

Security Releases
-----------------

Any security update that affects the ``getpip`` module will be shared prior to release with the PSRT. The PSRT will then decide if the issue inside warrants a security release of Python.

Appendix: Rejected Proposals
============================

Implicit Bootstrap
------------------

`PEP439`_, the predecessor for this PEP, proposes its own solution. Its solution involves shipping a fake ``pip`` command that when executed would implicitly bootstrap and install pip if it does not already exist. This has been rejected because it is too "magical". It hides from the end user when exactly the pip command will be installed or that it is being installed at all. It also does not provide any recommendations or considerations towards downstream packagers who wish to manage the globally installed pip through the mechanisms typical for their system.

Including pip In the Standard Library
-------------------------------------

Similar to this PEP is the proposal of just including pip in the standard library. This would ensure that Python always includes pip and fixes all of the end user facing problems with not having pip present by default. This has been rejected because we've learned through the inclusion and history of ``distutils`` in the standard library that losing the ability to update the packaging tools independently can leave the tooling in a state of constant limbo, making it unable to ever reasonably evolve in a timeframe that actually affects users, as any new features will not be available to the general population for *years*.

Allowing the packaging tools to progress separately from the Python release and adoption schedules allows the improvements to be used by *all* members of the Python community and not just those able to live on the bleeding edge of Python releases.

.. _Wheel: http://www.python.org/dev/peps/pep-0427/
.. _pip: http://www.pip-installer.org
.. _setuptools: https://pypi.python.org/pypi/setuptools
.. _PEP439: http://www.python.org/dev/peps/pep-0439/

References
==========

.. [#ubuntu] `Ubuntu <http://www.ubuntu.com/>`_
.. [#debian] `Debian <http://www.debian.org>`_
.. [#fedora] `Fedora <https://fedoraproject.org/>`_
.. [#homebrew] `Homebrew <http://brew.sh/>`_
.. [#conda] `Conda <http://www.continuum.io/blog/conda>`_

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 15 September 2013 16:33, Donald Stufft <donald@stufft.io> wrote: [...]
``pip`` currently depends on ``setuptools`` to handle metadata generation during the build process, along with some other features. While work is ongoing to reduce or eliminate this dependency, it is not clear if that work will be complete for pip 1.5 (which is the version likely to be bundled with Python 3.4.0).
Might be better worded as "(which is the version likely to be current when Python 3.4.0 is released)".
This PEP proposes that, if pip still requires it, ``setuptools`` will be bundled along with pip itself, and thus installed when running ``python -m getpip``.
Do you mean "thus installed"? Surely the current version from PyPI will be installed except in --no-download cases? Actually, I'd be inclined to avoid the use of the word "bundled". getpip has an "internal copy" of pip, which users aren't supposed to care about except for the fact that it's the "static version included for offline installation", and there's also the "current version on PyPI" which is what is normally installed. I don't think that any of the 3 quoted terms really corresponds to "bundled" in the sense that people will expect it to mean.
Updating the bundled pip
------------------------
[..]
This means that maintenance releases of the CPython installers will include an updated version of the ``getpip`` bootstrap module.
Is it *solely* the copy of pip that will be updated, or will logic changes in getpip itself (I can't immediately think of any that might be added) be included as well?
While the Windows installer was updated in Python 3.3 to make ``python`` available on the PATH,
Correction: it was updated to *optionally* make Python available, and the option is off by default.
no such change was made to include the scripts directory. This PEP proposes that this be changed to also add the scripts directory.
Without this change, the most reliable way to invoke pip on Windows (without tinkering with paths) is actually ``py -m pip`` (or ``py -3 -m pip`` if both Python 2 and 3 are installed) rather than simply calling ``pip``.
Adding the scripts directory to the system PATH would mean that ``pip`` works reliably in the "only one Python installation" case, with ``py -m pip`` needed only for the parallel installation case.
Technically, for the "user only selected the 'put Python on PATH' option for one installation" case... And if the user selected it more than once, they are already in the land of confusing which-comes-first PATH ordering fun, so working out which pip gets run won't be much extra pain :-)
I think it's also still an open question as to whether getpip should be run in --user mode by default (or as an option) as part of the installation process. There are issues around PATH with doing so, as well as possible confusion because pip does not default to --user. I suspect that the probable answer here should be "--user installs are too immature at the moment to have as the default, but we should look at this again later".
But these minor points aside, I'm +1 on this.
Paul
On Sunday, 15 September 2013, Donald Stufft wrote:
So there've been a number of updates to PEP453, so I'm posting it here again for more discussion:
Viewable online at: http://www.python.org/dev/peps/pep-0453/
<snip PEP text>
Hey,
I've been trying hard to follow the discussions, but please forgive this question if it has been hashed out already.
I was a little concerned, despite it being an implementation detail, about bundling a pip which ends up being present somewhere wherever the bootstrap is.
I know the concept of a mini-pip has been abandoned, but I couldn't help thinking about the pip / setuptools as wheels in Nick's roadmap summary. Instead of a copy of pip, could there not be minimal code to simply fetch the wheel and install that or perhaps even use it directly (a la what was possible with eggs)?
Perhaps that is already too much code and there are good reasons not to do this, but I hoped asking would make it clear.
Thanks, Alex J Burke.
On Sep 15, 2013, at 3:56 PM, Alex Burke <alexjeffburke@gmail.com> wrote:
I know the concept of a mini-pip has been abandoned, but I couldn't help thinking about the pip / setuptools as wheels in Nick's roadmap summary. Instead of a copy of pip, could there not be minimal code to simply fetch the wheel and install that or perhaps even use it directly (a la what was possible with eggs)?
Basically three reasons:
- There's a lot of code to handle a variety of situations in pip, it would be much harder to extract this code and keep it up to date in a way that the "mini pip" could use it. It would also not be as battle worn as the pip code.
- On top of the one time cost there is the ongoing cost. As the packaging ecosystem progresses the stdlib implementation will need to be kept up to date as well as the pip code.
- We need to include a copy of pip in order to support offline installs either way. Since we have the private copy there to support offline we might as well use it to handle everything.
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Sun, Sep 15, 2013 at 6:33 PM, Donald Stufft <donald@stufft.io> wrote:
So there've been a number of updates to PEP453, so I'm posting it here again for more discussion:
Just want to say that it is good to see that over three years `pip` has gained strength to be the tool of choice for Python package management. My attempt to propose this three years ago failed, because there was no package management tool people could agree on, even though the proposal was for an interactive user script that could recommend all: https://mail.python.org/pipermail/distutils-sig/2010-March/015894.html
So, why PIP?
-- anatoly t.
On 16 September 2013 18:07, anatoly techtonik <techtonik@gmail.com> wrote:
So, why PIP?
The design of pip is heavily based around addressing various issues with the design of easy_install, while still providing equivalent functionality. The last big missing piece was an updated alternative to the binary egg format that supported FHS compliant installation, and that was addressed in pip 1.4 with the initial version of pip's wheel support. The amicable resolution of the setuptools/distribute split (with distribute merging back into setuptools and development moving to the PyPA account on BitBucket) also makes it easier for python-dev to officially bless pip as the default installation tool, which people can use unless/until they need something with more sophisticated handling of external dependencies (like zc.buildout or conda). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 15 September 2013 16:33, Donald Stufft <donald@stufft.io> wrote:
So there've been a number of updates to PEP453, so I'm posting it here again for more discussion:
Explicit Bootstrapping
======================
An additional module called ``getpip`` will be added to the standard library whose purpose is to install pip and any of its dependencies into the appropriate location (most commonly site-packages). It will expose a single callable named ``bootstrap()`` as well as offer direct execution via ``python -m getpip``. Options for installing it such as index server, installation location (``--user``, ``--root``, etc) will also be available to enable different installation schemes.
It is believed that users will want the most recent versions available to be installed so that they can take advantage of the new advances in packaging. Since any particular version of Python has a much longer staying power than a version of pip in order to satisfy a user's desire to have the most recent version the bootstrap will contact PyPI, find the latest version, download it, and then install it. This process is security sensitive, difficult to get right, and evolves along with the rest of packaging.
Instead of attempting to maintain a "mini pip" for the sole purpose of installing pip the ``getpip`` module will, as an implementation detail, include a private copy of pip and its dependencies which will be used to discover and install pip from PyPI. It is important to stress that this private copy of pip is *only* an implementation detail and it should *not* be relied on or assumed to exist.
Not all users will have network access to PyPI whenever they run the bootstrap. In order to ensure that these users will still be able to bootstrap pip the bootstrap will fallback to simply installing the included copy of pip. The pip ``--no-download`` command line option will be supported to force installation of the bundled version, without even attempting to contact PyPI.
This presents a balance between giving users the latest version of pip, saving them from needing to immediately upgrade pip after bootstrapping it, and allowing the bootstrap to work offline in situations where users might already have packages downloaded that they wish to install.
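For readers following along, the fallback behaviour described in the quoted text amounts to something like the sketch below. It is a minimal illustration under stated assumptions, not the PEP's implementation: the two callables passed to ``bootstrap()`` are hypothetical stand-ins for "install the latest pip from PyPI" and "install the private bundled copy", and treating ``OSError`` as the signal for "network unavailable" is an assumption made for the example.

```python
def bootstrap(install_latest, install_bundled, no_download=False):
    """Prefer the latest pip from PyPI, fall back to the bundled copy.

    install_latest and install_bundled are hypothetical callables supplied
    by the caller; the real getpip would do this work internally.
    Returns a label naming which copy ended up installed.
    """
    if not no_download:
        try:
            install_latest()
            return "latest"
        except OSError:
            # Assumed failure mode: PyPI could not be reached, so fall
            # through to the offline path instead of aborting.
            pass
    install_bundled()
    return "bundled"


def unreachable_pypi():
    # Simulates a machine without network access to PyPI.
    raise OSError("simulated: no route to PyPI")


installed = []
online = bootstrap(lambda: installed.append("latest"),
                   lambda: installed.append("bundled"))
offline = bootstrap(unreachable_pypi,
                    lambda: installed.append("bundled"))
forced = bootstrap(lambda: installed.append("latest"),
                   lambda: installed.append("bundled"),
                   no_download=True)
```

In the ``forced`` case the first callable is never invoked, which is the point of ``--no-download``: the bundled version is installed without even attempting to contact PyPI.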
I still don't understand why this is preferable to just shipping a recent stable pip/setuptools and providing instructions to update post-install. That would surely be a lot simpler not just to implement but for others to understand. If I'm happy to use the bundled version of pip then why do I need to do anything to make it usable after having already run the installer? If I want the new version then why is 'py -m getpip' better than 'py -m pip install -U pip'?
Recommendations for Downstream Distributors
===========================================
A common source of Python installations are through downstream distributors such as the various Linux Distributions [#ubuntu]_ [#debian]_ [#fedora]_, OSX package managers [#homebrew]_, or python specific tools [#conda]_. In order to provide a consistent, user friendly experience to all users of Python regardless of how they attained Python this PEP recommends and asks that downstream distributors:
* Ensure that whenever Python is installed pip is also installed.
* This may take the form of separate packages with dependencies on each other so that installing the Python package installs the pip package and installing the pip package installs the Python package.
* Do not remove the bundled copy of pip.
Are distros really going to be okay with this idea? Many of them have CPython in their base install so you're basically asking that they always ship a parallel package management system that is outside of their control. Personally I think that it's unfortunate that distro package managers don't have a --user option like pip does but I've always assumed that they had some good reason for not wanting any old user to be able to easily install things without admin/root privileges. This would break that arrangement since any user would be able to use 'pip install --user' to install anything from PyPI. I imagine that lots of deployment sites would want to disable this even if the distro has it enabled by default. Oscar
On 16 Sep 2013 19:55, "Oscar Benjamin" <oscar.j.benjamin@gmail.com> wrote:
On 15 September 2013 16:33, Donald Stufft <donald@stufft.io> wrote:
So there've been a number of updates to PEP453, so i'm posting it here again for more discussion:
Explicit Bootstrapping
======================
An additional module called ``getpip`` will be added to the standard library whose purpose is to install pip and any of its dependencies into the appropriate location (most commonly site-packages). It will expose a single callable named ``bootstrap()`` as well as offer direct execution via ``python -m getpip``. Options for installing it such as index server, installation location (``--user``, ``--root``, etc) will also be available to enable different installation schemes.
It is believed that users will want the most recent versions available to be installed so that they can take advantage of the new advances in packaging. Since any particular version of Python has a much longer staying power than a version of pip in order to satisfy a user's desire to have the most recent version the bootstrap will contact PyPI, find the latest version, download it, and then install it. This process is security sensitive, difficult to get right, and evolves along with the rest of packaging.
Instead of attempting to maintain a "mini pip" for the sole purpose of installing pip the ``getpip`` module will, as an implementation detail, include a private copy of pip and its dependencies which will be used to discover and install pip from PyPI. It is important to stress that this private copy of pip is *only* an implementation detail and it should *not* be relied on or assumed to exist.
Not all users will have network access to PyPI whenever they run the bootstrap. In order to ensure that these users will still be able to bootstrap pip the bootstrap will fallback to simply installing the included copy of pip. The pip ``--no-download`` command line option will be supported to force installation of the bundled version, without even attempting to contact PyPI.
This presents a balance between giving users the latest version of pip, saving them from needing to immediately upgrade pip after bootstrapping it, and allowing the bootstrap to work offline in situations where users might already have packages downloaded that they wish to install.
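The selection logic described in the quoted PEP text (latest version from PyPI when reachable, bundled copy otherwise, bundled copy always when ``--no-download`` is given) can be modelled roughly as below. This is only a sketch under stated assumptions: ``choose_pip_version``, ``fetch_latest`` and ``BUNDLED_VERSION`` are hypothetical names invented for illustration, and the version numbers are made up. It is not the reference ``getpip`` implementation.

```python
from typing import Callable

# Hypothetical version of the private pip copy shipped inside getpip.
BUNDLED_VERSION = "1.4.1"


def choose_pip_version(fetch_latest: Callable[[], str],
                       no_download: bool = False) -> str:
    """Return the pip version the bootstrap would end up installing.

    fetch_latest stands in for "contact the index and find the latest
    release"; it is expected to raise OSError when the network is down.
    """
    if no_download:
        # --no-download: never contact the index, use the bundled copy.
        return BUNDLED_VERSION
    try:
        return fetch_latest()
    except OSError:
        # No network access: fall back to the bundled copy.
        return BUNDLED_VERSION


def offline_index() -> str:
    """Simulate an unreachable index server."""
    raise OSError("network unreachable")


if __name__ == "__main__":
    print(choose_pip_version(lambda: "1.5"))                    # online
    print(choose_pip_version(offline_index))                    # offline fallback
    print(choose_pip_version(lambda: "1.5", no_download=True))  # forced bundled copy
```

The point of the shape is that the offline path and the ``--no-download`` path converge on the same bundled copy, so the result differs only in version, not in behaviour.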
I still don't understand why this is preferable to just shipping a recent stable pip/setuptools and providing instructions to update post-install. That would surely be a lot simpler not just to implement but for others to understand.
If I'm happy to use the bundled version of pip then why do I need to do anything to make it usable after having already run the installer? If I want the new version then why is 'py -m getpip' better than 'py -m pip install -U pip'?
You don't, the installer bootstraps it for you. Running it explicitly should only be needed when building from source, or bootstrapping a previously pip-free virtual environment. The complicated bootstrapping dance is both to make pip easy to leave out if people really don't want it and to avoid the CPython platform installers and pip getting into a fight about who is responsible for the files.
Recommendations for Downstream Distributors
===========================================
A common source of Python installations are through downstream distributors such as the various Linux Distributions [#ubuntu]_ [#debian]_ [#fedora]_, OSX package managers [#homebrew]_, or python specific tools [#conda]_. In order to provide a consistent, user friendly experience to all users of Python regardless of how they attained Python this PEP recommends and asks that downstream distributors:
* Ensure that whenever Python is installed pip is also installed.
* This may take the form of separate packages with dependencies on each other so that installing the Python package installs the pip package and installing the pip package installs the Python package.
* Do not remove the bundled copy of pip.
Are distros really going to be okay with this idea? Many of them have CPython in their base install so you're basically asking that they always ship a parallel package management system that is outside of their control.
Fedora is fine with it (we discussed it at Flock last month), while Ubuntu (and Debian?) already splits out a "python-core" package for inclusion on the installation media, so should be fine with having the full python have a circular dependency with pip. Other distros will be free to make their own decisions, but I believe they all already tolerate this bundling approach inside virtualenv. Windows and Mac OS X users are intended as the main beneficiaries though - bootstrapping pip with yum, apt, etc is much easier than bootstrapping it on either of those platforms.
Personally I think that it's unfortunate that distro package managers don't have a --user option like pip does but I've always assumed that they had some good reason for not wanting any old user to be able to easily install things without admin/root privileges. This would break that arrangement since any user would be able to use 'pip install --user' to install anything from PyPI. I imagine that lots of deployment sites would want to disable this even if the distro has it enabled by default.
Yep, but there are plenty of options available to do that, and the better ones work even if users download and install pip directly. Our target audience here is the beginning enthusiast (regardless of OS). Sophisticated users (including admins that want to lock their system down) can still do so if they really want to. Cheers, Nick.
Oscar _______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On 16 September 2013 12:27, Nick Coghlan <ncoghlan@gmail.com> wrote:
You don't, the installer bootstraps it for you. Running it explicitly should only be needed when building from source, or bootstrapping a previously pip-free virtual environment.
Oh okay. So basically the normal thing is that pip just gets installed automatically when you install Python. For most people the whole of the "explicit bootstrapping" described in the PEP is an implementation detail that occurs *implicitly* during installation? The only point of relevance from a user perspective is that running the installer without a network connection leaves you with an older version of pip/setuptools.
The complicated bootstrapping dance is both to make pip easy to leave out if people really don't want it and to avoid the CPython platform installers and pip getting into a fight about who is responsible for the files.
Surely this is only relevant for people using the installers since if you're capable of building CPython from source then you should be plenty capable of installing pip/setuptools from source as well. Likewise if you're installing via a distro package manager then you're not going to use this bootstrapping script. If this is just for the Windows and OSX installers then can they not just have a tickbox for installing the bundled pip and another tickbox for updating it (both on by default)? If you need to update it after installation then you can just use pip to update itself.
Who, apart from the Windows and OSX installers, is going to use this bootstrap script?
When you say that pip and the installer could get into a fight about responsibility do you mean for uninstallation? Presumably if you're uninstalling Python then you also want to uninstall the associated pip installation so it's fine for the installer to just delete everything to do with pip anyway right?
Oscar
On 16 September 2013 22:08, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Who, apart from the Windows and OSX installers, is going to use this bootstrap script?
As noted in the PEP, pyvenv will use it to add pip to virtual environments by default (with an option to turn it off when creating the venv). The only case where we expect getpip *won't* be used is for system installs of Python on Linux distributions (since the system package manager will likely provide it instead).
When you say that pip and the installer could get into a fight about responsibility do you mean for uninstallation? Presumably if you're uninstalling Python then you also want to uninstall the associated pip installation so it's fine for the installer to just delete everything to do with pip anyway right?
No, it's a problem when you install a new maintenance version of CPython over an existing installation. With getpip, we can treat that like any other case of running getpip when pip is already installed. With a bundled pip, the OS installer will try to overwrite the existing installation (which may be a downgrade if you previously used pip to upgrade itself to a more recent version than the bundled one). Separating the two (installer owns getpip, pip owns pip) makes a lot of competing upgrade related issues just go away.
The other aspect is a social one, though. The bootstrapping dance helps make it clearer that only getpip is developed directly under the normal CPython governance model - pip is developed by PyPA/distutils-sig and merely shipped with CPython.
However, I just noticed that the PEP *doesn't* currently specify the effect of running "python -m getpip" when pip is already installed. I believe it should attempt to upgrade pip, so that in the normal course of events (i.e. just installing CPython maintenance releases without manually upgrading pip), then pip will be automatically upgraded to the latest version each time CPython itself is updated.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
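The downgrade problem described above can be illustrated with a small comparison. This is a sketch only: the function names and version numbers are hypothetical, and the "never end up older than what is already installed" rule is an assumption about how an upgrading bootstrap might behave, since the PEP text under discussion does not yet specify it.

```python
from typing import Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn a purely numeric dotted version such as '1.4.1' into a tuple."""
    return tuple(int(part) for part in version.split("."))


def installer_overwrite(installed: Optional[str], bundled: str) -> str:
    """Model a bundled pip owned by the OS installer: files are simply
    overwritten, whatever was there before."""
    return bundled


def bootstrap_upgrade(installed: Optional[str], bundled: str,
                      latest: Optional[str] = None) -> str:
    """Model an upgrading bootstrap: pick the newest of the installed,
    bundled and (if the index was reachable) latest versions."""
    candidates = [v for v in (installed, bundled, latest) if v is not None]
    return max(candidates, key=parse)


if __name__ == "__main__":
    # The user previously upgraded pip to 1.5; the new CPython maintenance
    # release bundles 1.4.1.
    print(installer_overwrite("1.5", "1.4.1"))   # the installer downgrades pip
    print(bootstrap_upgrade("1.5", "1.4.1"))     # the bootstrap leaves 1.5 alone
```

Real pip version strings can carry pre-release or post-release suffixes that this naive ``parse`` does not handle; it is only meant to make the ownership argument concrete.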
On Sep 16, 2013, at 10:15 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
However, I just noticed that the PEP *doesn't* currently specify the effect of running "python -m getpip" when pip is already installed. I believe it should attempt to upgrade pip, so that in the normal course of events (i.e. just installing CPython maintenance releases without manually upgrading pip), then pip will be automatically upgraded to the latest version each time CPython itself is updated.
TBH I just assumed it would say that pip is already bootstrapped, but I'm OK with making it upgrade as well. My thinking was if you already had pip then ``pip install --upgrade pip`` is easier but making the bootstrap upgrade too does mean that installing a maintenance release will upgrade your pip too which is kinda nice. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 17 September 2013 12:21, Donald Stufft <donald@stufft.io> wrote:
TBH I just assumed it would say that pip is already bootstrapped, but I'm OK with making it upgrade as well. My thinking was if you already had pip then ``pip install --upgrade pip`` is easier but making the bootstrap upgrade too does mean that installing a maintenance release will upgrade your pip too which is kinda nice.
Yeah, that's the exact same thought process I went through :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sep 16, 2013, at 8:08 AM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Oh okay. So basically the normal thing is that pip just gets installed automatically when you install Python. For most people the whole of the "explicit bootstrapping" described in the PEP is an implementation detail that occurs *implicitly* during installation? The only point of relevance from a user perspective is that running the installer without a network connection leaves you with an older version of pip/setuptools.
Yes, ideally a user will never have to invoke it manually. However we can't control what the downstream distributors do so for some Linux versions they may have to invoke it (or use system packages if pip is there). It also works as a nice fallback if you somehow end up with pip uninstalled on a system that doesn't have something like apt-get to get it back. And yes by default the online/offline stuff is behind the scenes for the typical user and they'll get the latest version of pip during install that we can locate. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Sep 16, 2013, at 7:27 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Other distros will be free to make their own decisions, but I believe they all already tolerate this bundling approach inside virtualenv.
Going through this thread but just wanted to mention i've heard positive things from a few people involved in FreeBSD too. Not a certain +1 but a general willingness to figure something out. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Hi, Donald Stufft <donald <at> stufft.io> writes:
This is also a matter of starting as we mean to continue: similar to IDLE (see PEP 434), ``getpip`` will be permanently exempted from the "no new features in maintenance releases" restriction, as it will include (and rely on) upgraded versions of ``pip`` even in maintenance releases.
This sounds rather weird. If the whole point of ``getpip`` is for people to get the latest pip version without it being bundled, then why does ``getpip`` itself need to be upgraded in maintenance releases? (barring bug and compatibility fixes, obviously)
The reference ``getpip`` implementation includes the ``pip`` CA bundle along with the rest of pip. This means CPython effectively includes a CA bundle that is used solely for ``getpip``.
This is considered desirable, as it ensures that ``pip`` will behave the same across all supported versions of Python, even those prior to Python 3.4 that cannot access the system certificate store on Windows.
You mean... it ensures that ``getpip`` will behave the same, right?
Policies & Governance
=====================
The maintainers of the bundled software and the CPython core team will work together in order to address the needs of both. The bundled software will still remain external to CPython and this PEP does not include CPython subsuming the responsibilities or decisions of the bundled software.
Ok, to be clear: "bundled software" means the private pip copy, not "getpip" itself, right? Otherwise I'm afraid I don't agree :-) Whichever public API is exposed by the CPython distribution is certainly within the realm and moral responsibility of the CPython core developers.
By the way, the parallel with IDLE here is flawed, because IDLE is a very annex and secondary piece of software that few people care about (which is why IDLE was several times proposed *out* of the stdlib).
For the rest, +1.
Regards Antoine.
On 16 Sep 2013 20:06, "Antoine Pitrou" <antoine@python.org> wrote:
Hi,
Donald Stufft <donald <at> stufft.io> writes:
This is also a matter of starting as we mean to continue: similar to IDLE (see PEP 434), ``getpip`` will be permanently exempted from the "no new features in maintenance releases" restriction, as it will include (and rely on) upgraded versions of ``pip`` even in maintenance releases.
This sounds rather weird. If the whole point of ``getpip`` is for people to get the latest pip version without it being bundled, then why does ``getpip`` itself need to be upgraded in maintenance releases? (barring bug and compatibility fixes, obviously)
Because getpip contains a complete private copy of pip that it installs in the "--no-download" case and otherwise uses to download the latest version. *Technically* you could lock down the getpip shim to prevent feature additions, but I don't see the point in introducing cross-version inconsistencies in maintained versions if we decide the shim should expose more pip features.
The reference ``getpip`` implementation includes the ``pip`` CA bundle along with the rest of pip. This means CPython effectively includes a CA bundle that is used solely for ``getpip``.
This is considered desirable, as it ensures that ``pip`` will behave the same across all supported versions of Python, even those prior to Python 3.4 that cannot access the system certificate store on Windows.
You mean... it ensures that ``getpip`` will behave the same, right?
Both, really. The behaviour of getpip and the behaviour of a "--no-download" pip bootstrap will be the same, since they use the same software.
Policies & Governance
=====================
The maintainers of the bundled software and the CPython core team will work together in order to address the needs of both. The bundled software will still remain external to CPython and this PEP does not include CPython subsuming the responsibilities or decisions of the bundled software.
Ok, to be clear: "bundled software" means the private pip copy, not "getpip" itself, right? Otherwise I'm afraid I don't agree :-) Whichever public API is exposed by the CPython distribution is certainly within the realm and moral responsibility of the CPython core developers.
getpip delegates to the included private copy of pip for the installation process, so it may gain new features in maintenance releases (e.g. signature validation if we get a TUF based system working). This means I'm OK with requiring no backwards incompatible changes to getpip in maintenance releases, I'm not OK with disallowing feature additions (hence the comparison to IDLE).
By the way, the parallel with IDLE here is flawed, because IDLE is a very annex and secondary piece of software that few people care about (which is why IDLE was several times proposed *out* of the stdlib).
If this bundling approach works for pip, it wouldn't surprise me if IDLE migrated to this multi-version bundled application model in 3.5. Cheers, Nick.
Nick Coghlan <ncoghlan <at> gmail.com> writes:
On 16 Sep 2013 20:06, "Antoine Pitrou" <antoine <at> python.org> wrote:
Hi,
Donald Stufft <donald <at> stufft.io> writes:
This is also a matter of starting as we mean to continue: similar to IDLE (see PEP 434), ``getpip`` will be permanently exempted from the "no new features in maintenance releases" restriction, as it will include (and rely on) upgraded versions of ``pip`` even in maintenance releases.
This sounds rather weird. If the whole point of ``getpip`` is for people to get the latest pip version without it being bundled, then why does ``getpip`` itself need to be upgraded in maintenance releases? (barring bug and compatibility fixes, obviously)
Because getpip contains a complete private copy of pip that it installs in the "--no-download" case and otherwise uses to download the latest version. *Technically* you could lock down the getpip shim to prevent feature additions, but I don't see the point in introducing cross-version inconsistencies in maintained versions if we decide the shim should expose more pip features.
Well... Cross-version inconsistencies are the reason we have several maintained versions in the first place. If you upgrade getpip's functionality in maintenance releases, this means someone with Python 2.7.7 won't get the same experience as, e.g., someone with Python 2.7.6 or 2.7.8. It breaks the expectation that maintenance releases are basically substitutable to each other (modulo, of course, bug fixes). It also makes support more complicated for the various Python communities ("wait, which 2.7 version do you have? then your getpip doesn't have that option, you must instead do this and that..."). It will probably also make Python maintenance more difficult for distributors (e.g. Linux distros) which commit to API stability between bugfix versions. Regards Antoine.
On 16 Sep 2013 21:23, "Antoine Pitrou" <antoine@python.org> wrote:
Nick Coghlan <ncoghlan <at> gmail.com> writes:
On 16 Sep 2013 20:06, "Antoine Pitrou" <antoine <at> python.org> wrote:
Hi,
Donald Stufft <donald <at> stufft.io> writes:
This is also a matter of starting as we mean to continue: similar to IDLE (see PEP 434), ``getpip`` will be permanently exempted from the "no new features in maintenance releases" restriction, as it will include (and rely on) upgraded versions of ``pip`` even in maintenance releases.
This sounds rather weird. If the whole point of ``getpip`` is for people to get the latest pip version without it being bundled, then why does ``getpip`` itself need to be upgraded in maintenance releases? (barring bug and compatibility fixes, obviously)
Because getpip contains a complete private copy of pip that it installs in the "--no-download" case and otherwise uses to download the latest version. *Technically* you could lock down the getpip shim to prevent feature additions, but I don't see the point in introducing cross-version inconsistencies in maintained versions if we decide the shim should expose more pip features.
Well... Cross-version inconsistencies are the reason we have several maintained versions in the first place.
If you upgrade getpip's functionality in maintenance releases, this means someone with Python 2.7.7 won't get the same experience as, e.g., someone with Python 2.7.6 or 2.7.8. It breaks the expectation that maintenance releases are basically substitutable to each other (modulo, of course, bug fixes). It also makes support more complicated for the various Python communities ("wait, which 2.7 version do you have? then your getpip doesn't have that option, you must instead do this and that...").
Well, people shouldn't be running getpip manually very often in the first place. The one thing I do *not* want to preclude is security improvements in maintenance releases. Those *may* require visible CLI changes (e.g. a flag to opt in to signature checking). End users should then get the enhanced security automatically most of the time (as the installers and pyvenv pass in the flag), while direct invocations will remain unaltered (as they *won't* pass the new flag). The next Python *feature* release would then make the flag opt-out rather than opt-in. I'm happy to limit the exception to such security enhancements, though, rather than allowing free rein for arbitrary getpip changes in maintenance releases.
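The opt-in/opt-out scheme described above could look something like the following ``argparse`` sketch. Everything here is an assumption made for illustration: the flag names ``--verify-signatures`` and ``--no-verify-signatures`` are hypothetical (no such getpip option is defined in the thread), and ``build_parser`` is not part of any real module.

```python
import argparse


def build_parser(feature_release: bool = False) -> argparse.ArgumentParser:
    """Build a command line parser for a getpip-like tool.

    In a maintenance release the (hypothetical) signature check is opt-in,
    so the default is off; in the next feature release the same flag pair
    exists but the default flips to on.
    """
    parser = argparse.ArgumentParser(prog="getpip")
    parser.add_argument("--verify-signatures", dest="verify_signatures",
                        action="store_true", default=feature_release,
                        help="hypothetical flag: check package signatures")
    parser.add_argument("--no-verify-signatures", dest="verify_signatures",
                        action="store_false", default=feature_release,
                        help="hypothetical flag: skip the signature check")
    return parser


if __name__ == "__main__":
    maintenance = build_parser(feature_release=False)
    # Direct invocation without the flag behaves as before.
    print(maintenance.parse_args([]).verify_signatures)
    # The installers and pyvenv would pass the flag explicitly.
    print(maintenance.parse_args(["--verify-signatures"]).verify_signatures)
```

With this shape, existing direct invocations keep their behaviour across maintenance releases, and only callers that pass the new flag see the change.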
It will probably also make Python maintenance more difficult for distributors (e.g. Linux distros) which commit to API stability between bugfix versions.
If the API additions are limited to opt-in security improvements, they can probably live with it (although, to be honest, while I don't work for the Platform team, it wouldn't surprise me if Red Hat still left pip and getpip out of RHEL and only included it in Red Hat Software Collections, regardless of what our recommendations say). Cheers, Nick.
Nick Coghlan <ncoghlan <at> gmail.com> writes:
Well, people shouldn't be running getpip manually very often in the first place.
The one thing I do *not* want to preclude is security improvements in maintenance releases. Those *may* require visible CLI changes (e.g. a flag to opt in to signature checking). End users should then get the enhanced security automatically most of the time (as the installers and pyvenv pass in the flag), while direct invocations will remain unaltered (as they *won't* pass the new flag).
I definitely agree with this :)
(although, to be honest, while I don't work for the Platform team, it wouldn't surprise me if Red Hat still left pip and getpip out of RHEL and only included it in Red Hat Software Collections, regardless of what our recommendations say).
Yes, I suppose Debian may make the same choice. Distributions like their "minimal" packages :) Regards Antoine.
participants (7): Alex Burke, anatoly techtonik, Antoine Pitrou, Donald Stufft, Nick Coghlan, Oscar Benjamin, Paul Moore