PEP 470, round 4 - Using Multi Repository Support for External to PyPI Package File Hosting
Here’s round 4 of PEP 470, I believe I’ve addressed the comments from the previous thread. I've also tried to clarify the text as well as the motivations better.

You can view this online at: https://www.python.org/dev/peps/pep-0470/

---------------------

PEP: 470
Title: Using Multi Repository Support for External to PyPI Package File Hosting
Version: $Revision$
Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io>,
BDFL-Delegate: Richard Jones <richard@python.org>
Discussions-To: distutils-sig@python.org
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 12-May-2014
Post-History: 14-May-2014, 05-Jun-2014, 03-Oct-2014
Replaces: 438

Abstract
========

This PEP proposes a mechanism for project authors to register with PyPI an external repository where their project's downloads can be located. This information can then be included as part of the simple API so that installers can use it to tell users where the item they are attempting to install is located and what they need to do to enable this additional repository. In addition to adding discovery information to make explicit multiple repositories easy to use, this PEP also deprecates and removes the implicit multiple repository support which currently functions through directly or indirectly linking offsite via the simple API. Finally this PEP also proposes deprecating and removing the functionality added by PEP 438, particularly the additional rel information and the meta tag to indicate the API version.

This PEP *does* not propose mandating that all authors upload their projects to PyPI in order to exist in the index nor does it propose any change to the human facing elements of PyPI.

Rationale
=========

Historically PyPI did not have any method of hosting files nor any method of automatically retrieving installables, it was instead focused on providing a central registry of names, to prevent naming collisions, and as a means of discovery for finding projects to use.
In the course of time setuptools began to scrape these human facing pages, as well as pages linked from those pages, looking for things it could automatically download and install. Eventually this became the "Simple" API which used a similar URL structure however it eliminated any of the extraneous links and information to make the API more efficient. Additionally PyPI grew the ability for a project to upload release files directly to PyPI enabling PyPI to act as a repository in addition to an index.

This gives PyPI two equally important roles that it plays in the Python ecosystem, that of index to enable easy discovery of Python projects and central repository to enable easy hosting, download, and installation of Python projects. Due to the history behind PyPI and the very organic growth it has experienced the lines between these two roles are blurry, and this blurriness has caused confusion for the end users of both of these roles and this has in turn caused ire between people attempting to use PyPI in different capacities, most often when end users want to use PyPI as a repository but the author wants to use PyPI solely as an index.

By moving to using explicit multiple repositories we can make the lines between these two roles much more explicit and remove the "hidden" surprises caused by the current implementation of handling people who do not want to use PyPI as a repository. However simply moving to explicit multiple repositories is a regression in discoverability, and for that reason this PEP adds an extension to the current simple API which will enable easy discovery of the specific repository that a project can be found in.

PEP 438 attempted to solve this issue by allowing projects to explicitly declare if they were using the repository features or not, and if they were not, it had the installers classify the links it found as either "internal", "verifiable external" or "unverifiable external".
PEP 438 was accepted and implemented in pip 1.4 (released on Jul 23, 2013) with the final transition implemented in pip 1.5 (released on Jan 2, 2014).

PEP 438 was successful in bringing about more people to utilize PyPI's repository features, an altogether good thing given the global CDN powering PyPI providing speed ups for a lot of people, however it did so by introducing a new point of confusion and pain for both the end users and the authors.

Why Additional Repositories?
----------------------------

The two common installer tools, pip and easy_install/setuptools, both support the concept of additional locations to search for files to satisfy the installation requirements and have done so for many years. This means that there is no need to "phase" in a new flag or concept and the solution to installing a project from a repository other than PyPI will function regardless of how old (within reason) the end user's installer is. Not only has this concept existed in the Python tooling for some time, but it is a concept that exists across languages and even extending to the OS level with OS package tools almost universally using multiple repository support making it extremely likely that someone is already familiar with the concept.

Additionally, the multiple repository approach is a concept that is useful outside of the narrow scope of allowing projects which wish to be included on the index portion of PyPI but do not wish to utilize the repository portion of PyPI. This includes places where a company may wish to host a repository that contains their internal packages or where a project may wish to have multiple "channels" of releases, such as alpha, beta, release candidate, and final release.

Setting up an external repository is very simple, it can be achieved with nothing more than a filesystem, some files to host, and any web server capable of serving files and generating an automated index of directories (commonly called "autoindex").
This can be as simple as:

::

    $ mkdir -p /var/www/index.example.com/
    $ mkdir -p /var/www/index.example.com/myproject/
    $ mv ~/myproject-1.0.tar.gz /var/www/index.example.com/myproject/
    $ twistd -n web --path /var/www/index.example.com/

Using this additional location within pip is also simple and can be included on a per invocation, per shell, or per user basis. pip 6.0 will also include the ability to configure this on a per virtual environment or per machine basis as well. This can be as simple as:

::

    $ # As a CLI argument
    $ pip install --extra-index-url https://index.example.com/ myproject
    $ # As an environment variable
    $ PIP_EXTRA_INDEX_URL=https://pypi.example.com/ pip install myproject
    $ # With a configuration file (printf is used so that \n becomes a newline)
    $ printf "[global]\nextra-index-url = https://pypi.example.com/\n" > ~/.pip/pip.conf
    $ pip install myproject

Why Not PEP 438 or Similar?
---------------------------

While the additional search location support has existed in pip and setuptools for quite some time, support for PEP 438 has only existed in pip since the 1.4 version, and still has yet to be implemented in setuptools. The design of PEP 438 did mean that users still benefited for projects which did not require external files even with older installers, however for projects which *did* require external files, users are still silently being given either potentially unreliable or, even worse, unsafe files to download. This system is also unique to Python as it arises out of the history of PyPI; this means that it is almost certain that this concept will be foreign to most, if not all users, until they encounter it while attempting to use the Python toolchain.

Additionally, the classification system proposed by PEP 438 has, in practice, turned out to be extremely confusing to end users, so much so that it is a position of this PEP that the situation as it stands is completely untenable.
The common pattern for a user with this system is to attempt to install a project and possibly get an error message (or maybe not, if the project ever uploaded something to PyPI but later switched without removing old files). The user sees that the error message suggests ``--allow-external`` and reissues the command with that flag, most likely getting another error message. This time the message suggests also adding ``--allow-unverified``, so the user issues the command a third time, finally getting the thing they wish to install.

This UX failure exists for several reasons.

1. If pip can locate files at all for a project on the Simple API it will simply use that instead of attempting to locate more. This is generally the right thing to do as attempting to locate more would erase a large part of the benefit of PEP 438. This means that if a project *ever* uploaded a file that matches what the user has requested for install that will be used regardless of how old it is.

2. PEP 438 makes an implicit assumption that most projects would either upload themselves to PyPI or would update themselves to directly linking to release files. While a large number of projects *did* ultimately decide to upload to PyPI, some of them did so only because the UX around PEP 438 was so bad that they felt forced to do so. More concerning however, is the fact that very few projects have opted to directly and safely link to files and instead they still simply link to pages which must be scraped in order to find the actual files, thus rendering the safe variant (``--allow-external``) largely useless.

3. Even if an author wishes to directly link to their files, doing so safely is non-obvious. It requires the inclusion of an MD5 hash (for historical reasons) in the fragment (the part after ``#``) of the URL. If they do not include this then their files will be considered "unverified".

4. PEP 438 takes a security centric view and disallows any form of a global opt in for unverified projects.
   While this is generally a good thing, it creates extremely verbose and repetitive command invocations such as:

   ::

       $ pip install --allow-external myproject --allow-unverified myproject myproject
       $ pip install --allow-all-external --allow-unverified myproject myproject

Multiple Repository/Index Support
=================================

Installers SHOULD implement, or continue to offer, the ability to point the installer at multiple URL locations. The exact mechanism for a user to indicate they wish to use an additional location is left up to each individual implementation.

Additionally the mechanism for discovering an installation candidate when multiple repositories are being used is also up to each individual implementation, however once configured an implementation should not discourage, warn, or otherwise cast a negative light upon the use of a repository simply because it is not the default repository.

Currently both pip and setuptools implement multiple repository support by using the best installation candidate they can find from any of the repositories, essentially treating them as if they were one large repository.

Installers SHOULD also implement some mechanism for removing or otherwise disabling use of the default repository. The exact specifics of how that is achieved are up to each individual implementation.

End users wishing to limit what files they pull from which repository can simply use `devpi <http://doc.devpi.net/latest/>`_ to whitelist projects from PyPI or another repository.

External Index Discovery
========================

One of the problems with using an additional index is one of discovery. Users will not generally be aware that an additional index is required at all, much less where that index can be found. Projects can attempt to convey this information using their description on the PyPI page, however that excludes people who discover their project organically through ``pip search``.
To support projects that wish to externally host their files and to enable users to easily discover what additional indexes are required, PyPI will gain the ability for projects to register external index URLs along with an associated comment for each. These URLs will be made available on the simple page, however they will not be linked or provided in a form that older installers will automatically search.

This ability will take the form of a ``<meta>`` tag. The name of this tag must be set to ``external-repository`` and the content will be a link to the location of the external repository. An optional ``data-description`` attribute will convey any comments or description that the author has provided. An example would look something like:

::

    <meta name="external-repository" content="https://index.example.com/" data-description="Primary Repository">
    <meta name="external-repository" content="https://index.example.com/Ubuntu-14.04/" data-description="Wheels built for Ubuntu 14.04">

When an external repository is added to a project, new uploads will no longer be permitted to that project. However any existing files will simply be hidden from the simple API and the web interface until all of the external repositories are removed, in which case they will be visible again. PyPI MUST warn authors if adding an external repository will hide files and that warning must persist on any of the project management pages for that particular project.

When an installer fetches the simple page for a project, if it finds this additional meta-data and it cannot find any files for that project in its configured URLs then it should use this data to tell the user how to add one or more of the additional URLs to search in. This message should include any comments that the project has included to enable them to communicate to the user and provide hints as to which URL they might want (e.g. if some are only useful or compatible with certain platforms or situations).
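
As a rough illustration of the discovery step described above, the following Python sketch extracts the proposed ``external-repository`` meta tags from a simple page and builds a hint for the user. It is only a sketch under assumptions: the names ``ExternalRepositoryParser``, ``find_external_repositories`` and ``format_hint``, as well as the wording of the hint, are hypothetical and are not taken from pip or setuptools. Only the tag name, the ``content`` and ``data-description`` attributes, and pip's ``--extra-index-url`` option come from the text above.

```python
from html.parser import HTMLParser


class ExternalRepositoryParser(HTMLParser):
    """Collect (url, description) pairs from external-repository meta tags.

    Hypothetical helper, not part of any existing installer.
    """

    def __init__(self):
        super().__init__()
        self.repositories = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") != "external-repository":
            return
        url = attributes.get("content")
        if url:
            # data-description is optional, so fall back to an empty string.
            description = attributes.get("data-description") or ""
            self.repositories.append((url, description))


def find_external_repositories(simple_page_html):
    """Return the external repositories advertised on a simple page."""
    parser = ExternalRepositoryParser()
    parser.feed(simple_page_html)
    parser.close()
    return parser.repositories


def format_hint(project, repositories):
    """Build an (assumed) message telling the user which indexes to add."""
    lines = [
        "Could not find any files for %s in the configured indexes." % project,
        "The project lists the following external repositories:",
    ]
    for url, description in repositories:
        suffix = " (%s)" % description if description else ""
        lines.append("  %s%s" % (url, suffix))
        lines.append("    pip install --extra-index-url %s %s" % (url, project))
    return "\n".join(lines)
```

An installer would only call something like ``format_hint`` after it has failed to find any files for the project in its configured URLs. The sketch deliberately never fetches from the advertised URLs itself, since the point of the proposal is that the user opts in explicitly.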
When an installer has implemented the auto discovery mechanisms it should also deprecate any of the mechanisms added for PEP 438 (such as ``--allow-external``) for removal at the end of the deprecation period proposed by the PEP.

This feature *must* be added to PyPI prior to starting the deprecation and removal process for the implicit offsite hosting functionality.

Deprecation and Removal of Link Spidering
=========================================

A new hosting mode will be added to PyPI. This hosting mode will be called ``pypi-only`` and will be in addition to the three that PEP 438 has already given us which are ``pypi-explicit``, ``pypi-scrape``, ``pypi-scrape-crawl``. This new hosting mode will modify a project's simple API page so that it only lists the files which are directly hosted on PyPI and will not link to anything else.

Upon acceptance of this PEP and the addition of the ``pypi-only`` mode, all new projects will be defaulted to the PyPI only mode and they will be locked to this mode and unable to change this particular setting. ``pypi-only`` projects will still be able to register external index URLs as described above - the "pypi-only" refers only to the download links that are published directly on PyPI.

An email will then be sent out to all of the projects which are hosted only on PyPI informing them that in one month their project will be automatically converted to the ``pypi-only`` mode. A month after these emails have been sent, any of those projects which were emailed and which are still hosted only on PyPI will have their mode set to ``pypi-only``.

After that switch, an email will be sent to projects which rely on hosting external to PyPI. This email will warn these projects that externally hosted files have been deprecated on PyPI and that 6 months from the time of that email all external links will be removed from the installer APIs.
This email *must* include instructions for converting their projects to be hosted on PyPI and *must* include links to a script or package that will enable them to enter their PyPI credentials and package name and have it automatically download and re-host all of their files on PyPI. This email *must also* include instructions for setting up their own index page and registering that with PyPI, including the fact that they can use pythonhosted.org as a host for an index page without requiring them to host any additional infrastructure or purchase a TLS certificate. This email must also contain a link to the Terms of Service for PyPI as many users may have signed up a long time ago and may not recall what those terms are.

Five months after the initial email, another email must be sent to any projects still relying on external hosting. This email will include all of the same information that the first email contained, except that the removal date will be one month away instead of six.

Finally a month later all projects will be switched to the ``pypi-only`` mode and PyPI will be modified to remove the externally linked files functionality. At this point in time any installers should finally remove any of the deprecated PEP 438 functionality such as ``--allow-external`` and ``--allow-unverified`` in pip.

Impact
======

The largest impact of this is going to be projects where the maintainers are no longer maintaining the project, for one reason or another. For these projects it's unlikely that a maintainer will arrive to set the external index metadata which would allow the auto discovery mechanism to find it.

Looking at the numbers, factoring out PIL (which has been special cased below), the actual impact should be quite low, with it affecting just 3.8% of projects which host any files only externally or 2.2% which have their latest version hosted only externally.

6674 unique IP addresses have accessed the Simple API for these 3.8% of projects in a single day (2014-09-30).
Of those, 99.5% of them installed something which could not be verified, and thus they were open to a Remote Code Execution via a Man-In-The-Middle attack, while 7.9% installed something which could be verified and only 0.4% only installed things which could be verified.

Projects Which Rely on Externally Hosted files
----------------------------------------------

This is determined by crawling the simple index and looking for installable files using a similar detection method as pip and setuptools use. The "latest" version is determined using ``pkg_resources.parse_version`` sort order and it is used to show whether or not the latest version is hosted externally or only old versions are.

============ ======= ================ =================== =======
\            PyPI    External (old)   External (latest)   Total
============ ======= ================ =================== =======
**Safe**     43313   16               39                  43368
**Unsafe**   0       756              1092                1848
**Total**    43313   772              1131                45216
============ ======= ================ =================== =======

Top Externally Hosted Projects by Requests
------------------------------------------

This is determined by looking at the number of requests the ``/simple/<project>/`` page had gotten in a single day. The total number of requests during that day was 10,623,831.

============================== ========
Project                        Requests
============================== ========
PIL                            63869
Pygame                         2681
mysql-connector-python         1562
pyodbc                         724
elementtree                    635
salesforce-python-toolkit      316
wxPython                       295
PyXML                          251
RBTools                        235
python-graph-core              123
cElementTree                   121
============================== ========

Top Externally Hosted Projects by Unique IPs
--------------------------------------------

This is determined by looking at the IP addresses of requests the ``/simple/<project>/`` page had gotten in a single day. The total number of unique IP addresses during that day was 124,604.

============================== ==========
Project                        Unique IPs
============================== ==========
PIL                            4553
mysql-connector-python         462
Pygame                         202
pyodbc                         181
elementtree                    166
wxPython                       126
RBTools                        114
PyXML                          87
salesforce-python-toolkit      76
pyDes                          76
============================== ==========

PIL
---

It's obvious from the numbers above that the vast bulk of the impact comes from the PIL project. On 2014-05-17 an email was sent to the contact for PIL inquiring whether or not they would be willing to upload to PyPI. A response has not been received as of yet (2014-10-03) nor has any change in the hosting happened. Due to the popularity of PIL this PEP also proposes that during the deprecation period PyPI Administrators will set the PIL download URL as the external index for that project, allowing the users of PIL to take advantage of the auto discovery mechanisms although the project has seemingly become unmaintained.

Rejected Proposals
==================

Keep the current classification system but adjust the options
-------------------------------------------------------------

This PEP rejects several related proposals which attempt to fix some of the usability problems with the current system while still keeping the general gist of PEP 438.

This includes:

* Default to allowing safely externally hosted files, but disallow unsafely hosted.

* Default to disallowing safely externally hosted files with only a global flag to enable them, but disallow unsafely hosted.

* Continue on the suggested path of PEP 438 and remove the option to unsafely host externally but continue to allow the option to safely host externally.

These proposals are rejected because:

* The classification system introduced in PEP 438 is an entirely unique concept to PyPI which is not generically applicable even in the context of Python packaging. Adding additional concepts comes at a cost.

* The classification system itself is non-obvious to explain and to pre-determine what classification of link a project will require entails inspecting the project's ``/simple/<project>/`` page, and possibly any URLs linked from that page.

* The ability to host externally while still being linked for automatic discovery is mostly a historic relic which causes a fair amount of pain and complexity for little reward.

* The installer's ability to optimize or clean up the user interface is limited due to the nature of the implicit link scraping which would need to be done. This extends to the ``--allow-*`` options as well as the inability to determine if a link is expected to fail or not.

* The mechanism paints with a very broad brush when enabling an option, although PEP 438 attempts to limit this with per package options. However, a project that has existed for an extended period of time may often have several different URLs listed in its simple index. It is not unusual for at least one of these to no longer be under control of the project. While an unregistered domain will sit there relatively harmlessly most of the time, pip will continue to attempt to install from it on every discovery phase. This means that an attacker simply needs to look at projects which rely on unsafe external URLs and register expired domains to attack users.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Hi Donald,

i could only briefly glance over the new draft. I am still not in favor of the PEP because it forces backward-incompatible changes and work on various sides for not enough gain. Particularly end users will see previously working commands now fail and if they run a new enough pip version they get a hint on how to manually fix it. In the longer term you argue that people will appreciate reusing the concept of dealing with multiple repositories as known with Linux Distros. And that PEP470 would eventually simplify the user interface of pip and the implementation of pypi a bit.

I'll see next week if i can come up with suggestions how to reduce friction introduced by the PEP while maintaining the benefits. And probably with a few questions because i didn't understand all details yet i think.

best,
holger

On Fri, Oct 03, 2014 at 02:05 -0400, Donald Stufft wrote:
Here’s round 4 of PEP 470, I believe I’ve addressed the comments from the previous thread. I've also tried to clarify the text as well as the motivations better.
You can view this online at: https://www.python.org/dev/peps/pep-0470/
---------------------
PEP: 470
Title: Using Multi Repository Support for External to PyPI Package File Hosting
Version: $Revision$
Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io>,
BDFL-Delegate: Richard Jones <richard@python.org>
Discussions-To: distutils-sig@python.org
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 12-May-2014
Post-History: 14-May-2014, 05-Jun-2014, 03-Oct-2014
Replaces: 438

Abstract
========

This PEP proposes a mechanism for project authors to register with PyPI an external repository where their project's downloads can be located. This information can then be included as part of the simple API so that installers can use it to tell users where the item they are attempting to install is located and what they need to do to enable this additional repository. In addition to adding discovery information to make explicit multiple repositories easy to use, this PEP also deprecates and removes the implicit multiple repository support which currently functions through directly or indirectly linking offsite via the simple API. Finally this PEP also proposes deprecating and removing the functionality added by PEP 438, particularly the additional rel information and the meta tag to indicate the API version.
This PEP *does* not propose mandating that all authors upload their projects to PyPI in order to exist in the index nor does it propose any change to the human facing elements of PyPI.
Rationale
=========
Historically PyPI did not have any method of hosting files nor any method of automatically retrieving installables, it was instead focused on providing a central registry of names, to prevent naming collisions, and as a means of discovery for finding projects to use. In the course of time setuptools began to scrape these human facing pages, as well as pages linked from those pages, looking for things it could automatically download and install. Eventually this became the "Simple" API which used a similar URL structure however it eliminated any of the extraneous links and information to make the API more efficient. Additionally PyPI grew the ability for a project to upload release files directly to PyPI enabling PyPI to act as a repository in addition to an index.
This gives PyPI two equally important roles that it plays in the Python ecosystem, that of index to enable easy discovery of Python projects and central repository to enable easy hosting, download, and installation of Python projects. Due to the history behind PyPI and the very organic growth it has experienced the lines between these two roles are blurry, and this blurriness has caused confusion for the end users of both of these roles and this has in turn caused ire between people attempting to use PyPI in different capacities, most often when end users want to use PyPI as a repository but the author wants to use PyPI solely as an index.

By moving to using explicit multiple repositories we can make the lines between these two roles much more explicit and remove the "hidden" surprises caused by the current implementation of handling people who do not want to use PyPI as a repository. However simply moving to explicit multiple repositories is a regression in discoverability, and for that reason this PEP adds an extension to the current simple API which will enable easy discovery of the specific repository that a project can be found in.
PEP 438 attempted to solve this issue by allowing projects to explicitly declare if they were using the repository features or not, and if they were not, it had the installers classify the links it found as either "internal", "verifiable external" or "unverifiable external". PEP 438 was accepted and implemented in pip 1.4 (released on Jul 23, 2013) with the final transition implemented in pip 1.5 (released on Jan 2, 2014).
PEP 438 was successful in bringing about more people to utilize PyPI's repository features, an altogether good thing given the global CDN powering PyPI providing speed ups for a lot of people, however it did so by introducing a new point of confusion and pain for both the end users and the authors.
Why Additional Repositories?
----------------------------

The two common installer tools, pip and easy_install/setuptools, both support the concept of additional locations to search for files to satisfy the installation requirements and have done so for many years. This means that there is no need to "phase" in a new flag or concept and the solution to installing a project from a repository other than PyPI will function regardless of how old (within reason) the end user's installer is. Not only has this concept existed in the Python tooling for some time, but it is a concept that exists across languages and even extending to the OS level with OS package tools almost universally using multiple repository support making it extremely likely that someone is already familiar with the concept.
Additionally, the multiple repository approach is a concept that is useful outside of the narrow scope of allowing projects which wish to be included on the index portion of PyPI but do not wish to utilize the repository portion of PyPI. This includes places where a company may wish to host a repository that contains their internal packages or where a project may wish to have multiple "channels" of releases, such as alpha, beta, release candidate, and final release.
Setting up an external repository is very simple, it can be achieved with nothing more than a filesystem, some files to host, and any web server capable of serving files and generating an automated index of directories (commonly called "autoindex"). This can be as simple as:
::
    $ mkdir -p /var/www/index.example.com/
    $ mkdir -p /var/www/index.example.com/myproject/
    $ mv ~/myproject-1.0.tar.gz /var/www/index.example.com/myproject/
    $ twistd -n web --path /var/www/index.example.com/

Using this additional location within pip is also simple and can be included on a per invocation, per shell, or per user basis. pip 6.0 will also include the ability to configure this on a per virtual environment or per machine basis as well. This can be as simple as:
::
    $ # As a CLI argument
    $ pip install --extra-index-url https://index.example.com/ myproject
    $ # As an environment variable
    $ PIP_EXTRA_INDEX_URL=https://pypi.example.com/ pip install myproject
    $ # With a configuration file (printf is used so that \n becomes a newline)
    $ printf "[global]\nextra-index-url = https://pypi.example.com/\n" > ~/.pip/pip.conf
    $ pip install myproject

Why Not PEP 438 or Similar?
---------------------------

While the additional search location support has existed in pip and setuptools for quite some time, support for PEP 438 has only existed in pip since the 1.4 version, and still has yet to be implemented in setuptools. The design of PEP 438 did mean that users still benefited for projects which did not require external files even with older installers, however for projects which *did* require external files, users are still silently being given either potentially unreliable or, even worse, unsafe files to download. This system is also unique to Python as it arises out of the history of PyPI; this means that it is almost certain that this concept will be foreign to most, if not all users, until they encounter it while attempting to use the Python toolchain.

Additionally, the classification system proposed by PEP 438 has, in practice, turned out to be extremely confusing to end users, so much so that it is a position of this PEP that the situation as it stands is completely untenable. The common pattern for a user with this system is to attempt to install a project and possibly get an error message (or maybe not, if the project ever uploaded something to PyPI but later switched without removing old files). The user sees that the error message suggests ``--allow-external`` and reissues the command with that flag, most likely getting another error message. This time the message suggests also adding ``--allow-unverified``, so the user issues the command a third time, finally getting the thing they wish to install.
This UX failure exists for several reasons.
1. If pip can locate files at all for a project on the Simple API it will simply use that instead of attempting to locate more. This is generally the right thing to do as attempting to locate more would erase a large part of the benefit of PEP 438. This means that if a project *ever* uploaded a file that matches what the user has requested for install that will be used regardless of how old it is.
2. PEP 438 makes an implicit assumption that most projects would either upload themselves to PyPI or would update themselves to directly link to release files. While a large number of projects *did* ultimately decide to upload to PyPI, some of them did so only because the UX around PEP 438 was so bad that they felt forced to do so. More concerning, however, is the fact that very few projects have opted to directly and safely link to files; instead they still simply link to pages which must be scraped in order to find the actual files, thus rendering the safe variant (``--allow-external``) largely useless.
3. Even if an author wishes to directly link to their files, doing so safely is non-obvious. It requires the inclusion of an MD5 hash (for historical reasons) in the fragment (the part after ``#``) of the URL. If they do not include this then their files will be considered "unverified".
4. PEP 438 takes a security centric view and disallows any form of a global opt in for unverified projects. While this is generally a good thing, it creates extremely verbose and repetitive command invocations such as:
::

    $ pip install --allow-external myproject --allow-unverified myproject myproject
    $ pip install --allow-all-external --allow-unverified myproject myproject
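Point 3 above is easier to see with a concrete link. The following is only a rough sketch, assuming the ``#md5=<hexdigest>`` fragment form used for links on simple pages; the helper names and the example URL are made up for illustration and are not part of pip or PyPI.

```python
import hashlib


def verified_link(file_url: str, payload: bytes) -> str:
    """Return file_url with an md5 fragment appended (hypothetical helper)."""
    digest = hashlib.md5(payload).hexdigest()
    return f"{file_url}#md5={digest}"


def fragment_matches(link: str, payload: bytes) -> bool:
    """Check downloaded bytes against the md5 fragment of a link, if any."""
    _, _, fragment = link.partition("#")
    algorithm, _, expected = fragment.partition("=")
    if algorithm != "md5" or not expected:
        # No usable hash in the fragment: such a file counts as "unverified".
        return False
    return hashlib.md5(payload).hexdigest() == expected
```

An author would publish the string returned by ``verified_link`` as the download link; a link without the fragment is what ends up in the "unverified" bucket.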
Multiple Repository/Index Support
=================================
Installers SHOULD implement, or continue to offer, the ability to point the installer at multiple URL locations. The exact mechanism for a user to indicate they wish to use an additional location is left up to each individual implementation.
Additionally, the mechanism for discovering an installation candidate when multiple repositories are being used is also up to each individual implementation; however, once configured, an implementation should not discourage, warn, or otherwise cast a negative light upon the use of a repository simply because it is not the default repository.
Currently both pip and setuptools implement multiple repository support by using the best installation candidate they can find from either repository, essentially treating them as if they were one large repository.
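As a rough illustration of the "one large repository" behaviour, the sketch below pools the versions found on each configured index and picks the highest. It is not how pip or setuptools are actually implemented: the function names are hypothetical, and the version key only understands dotted integers.

```python
from typing import Dict, List, Optional, Tuple

Candidate = Tuple[str, str]  # (version string, index URL it was found on)


def version_key(version: str) -> Tuple[int, ...]:
    """Simplified sort key; assumes versions are dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def best_candidate(indexes: Dict[str, List[str]]) -> Optional[Candidate]:
    """Pool the versions from every index and return the highest one."""
    pooled = [
        (version, url)
        for url, versions in indexes.items()
        for version in versions
    ]
    if not pooled:
        return None
    return max(pooled, key=lambda candidate: version_key(candidate[0]))
```

Note that nothing in the selection cares which index a candidate came from, which is the property the paragraph above describes.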
Installers SHOULD also implement some mechanism for removing or otherwise disabling use of the default repository. The exact specifics of how that is achieved are up to each individual implementation.
End users wishing to limit what files they pull from which repository can simply use `devpi <http://doc.devpi.net/latest/>`_ to whitelist projects from PyPI or another repository.
External Index Discovery
========================
One of the problems with using an additional index is one of discovery. Users will not generally be aware that an additional index is required at all, much less where that index can be found. Projects can attempt to convey this information using their description on the PyPI page; however, that excludes people who discover their project organically through ``pip search``.
To support projects that wish to externally host their files and to enable users to easily discover what additional indexes are required, PyPI will gain the ability for projects to register external index URLs along with an associated comment for each. These URLs will be made available on the simple page; however, they will not be linked or provided in a form that older installers would automatically search.
This ability will take the form of a ``<meta>`` tag. The name of this tag must be set to ``external-repository`` and the content will be a link to the location of the external repository. An optional data-description attribute will convey any comments or description that the author has provided.
An example would look something like:
::

    <meta name="external-repository" content="https://index.example.com/" data-description="Primary Repository">
    <meta name="external-repository" content="https://index.example.com/Ubuntu-14.04/" data-description="Wheels built for Ubuntu 14.04">
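To give a feel for how an installer might read these tags, here is a minimal sketch using only the standard library's ``html.parser``. The class and function names are illustrative assumptions; this PEP does not prescribe any particular parsing approach.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class ExternalRepositoryParser(HTMLParser):
    """Collect (url, description) pairs from external-repository meta tags."""

    def __init__(self) -> None:
        super().__init__()
        self.repositories: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") != "external-repository":
            return
        url = attributes.get("content")
        if url:
            # data-description is optional, so fall back to an empty string.
            description = attributes.get("data-description") or ""
            self.repositories.append((url, description))


def find_external_repositories(page: str) -> List[Tuple[str, str]]:
    """Parse a simple page and return the repositories it advertises."""
    parser = ExternalRepositoryParser()
    parser.feed(page)
    parser.close()
    return parser.repositories
```

Feeding it a page containing the two example tags above would yield two pairs, in document order, each with its URL and description.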
When an external repository is added to a project, new uploads will no longer be permitted to that project. However any existing files will simply be hidden from the simple API and the web interface until all of the external repositories are removed, in which case they will be visible again. PyPI MUST warn authors if adding an external repository will hide files and that warning must persist on any of the project management pages for that particular project.
When an installer fetches the simple page for a project, if it finds this additional metadata and it cannot find any files for that project in its configured URLs, then it should use this data to tell the user how to add one or more of the additional URLs to search in. This message should include any comments that the project has included to enable them to communicate to the user and provide hints as to which URL they might want (e.g. if some are only useful or compatible with certain platforms or situations). When an installer has implemented the auto discovery mechanisms, it should also deprecate any of the mechanisms added for PEP 438 (such as ``--allow-external``) for removal at the end of the deprecation period proposed by the PEP.
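The message itself is left to each installer. Purely as an illustration, a hint could be assembled along these lines; the function name and the exact wording are assumptions, and ``--extra-index-url`` is used only because it is the pip option shown earlier in this PEP.

```python
from typing import Iterable, Tuple


def external_repository_hint(
    project: str, repositories: Iterable[Tuple[str, str]]
) -> str:
    """Build a user-facing hint from (url, description) pairs (hypothetical)."""
    lines = [f"No downloads for package '{project}' found."]
    repositories = list(repositories)
    if repositories:
        lines.append(f"{project} is hosted at the following repositories:")
        for url, description in repositories:
            # Show the author's comment first so users can pick the right URL.
            label = f"{description} - " if description else ""
            lines.append(f"    {label}{url}")
        lines.append(
            "Use --extra-index-url to add the repository you wish to use."
        )
    return "\n".join(lines)
```

When a project has registered no external repositories, the hint collapses to the plain "no downloads" error.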
This feature *must* be added to PyPI prior to starting the deprecation and removal process for the implicit offsite hosting functionality.
Deprecation and Removal of Link Spidering
=========================================
A new hosting mode will be added to PyPI. This hosting mode will be called ``pypi-only`` and will be in addition to the three that PEP 438 has already given us, which are ``pypi-explicit``, ``pypi-scrape``, and ``pypi-scrape-crawl``. This new hosting mode will modify a project's simple API page so that it only lists the files which are directly hosted on PyPI and will not link to anything else.
Upon acceptance of this PEP and the addition of the ``pypi-only`` mode, all new projects will be defaulted to the PyPI only mode and they will be locked to this mode and unable to change this particular setting. ``pypi-only`` projects will still be able to register external index URLs as described above - the "pypi-only" refers only to the download links that are published directly on PyPI.
An email will then be sent out to all of the projects which are hosted only on PyPI informing them that in one month their project will be automatically converted to the ``pypi-only`` mode. A month after these emails have been sent any of those projects which were emailed, which still are hosted only on PyPI will have their mode set to ``pypi-only``.
After that switch, an email will be sent to projects which rely on hosting external to PyPI. This email will warn these projects that externally hosted files have been deprecated on PyPI and that in 6 months from the time of that email that all external links will be removed from the installer APIs. This email *must* include instructions for converting their projects to be hosted on PyPI and *must* include links to a script or package that will enable them to enter their PyPI credentials and package name and have it automatically download and re-host all of their files on PyPI. This email *must also* include instructions for setting up their own index page and registering that with PyPI, including the fact that they can use pythonhosted.org as a host for an index page without requiring them to host any additional infrastructure or purchase a TLS certificate. This email must also contain a link to the Terms of Service for PyPI as many users may have signed up a long time ago and may not recall what those terms are.
Five months after the initial email, another email must be sent to any projects still relying on external hosting. This email will include all of the same information that the first email contained, except that the removal date will be one month away instead of six.
Finally a month later all projects will be switched to the ``pypi-only`` mode and PyPI will be modified to remove the externally linked files functionality. At this point in time any installers should finally remove any of the deprecated PEP 438 functionality such as ``--allow-external`` and ``--allow-unverified`` in pip.
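The schedule described in this section can be laid out mechanically. The sketch below is only an aid for reading the timeline: the start date is whatever day the first emails go out, the helper names are made up, and "one month" is interpreted as the same day of the following month, clamped to the end of shorter months.

```python
import calendar
from datetime import date
from typing import Dict


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def deprecation_schedule(first_email: date) -> Dict[str, date]:
    """Map each step of the removal process to its date (illustrative)."""
    external_warning = add_months(first_email, 1)
    return {
        "email projects hosted only on PyPI": first_email,
        "convert those projects; warn externally hosted projects": external_warning,
        "second warning to externally hosted projects": add_months(external_warning, 5),
        "all projects switched to pypi-only": add_months(external_warning, 6),
    }
```

Under this reading, the whole process takes seven months from the first email to the final switch.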
Impact
======
The largest impact of this is going to be projects where the maintainers are no longer maintaining the project, for one reason or another. For these projects it's unlikely that a maintainer will arrive to set the external index metadata which would allow the auto discovery mechanism to find it.
Looking at the numbers, factoring out PIL (which has been special cased below), the actual impact should be quite low, with it affecting just 3.8% of projects which host any files only externally or 2.2% which have their latest version hosted only externally.
6674 unique IP addresses have accessed the Simple API for these 3.8% of projects in a single day (2014-09-30). Of those, 99.5% of them installed something which could not be verified, and thus they were open to a Remote Code Execution via a Man-In-The-Middle attack, while 7.9% installed something which could be verified and only 0.4% only installed things which could be verified.
Projects Which Rely on Externally Hosted files
----------------------------------------------
This is determined by crawling the simple index and looking for installable files using a similar detection method as pip and setuptools use. The "latest" version is determined using ``pkg_resources.parse_version`` sort order and it is used to show whether or not the latest version is hosted externally or only old versions are.
============ ======= ================ =================== =======
\            PyPI    External (old)   External (latest)   Total
============ ======= ================ =================== =======
**Safe**     43313   16               39                  43368
**Unsafe**   0       756              1092                1848
**Total**    43313   772              1131                45216
============ ======= ================ =================== =======
Top Externally Hosted Projects by Requests
------------------------------------------
This is determined by looking at the number of requests the ``/simple/<project>/`` page had gotten in a single day. The total number of requests during that day was 10,623,831.
============================== ========
Project                        Requests
============================== ========
PIL                            63869
Pygame                         2681
mysql-connector-python         1562
pyodbc                         724
elementtree                    635
salesforce-python-toolkit      316
wxPython                       295
PyXML                          251
RBTools                        235
python-graph-core              123
cElementTree                   121
============================== ========
Top Externally Hosted Projects by Unique IPs
--------------------------------------------
This is determined by looking at the IP addresses of requests the ``/simple/<project>/`` page had gotten in a single day. The total number of unique IP addresses during that day was 124,604.
============================== ==========
Project                        Unique IPs
============================== ==========
PIL                            4553
mysql-connector-python         462
Pygame                         202
pyodbc                         181
elementtree                    166
wxPython                       126
RBTools                        114
PyXML                          87
salesforce-python-toolkit      76
pyDes                          76
============================== ==========
PIL
---
It's obvious from the numbers above that the vast bulk of the impact comes from the PIL project. On 2014-05-17 an email was sent to the contact for PIL inquiring whether or not they would be willing to upload to PyPI. A response has not been received as of yet (2014-10-03), nor has any change in the hosting happened. Due to the popularity of PIL, this PEP also proposes that during the deprecation period PyPI Administrators will set the PIL download URL as the external index for that project, allowing the users of PIL to take advantage of the auto discovery mechanisms although the project has seemingly become unmaintained.
Rejected Proposals
==================
Keep the current classification system but adjust the options
-------------------------------------------------------------
This PEP rejects several related proposals which attempt to fix some of the usability problems with the current system while still keeping the general gist of PEP 438.
This includes:
* Default to allowing safely externally hosted files, but disallow unsafely hosted.

* Default to disallowing safely externally hosted files with only a global flag to enable them, but disallow unsafely hosted.

* Continue on the suggested path of PEP 438 and remove the option to unsafely host externally but continue to allow the option to safely host externally.
These proposals are rejected because:
* The classification system introduced in PEP 438 is an entirely unique concept to PyPI which is not generically applicable even in the context of Python packaging. Adding additional concepts comes at a cost.
* The classification system itself is non-obvious to explain, and pre-determining what classification of link a project will require entails inspecting the project's ``/simple/<project>/`` page, and possibly any URLs linked from that page.
* The ability to host externally while still being linked for automatic discovery is mostly a historic relic which causes a fair amount of pain and complexity for little reward.
* The installer's ability to optimize or clean up the user interface is limited due to the nature of the implicit link scraping which would need to be done. This extends to the ``--allow-*`` options as well as the inability to determine if a link is expected to fail or not.
* The mechanism paints with a very broad brush when enabling an option, although PEP 438 attempts to limit this with per-package options. However, a project that has existed for an extended period of time may often have several different URLs listed in its simple index. It is not unusual for at least one of these to no longer be under the control of the project. While an unregistered domain will sit there relatively harmlessly most of the time, pip will continue to attempt to install from it on every discovery phase. This means that an attacker simply needs to look at projects which rely on unsafe external URLs and register expired domains to attack users.
Copyright
=========
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On Oct 3, 2014, at 4:06 AM, holger krekel <holger@merlinux.eu> wrote:
Hi Donald,
i could just only briefly glimpse over the new draft. I am still not in favor of the PEP because it forces backward-incompatible changes and work on various sides for not enough gain. Particularly end users will see previously working commands now fail and if they run a new enough pip version they get a hint on how to manually fix it.
In the longer term you argue that people will appreciate reusing the concept of dealing with multiple repositories as known with Linux Distros. And that PEP470 would eventually simplify the user interface of pip and the implementation of pypi a bit.
I'll see next week if i can come up with suggestions how to reduce friction introduced by the PEP while maintaining the benefits. And probably with a few questions because i didn't understand all details yet i think.
Mostly I think that the current situation is the worst possible implementation of handling offsite hosting that I can think of. I don't believe any person would design this system from scratch and the only reason it exists is because of lots of small decisions over the course of history.

I think that the current implementation does a disservice to everyone involved. I think that it hurts end users because they end up most likely insecure, with tooling that cannot accurately report errors, and a horrible user interface. I think that it hurts authors, even those that want to rely on this feature, because it positions them in a case where it either implicitly breaks the expectations of end users who view PyPI mainly as a repository or it explicitly causes them pain in attempting to use and decipher the flags.

It has been suggested that the changes pip made in implementing PEP 438 were done in an attempt to make not hosting on PyPI so painful that authors would be forced to host on PyPI.

I think that backwards compatibility is important, however I also think that it's important to break backwards compatibility when it makes sense to do so. The current situation is such that attempting to preserve backwards compatibility has made things worse for everyone involved whenever a project wishes to do anything but host their projects on PyPI.

As far as simplification goes, I don't believe it simplifies the implementation of PyPI at all, it just shuffles things around and creates work on my part in order to get PyPI supporting the new stuff. It does however let installers become simpler and it enables installers to present accurate error information that actually helps determine the root cause of a failure instead of the current silent failure with a confusing error message model.

I look forward to your suggestions, but I'm not hopeful. I've been thus far unable to determine a way to improve the current solution in a way that isn't just papering over one problem without solving the fundamental issue.

--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 3 October 2014 22:02, Donald Stufft <donald@stufft.io> wrote:
As far as simplification goes, I don't believe it simplifies the implementation of PyPI at all, it just shuffles things around and creates work on my part in order to get PyPI supporting the new stuff. It does however let installers become simpler and it enables installers to present accurate error information that actually helps determine the root cause of a failure instead of the current silent failure with a confusing error message model.
I look forward to your suggestions, but I'm not hopeful. I've been thus far unable to determine a way to improve the current solution in a way that isn't just papering over one problem without solving the fundamental issue.
Donald's perspective here matches my own.

I'll be interested to hear alternative proposals, but they should aim to address at least the following user experience expectations:

1. Easily allow external hosting to "just work" when appropriately configured at the system, user or virtual environment level (pip already supports this at the user level, and will support it at the system and environment level in the next version).

2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.

3. Eliminate any and all references to the confusing "verifiable external" and "unverifiable external" distinction from the user experience (both when installing and when releasing packages).

4. The repository aspects of PyPI should become *just* the default package hosting location (i.e. the only one that is treated as opt-out rather than opt-in by most client tools in their default configuration). Aside from that aspect, hosting on PyPI should not otherwise provide an enhanced user experience over hosting your own package repository.

5. Do all of the above while providing default behaviour that is secure against most attackers below the nation state adversary level.

In my view, the most debatable part of Donald's latest proposal would be the handling of projects that don't get updated to properly register an external URL before the link spidering support is removed from the client applications. That aspect should arguably include a step where the decision on whether or not to disable that support is based on *looking at the numbers again* before turning the feature off on the server, and perhaps also monitoring for user complaints for a period after it is first turned off, before the feature is removed from the clients.

Regards, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 03 Oct 2014, at 16:24, Nick Coghlan <ncoghlan@gmail.com> wrote: 2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
I haven’t read the PEP, so this might be a stupid remark, but: is that needed, when a package author can also say something like “add my repository to your system with pip --add-repository <url>”?

Wichert.
On 3 October 2014 15:29, Wichert Akkerman <wichert@wiggy.net> wrote:
On 03 Oct 2014, at 16:24, Nick Coghlan <ncoghlan@gmail.com> wrote: 2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
I haven’t read the PEP, so this might be a stupid remark, but: is that needed, when a package author can also say something like “add my repository to your system with pip --add-repository <url>”?
The logic is that if I say

    pip install foo

and foo is not hosted on PyPI, I get an error saying "cannot find foo". The quoted point is saying that we want a way for the author of foo to add metadata to PyPI that lets pip give a more helpful message:

    pip install foo
    ERROR: No downloads for package 'foo' found.
    foo is hosted at the following repositories:
        Main repository - http://foo.example.com/simple/
        Windows wheels - http://wheels.foo.example.com/simple/
    Use --index-url to specify the repository you wish to use.

(Or something like that...)

Paul
On Oct 3, 2014, at 10:44 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 3 October 2014 15:29, Wichert Akkerman <wichert@wiggy.net> wrote:
On 03 Oct 2014, at 16:24, Nick Coghlan <ncoghlan@gmail.com> wrote: 2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
I haven’t read the PEP, so this might be a stupid remark, but: is that needed, when a package author can also say something like “add my repository to your system with pip --add-repository <url>”?
The logic is that if I say
pip install foo
and foo is not hosted on PyPI, I get an error saying "cannot find foo". The quoted point is saying that we want a way for the author of foo to add metadata to PyPI that lets pip give a more helpful message:
    pip install foo
    ERROR: No downloads for package 'foo' found.
    foo is hosted at the following repositories:
        Main repository - http://foo.example.com/simple/
        Windows wheels - http://wheels.foo.example.com/simple/
    Use --index-url to specify the repository you wish to use.
(Or something like that…)
Yes, the comments in particular were inspired by Egenix’s own repositories. I saw that they had different repositories for UCS2 and UCS4 and I thought that it would be awesome if pip could tell end users about both of those and give them the information to choose which ones were relevant. In this way it supports more things than the existing mechanisms support, because the old things don’t allow any mechanism for selective addition, which means binary distributions are hard if the filename doesn’t include enough information for pip/easy_install to actually select the proper download and you have to encode some of that information in the URL. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Oct 3, 2014, at 10:29 AM, Wichert Akkerman <wichert@wiggy.net> wrote:
On 03 Oct 2014, at 16:24, Nick Coghlan <ncoghlan@gmail.com> wrote: 2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
I haven’t read the PEP, so this might be a stupid remark, but: is that needed, when a package author can also say something like “add my repository to your system with pip --add-repository <url>”?
Wichert.
So it’s not strictly required, and for pip versions less than 6.0 that’s essentially what will be happening. However providing that mechanism makes the discovery story a lot nicer. Instead of ``pip install foo`` coming back with an “I can’t find any downloads” error, it can come back with an “I can’t find any downloads, but here are the repositories that the author says they are using”. Basically it’s an affordance to make the UX of an external repository better. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Sat, Oct 04, 2014 at 00:24 +1000, Nick Coghlan wrote:
On 3 October 2014 22:02, Donald Stufft <donald@stufft.io> wrote:
As far as simplification goes, I don't believe it simplifies the implementation of PyPI at all, it just shuffles things around and creates work on my part in order to get PyPI supporting the new stuff. It does however let installers become simpler and it enables installers to present accurate error information that actually helps determine the root cause of a failure instead of the current silent failure with a confusing error message model.
I look forward to your suggestions, but I'm not hopeful. I've been thus far unable to determine a way to improve the current solution in a way that isn't just papering over one problem without solving the fundamental issue.
Donald's perspective here matches my own.
I don't see "the fundamental issue" that PEP470 tries to solve. The first para of the abstract says it wants to substitute the existing mechanism for registering external indexes with another one. It doesn't say why. And it doesn't say why this can't be done in a backward compatible manner which would be preferable (i hope we agree there).

And because the PEP doesn't precisely say what "fundamental issue" it solves it's a bit hard to present an alternative. If it's about focusing on "multi-repository operations" and simplifying installer UI it could be done with full backward compat:

- add PyPI maintainer UI to add external indexes along with a message

- change pip to disallow crawling to an external index it finds but rather present a message that you need to add the index manually to your installer invocation. (pip already finds external crawl URLs and it can also find the "new" ones - no need for any breakage).

- tell all project maintainers which have "explicit file urls" that they need to move their release files to an offsite own external index (or to pypi itself) within N months. Then disable the file urls (after examination of how many people are affected) and remove related un-needed options in pip.

Of course, i leave out some details but overall think it's pretty much doable. With this strategy, both old and new versions of pip would work fine with the changed PyPI. It also wouldn't introduce very complicated transition phases or communication steps.

I postpone other issues with respect to clarity and security of PEP/multi-repo operations to first get clarity on the backward compat issue and general strategy.

best, holger

P.S.: Nick, i think my rough draft above satisfies all of your points below, although they only partly relate to what we discuss in the PEP IMHO.
I'll be interested to hear alternative proposals, but they should aim to address at least the following user experience expectations:
1. Easily allow external hosting to "just work" when appropriately configured at the system, user or virtual environment level (pip already supports this at the user level, and will support it at the system and environment level in the next version).
2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
3. Eliminate any and all references to the confusing "verifiable external" and "unverifiable external" distinction from the user experience (both when installing and when releasing packages).
4. The repository aspects of PyPI should become *just* the default package hosting location (i.e. the only one that is treated as opt-out rather than opt-in by most client tools in their default configuration). Aside from that aspect, hosting on PyPI should not otherwise provide an enhanced user experience over hosting your own package repository.
5. Do all of the above while providing default behaviour that is secure against most attackers below the nation state adversary level.
In my view, the most debatable part of Donald's latest proposal would be the handling of projects that don't get updated to properly register an external URL before the link spidering support is removed from the client applications. That aspect should arguably include a step where the decision on whether or not to disable that support is based on *looking at the numbers again* before turning the feature off on the server, and perhaps also monitoring for user complaints for a period after it is first turned off, before the feature is removed from the clients.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Oct 3, 2014, at 2:28 PM, holger krekel <holger@merlinux.eu> wrote:
On Sat, Oct 04, 2014 at 00:24 +1000, Nick Coghlan wrote:
On 3 October 2014 22:02, Donald Stufft <donald@stufft.io> wrote:
As far as simplification goes, I don't believe it simplifies the implementation of PyPI at all, it just shuffles things around and creates work on my part in order to get PyPI supporting the new stuff. It does however let installers become simpler and it enables installers to present accurate error information that actually helps determine the root cause of a failure instead of the current silent failure with a confusing error message model.
I look forward to your suggestions, but I'm not hopeful. I've been thus far unable to determine a way to improve the current solution in a way that isn't just papering over one problem without solving the fundamental issue.
Donald's perspective here matches my own.
I don't see "the fundamental issue" that PEP470 tries to solve. The first para of the abstract says it wants to substitute the existing mechanism for registering external indexes with another one. It doesn't say why. And it doesn't say why this can't be done in a backward compatible manner which would be preferable (i hope we agree there).
The fundamental issue is that PyPI is really two things, an index and a repository. Currently these two roles are blurred, and that lack of distinction causes problems for both end users and authors, and those problems create a certain animosity towards people not wanting to use PyPI as their repository. To this aim, end users should be aware when they are installing things from a repository other than PyPI, and they should also be aware when doing so is unsafe on the wire. PEP 438 solves this problem: end users opt in to using a repository other than PyPI. However, it is my belief that the pain of doing so has outweighed the benefits of PEP 438. Thus PEP 470 attempts to "go back to the drawing board" and questions the mechanism for hosting on an alternative repository altogether.
And because the PEP doesn't precisely say what "fundamental issue" it solves it's a bit hard to present an alternative. If it's about focusing on "multi-repository operations" and simplifying installer UI it could be done with full backward compat:
- add PyPI maintainer UI to add external indexes along with a message
Ok, this is part of PEP 470 too.
- change pip to disallow crawling to an external index it finds but rather present a message that you need to add the index manually to your installer invocation. (pip already finds external crawl URLs and it can also find the "new" ones - no need for any breakage).
I had thought of similar things, and my reasons for using a meta tag instead of an <a href>, and for removing the old URLs instead of just adding this alongside them, are:
1. I don’t *want* users of older versions of pip/easy_install to implicitly be fetching these things; they should be able to opt in as well, and indeed all the mechanisms already exist in pip/easy_install for them to do so. The only thing that doesn’t exist is the discovery mechanism.
2. This doesn’t actually prevent breakage; it just links the breakage to the version of pip/easy_install someone is using, at the cost that people with older clients are implicitly fetching things, some of which may or may not be safe.
Overall I think the goal of not breaking things is a good one; however, PyPI isn’t a versioned thing where people can limit what version of things they run. It’s important just from a maintenance aspect to be able to deprecate and remove things over time. This will break things for people depending on those things of course, so it’s always a balancing act about deciding *when* exactly to remove something. I think that this is a good time to remove this particular thing because the core functionality of its replacement has existed for a long time, the actual use of the feature is quite low, and leaving it in presents an issue with usability and security.
- tell all project maintainers which have "explicit file urls" that they need to move their release files to their own offsite external index (or to pypi itself) within N months. Then disable the file urls (after examining how many people are affected) and remove the related unneeded options in pip.
This is still breakage for people using an older version of pip/easy_install, although a smaller set of things will break in this sense.
Of course, i leave out some details but overall think it's pretty much doable. With this strategy, both old and new versions of pip would work fine with the changed PyPI. It also wouldn't introduce very complicated transition phases or communication steps.
I postpone other issues with respect to clarity and security of PEP/multi-repo operations to first get clarity on the backward compat issue and general strategy.
best, holger
P.S.: Nick, i think my rough draft above satisfies all of your points below, although they only partly relate to what we discuss in the PEP IMHO.
I'll be interested to hear alternative proposals, but they should aim to address at least the following user experience expectations:
1. Easily allow external hosting to "just work" when appropriately configured at the system, user or virtual environment level (pip already supports this at the user level, and will support it at the system and environment level in the next version).
2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
3. Eliminate any and all references to the confusing "verifiable external" and "unverifiable external" distinction from the user experience (both when installing and when releasing packages).
4. The repository aspects of PyPI should become *just* the default package hosting location (i.e. the only one that is treated as opt-out rather than opt-in by most client tools in their default configuration). Aside from that aspect, hosting on PyPI should not otherwise provide an enhanced user experience over hosting your own package repository.
5. Do all of the above while providing default behaviour that is secure against most attackers below the nation state adversary level.
In my view, the most debatable part of Donald's latest proposal would be the handling of projects that don't get updated to properly register an external URL before the link spidering support is removed from the client applications. That aspect should arguably include a step where the decision on whether or not to disable that support is based on *looking at the numbers again* before turning the feature off on the server, and perhaps also monitoring for user complaints for a period after it is first turned off, before the feature is removed from the clients.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 4 October 2014 05:08, Donald Stufft <donald@stufft.io> wrote:
2. This doesn’t actually prevent breakage, it just links the breakage to the version of pip/easy_install someone is using at the cost that people with older clients are implicitly fetching things, some of which may or may not be safe.
Overall I think the goal of not breaking things is a good one; however, PyPI isn’t a versioned thing where people can limit what version of things they run. It’s important just from a maintenance aspect to be able to deprecate and remove things over time. This will break things for people depending on those things of course, so it’s always a balancing act about deciding *when* exactly to remove something. I think that this is a good time to remove this particular thing because the core functionality of its replacement has existed for a long time, the actual use of the feature is quite low, and leaving it in presents an issue with usability and security.
It occurred to me that it's potentially desirable to decouple the "stop advertising links for spidering from PyPI" step from the "stop supporting link spidering in the clients" step.
My rationale is that the first is just about changing PyPI itself - more clearly splitting the "PyPI as index" and "PyPI as repository" roles. We can quantify that impact fairly clearly, and will have data to make informed decisions each step of the way.
Removing the link spidering support from the *clients* is a potentially bigger deal, as it would impact anyone that was using link spidering *independently of PyPI*. We don't have any data on that, and it's a decision different clients may want to approach differently.
So while PEP 470 would allow clients to *consider* dropping link spidering support (and any new clients would be free to never add it), it likely doesn't make sense for the PEP to commit any clients (including pip) to a particular time frame for dropping the feature. That would narrow the scope to just server side PyPI changes (with client updates to report the availability of external repositories being a quality of implementation issue rather than a hard requirement).
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Oct 4, 2014, at 3:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 4 October 2014 05:08, Donald Stufft <donald@stufft.io> wrote:
2. This doesn’t actually prevent breakage, it just links the breakage to the version of pip/easy_install someone is using at the cost that people with older clients are implicitly fetching things, some of which may or may not be safe.
Overall I think the goal of not breaking things is a good one; however, PyPI isn’t a versioned thing where people can limit what version of things they run. It’s important just from a maintenance aspect to be able to deprecate and remove things over time. This will break things for people depending on those things of course, so it’s always a balancing act about deciding *when* exactly to remove something. I think that this is a good time to remove this particular thing because the core functionality of its replacement has existed for a long time, the actual use of the feature is quite low, and leaving it in presents an issue with usability and security.
It occurred to me that it's potentially desirable to decouple the "stop advertising links for spidering from PyPI" step from the "stop supporting link spidering in the clients" step.
My rationale is that the first is just about changing PyPI itself - more clearly splitting the "PyPI as index" and "PyPI as repository" roles. We can quantify that impact fairly clearly, and will have data to make informed decisions each step of the way.
Removing the link spidering support from the *clients* is a potentially bigger deal, as it would impact anyone that was using link spidering *independently of PyPI*. We don't have any data on that, and it's a decision different clients may want to approach differently.
So while PEP 470 would allow clients to *consider* dropping link spidering support (and any new clients would be free to never add it), it likely doesn't make sense for the PEP to commit any clients (including pip) to a particular time frame for dropping the feature. That would narrow the scope to just server side PyPI changes (with client updates to report the availability of external repositories being a quality of implementation issue rather than a hard requirement).
Yea, I don’t think I included what the installers do in this PEP other than the parts specific to this PEP, so:
1. Implement multiple repository support.
2. Implement some mechanism for removing/disabling the default repository.
3. Implement the discovery mechanism.
4. Deprecate / Remove PEP 438.
I purposely don't give exact details how it should be done, as I think that each installer should decide how best to integrate that within their own UX.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
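As a rough illustration of step 3, here is a minimal sketch of how an installer might read repository hints from a simple API page and turn them into an actionable message. It only assumes what the thread states, namely that discovery uses a meta tag rather than an <a href>. The tag name `external-repository` and the helper names are hypothetical and are not taken from the PEP; each installer would pick its own UX.

```python
from html.parser import HTMLParser


class RepositoryMetaParser(HTMLParser):
    """Collect repository URLs advertised via <meta> tags on a simple page."""

    # Hypothetical tag name, chosen only for this sketch.
    META_NAME = "external-repository"

    def __init__(self):
        super().__init__()
        self.repositories = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.META_NAME and attributes.get("content"):
            self.repositories.append(attributes["content"])


def discover_repositories(simple_page_html):
    """Return the advertised repository URLs in page order."""
    parser = RepositoryMetaParser()
    parser.feed(simple_page_html)
    return parser.repositories


def explain_missing_files(project, simple_page_html):
    """Build the message an installer could show when no files are found."""
    repositories = discover_repositories(simple_page_html)
    if not repositories:
        return f"No files found for {project}."
    lines = [
        f"No files found for {project} on the default repository.",
        "The project advertises these external repositories:",
    ]
    lines.extend(f"  {url}" for url in repositories)
    lines.append("Add one explicitly (e.g. pip's --extra-index-url) to opt in.")
    return "\n".join(lines)
```

The point of the sketch is that nothing is fetched from the advertised URL: the installer only reports it, and the user opts in explicitly.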
On 5 October 2014 03:21, Donald Stufft <donald@stufft.io> wrote:
On Oct 4, 2014, at 3:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: So while PEP 470 would allow clients to *consider* dropping link spidering support (and any new clients would be free to never add it), it likely doesn't make sense for the PEP to commit any clients (including pip) to a particular time frame for dropping the feature. That would narrow the scope to just server side PyPI changes (with client updates to report the availability of external repositories being a quality of implementation issue rather than a hard requirement).
Yea, I don’t think I included what the installers do in this PEP other than the parts specific to this PEP, so:
1. Implement multiple repository support.
2. Implement some mechanism for removing/disabling the default repository.
3. Implement the discovery mechanism.
4. Deprecate / Remove PEP 438.
I purposely don't give exact details how it should be done, as I think that each installer should decide how best to integrate that within their own UX.
I think it's worth spelling out that list of updated client expectations clearly in the PEP, with step 4 explicitly flagged as optional. If any given client wants to continue supporting PEP 438 for use with private indexes, I think that's fine.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Oct 4, 2014, at 10:06 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 5 October 2014 03:21, Donald Stufft <donald@stufft.io> wrote:
On Oct 4, 2014, at 3:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: So while PEP 470 would allow clients to *consider* dropping link spidering support (and any new clients would be free to never add it), it likely doesn't make sense for the PEP to commit any clients (including pip) to a particular time frame for dropping the feature. That would narrow the scope to just server side PyPI changes (with client updates to report the availability of external repositories being a quality of implementation issue rather than a hard requirement).
Yea, I don’t think I included what the installers do in this PEP other than the parts specific to this PEP, so:
1. Implement multiple repository support.
2. Implement some mechanism for removing/disabling the default repository.
3. Implement the discovery mechanism.
4. Deprecate / Remove PEP 438.
I purposely don't give exact details how it should be done, as I think that each installer should decide how best to integrate that within their own UX.
I think it's worth spelling out that list of updated client expectations clearly in the PEP, with step 4 explicitly flagged as optional. If any given client wants to continue supporting PEP 438 for use with private indexes, I think that's fine.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
I've updated my local copy to include this; it’ll be in my next draft.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Fri, Oct 03, 2014 at 15:08 -0400, Donald Stufft wrote:
On Oct 3, 2014, at 2:28 PM, holger krekel <holger@merlinux.eu> wrote:
On Sat, Oct 04, 2014 at 00:24 +1000, Nick Coghlan wrote:
On 3 October 2014 22:02, Donald Stufft <donald@stufft.io> wrote:
As far as simplification goes, I don't believe it simplifies the implementation of PyPI at all; it just shuffles things around and creates work on my part in order to get PyPI supporting the new stuff. It does, however, let installers become simpler, and it enables installers to present accurate error information that actually helps determine the root cause of a failure, instead of the current model of silent failure with a confusing error message.
I look forward to your suggestions, but I'm not hopeful. I've been thus far unable to determine a way to improve the current solution in a way that isn't just papering over one problem without solving the fundamental issue.
Donald's perspective here matches my own.
I don't see "the fundamental issue" that PEP470 tries to solve. The first para of the abstract says it wants to substitute the existing mechanism for registering external indexes with another one. It doesn't say why. And it doesn't say why this can't be done in a backward compatible manner which would be preferable (i hope we agree there).
The fundamental issue is that PyPI is really two things, an index and a repository. Currently these two roles are blurred, and that lack of distinction causes problems for both end users and authors; those problems create a certain animosity towards people not wanting to use PyPI as their repository. To this end, end users should be aware when they are installing things from a repository other than PyPI, and they should also be aware when doing so is unsafe on the wire.
PEP 438 solves this problem. End users opt in to using a repository other than PyPI. However, it is my belief that the pain of doing so has outweighed the benefits of PEP 438.
Well, the main benefit of PEP438 was that it removed random crawling for some 90% of the packages on the package index, speeding up and making installs more reliable. And it did that without breaking backward compatibility. And I think PEP470 could achieve its goals this way too.
Thus PEP 470 attempts to "go back to the drawing board" and questions the mechanism for hosting on an alternative repository altogether.
And because the PEP doesn't precisely say what "fundamental issue" it solves it's a bit hard to present an alternative. If it's about focusing on "multi-repository operations" and simplifying installer UI it could be done with full backward compat:
- add PyPI maintainer UI to add external indexes along with a message
Ok, this is part of PEP 470 too.
- change pip to disallow crawling to an external index it finds but rather present a message that you need to add the index manually to your installer invocation. (pip already finds external crawl URLs and it can also find the "new" ones - no need for any breakage).
I had thought of similar things, and my reasons for not using an <a href> and instead using a meta tag and for removing the old URLs instead of just making this in addition to is:
1. I don’t *want* users of older versions of pip/easy_install to implicitly be fetching these things, they should be able to opt in as well and indeed all the mechanisms exist in pip/easy_install for them to already do so. The only thing that doesn’t exist is the discovery mechanism.
I think it's better to generally avoid deliberately breaking things. Things break enough even when we don't intend them to. IOW, PyPI should IMO aim to preserve working with as many client side scenarios as possible -- while adding things and improving for newer versions of clients.
2. This doesn’t actually prevent breakage, it just links the breakage to the version of pip/easy_install someone is using at the cost that people with older clients are implicitly fetching things, some of which may or may not be safe.
I am not sure i follow here, sorry. There are two things the PEP does:
1. remove "registered verified external links"
2. support recording external indexes for a project
The first could be done without breakage except for the users and maintainers of that feature -- i take it we are still talking about just a few thousand client side uses and 60 project maintainers, right?
The second could be done without breakage altogether i think: at one time all external urls are auto-registered as external indexes and they are presented on the simple page with some meta information that does not confuse older pips/easy_installs. Newer pips/easy_installs can then provide nice error messages. Older pips can continue to use the PEP438 options. And easy_install can continue to work.
Overall I think the goal of not breaking things is a good one; however, PyPI isn’t a versioned thing where people can limit what version of things they run. It’s important just from a maintenance aspect to be able to deprecate and remove things over time. This will break things for people depending on those things of course, so it’s always a balancing act about deciding *when* exactly to remove something. I think that this is a good time to remove this particular thing because the core functionality of its replacement has existed for a long time, the actual use of the feature is quite low, and leaving it in presents an issue with usability and security.
I agree that removing features and functionality is a good thing. But i maintain PEP470 could do it without breaking things.
holger
- tell all project maintainers which have "explicit file urls" that they need to move their release files to their own offsite external index (or to pypi itself) within N months. Then disable the file urls (after examining how many people are affected) and remove the related unneeded options in pip.
This is still breakage for people using an older version of pip/easy_install, although a smaller set of things will break in this sense.
Of course, i leave out some details but overall think it's pretty much doable. With this strategy, both old and new versions of pip would work fine with the changed PyPI. It also wouldn't introduce very complicated transition phases or communication steps.
I postpone other issues with respect to clarity and security of PEP/multi-repo operations to first get clarity on the backward compat issue and general strategy.
best, holger
P.S.: Nick, i think my rough draft above satisfies all of your points below, although they only partly relate to what we discuss in the PEP IMHO.
I'll be interested to hear alternative proposals, but they should aim to address at least the following user experience expectations:
1. Easily allow external hosting to "just work" when appropriately configured at the system, user or virtual environment level (pip already supports this at the user level, and will support it at the system and environment level in the next version).
2. Easily allow package authors to tell PyPI "my releases are hosted <here>" and have that advertised in such a way that tools can clearly communicate it to users, without silently introducing unexpected dependencies on third party services.
3. Eliminate any and all references to the confusing "verifiable external" and "unverifiable external" distinction from the user experience (both when installing and when releasing packages).
4. The repository aspects of PyPI should become *just* the default package hosting location (i.e. the only one that is treated as opt-out rather than opt-in by most client tools in their default configuration). Aside from that aspect, hosting on PyPI should not otherwise provide an enhanced user experience over hosting your own package repository.
5. Do all of the above while providing default behaviour that is secure against most attackers below the nation state adversary level.
In my view, the most debatable part of Donald's latest proposal would be the handling of projects that don't get updated to properly register an external URL before the link spidering support is removed from the client applications. That aspect should arguably include a step where the decision on whether or not to disable that support is based on *looking at the numbers again* before turning the feature off on the server, and perhaps also monitoring for user complaints for a period after it is first turned off, before the feature is removed from the clients.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
FWIW: I support Holger's request to introduce multi repository support *without* breaking existing setups.
Simply add the possibility for authors to register external indexes, have pip, setuptools, et al. crawl these in addition to what's up on the PyPI package page (using the logic that has existed in these tools for years) and then let the author decide whether they want to remove existing downloads from PyPI or not.
This allows for older installations to continue working, while also (optionally) supporting a setup which does not use PyPI for hosting at all.
-- Marc-Andre Lemburg eGenix.com
On 7 October 2014 11:09, holger krekel <holger@merlinux.eu> wrote:
Well, the main benefit of PEP438 was that it removed random crawling for some 90% of the packages on the package index, speeding up and making installs more reliable. And it did that without breaking backward compatibility.
The setuptools index page is 1.4MB in size. Most of that can be ignored, but it still has to be downloaded and parsed. Whether the data that setuptools includes in its long_description is useful is arguable, but irrelevant - the fact is that as things stand, it is there and it causes issues.
PEP 470 would result in all of the unneeded entries in the simple index for setuptools being removed, which avoids the need for client tools (and I'm not talking just about pip here, but also about one-off scripts, which is the sort of thing I write a lot) to trawl through all of that data. And it does so without the setuptools project having to change how it writes its PyPI page (i.e., the project long_description). Arguably, that's equally a way of avoiding breaking backward compatibility...
The second could be done without breakage altogether i think: at one time all external urls are auto-registered as external indexes and they are presented on the simple page with some meta information that does not confuse older pips/easy_installs. Newer pips/easy_installs can then provide nice error messages. Older pips can continue to use the PEP438 options. And easy_install can continue to work.
Setuptools has 255 internal links to files hosted on PyPI. And about 11,000 other links. (I just checked that 3 times, as I couldn't believe it, but it *seems* to be right :-(). Removing duplicates, 337 unique links. Are you suggesting pip presents all of those as possible external indexes?
I'm sure you can argue that setuptools has (badly!) misused the link-handling support in PyPI. And that it's a one-off special case. But how do we document to projects that they shouldn't do things like this? How do we even define what "things like this" are? Don't include links in your project description unless they are external indexes?
Paul.
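For readers who want to reproduce this kind of count, a one-off script along the following lines would do it. This is only a sketch: the thread does not say how the figures above were obtained, and the function name, the `repository_host` parameter, and the choice to count duplicates across all links are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect every <a href> on a page as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def summarize_links(page_html, base_url, repository_host):
    """Count links hosted on repository_host, all other links, and unique links."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    internal = [
        url for url in collector.links
        if urlsplit(url).hostname == repository_host
    ]
    return {
        "internal": len(internal),
        "other": len(collector.links) - len(internal),
        "unique": len(set(collector.links)),
    }
```

The page content itself would have to be downloaded separately; the sketch only covers the parsing and counting that every client tool has to repeat for a large simple page.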
On Tue, Oct 07, 2014 at 11:40 +0100, Paul Moore wrote:
On 7 October 2014 11:09, holger krekel <holger@merlinux.eu> wrote:
Well, the main benefit of PEP438 was that it removed random crawling for some 90% of the packages on the package index, speeding up and making installs more reliable. And it did that without breaking backward compatibility.
The setuptools index page is 1.4MB in size. Most of that can be ignored, but it still has to be downloaded and parsed. Whether the data that setuptools includes in its long_description is useful is arguable, but irrelevant - the fact is that as things stand, it is there and it causes issues.
PEP 470 would result in all of the unneeded entries in the simple index for setuptools being removed, which avoids the need for client tools (and I'm not talking just about pip here, but also about one-off scripts, which is the sort of thing I write a lot) to trawl through all of that data. And it does so without the setuptools project having to change how it writes its PyPI page (i.e., the project long_description). Arguably, that's equally a way of avoiding breaking backward compatibility...
The second could be done without breakage altogether i think: at one time all external urls are auto-registered as external indexes and they are presented on the simple page with some meta information that does not confuse older pips/easy_installs. Newer pips/easy_installs can then provide nice error messages. Older pips can continue to use the PEP438 options. And easy_install can continue to work.
Setuptools has 255 internal links to files hosted on PyPI. And about 11,000 other links. (I just checked that 3 times, as I couldn't believe it, but it *seems* to be right :-(). Removing duplicates, 337 unique links. Are you suggesting pip presents all of those as possible external indexes?
No, i effectively suggest that PyPI would present just 2 index links, those which currently are attributed as rel={download,homepage}. Those two index links would be put into the new "extra indexes field" on pypi with a note like "the following indexes were extracted from old release data" which newer pip versions could present to the user. For older pip/easy_installs things would just continue to work but they'd get a shorter setuptools simple page.
best, holger
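The extraction step suggested here could look roughly like the sketch below, which picks out only the links marked rel="download" or rel="homepage" on a simple page and ignores everything else. The function and class names are hypothetical, and how PyPI would store the result in the proposed "extra indexes field" is left open.

```python
from html.parser import HTMLParser

# The two rel values named in the message above.
CRAWLED_RELS = {"download", "homepage"}


class RelLinkParser(HTMLParser):
    """Collect hrefs of <a> tags whose rel is 'download' or 'homepage'."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rels = set((attributes.get("rel") or "").split())
        href = attributes.get("href")
        # Keep first-seen order and skip duplicates.
        if href and rels & CRAWLED_RELS and href not in self.candidates:
            self.candidates.append(href)


def candidate_indexes(simple_page_html):
    """Return the de-duplicated rel=download/homepage links in page order."""
    parser = RelLinkParser()
    parser.feed(simple_page_html)
    return parser.candidates
```

Whether each extracted URL is actually usable as an index is a separate question that this sketch does not try to answer.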
I'm sure you can argue that setuptools has (badly!) misused the link-handling support in PyPI. And that it's a one-off special case. But how do we document to projects that they shouldn't do things like this? How do we even define what "things like this" are? Don't include links in your project description unless they are external indexes?
Paul.
On Oct 7, 2014, at 7:03 AM, holger krekel <holger@merlinux.eu> wrote:
On Tue, Oct 07, 2014 at 11:40 +0100, Paul Moore wrote:
On 7 October 2014 11:09, holger krekel <holger@merlinux.eu> wrote:
Well, the main benefit of PEP438 was that it removed random crawling for some 90% of the packages on the package index, speeding up and making installs more reliable. And it did that without breaking backward compatibility.
The setuptools index page is 1.4MB in size. Most of that can be ignored, but it still has to be downloaded and parsed. Whether the data that setuptools includes in its long_description is useful is arguable, but irrelevant - the fact is that as things stand, it is there and it causes issues.
PEP 470 would result in all of the unneeded entries in the simple index for setuptools being removed, which avoids the need for client tools (and I'm not talking just about pip here, but also about one-off scripts, which is the sort of thing I write a lot) to trawl through all of that data. And it does so without the setuptools project having to change how it writes its PyPI page (i.e., the project long_description). Arguably, that's equally a way of avoiding breaking backward compatibility...
The second could be done without breakage altogether i think: at one time all external urls are auto-registered as external indexes and they are presented on the simple page with some meta information that does not confuse older pips/easy_installs. Newer pips/easy_installs can then provide nice error messages. Older pips can continue to use the PEP438 options. And easy_install can continue to work.
Setuptools has 255 internal links to files hosted on PyPI. And about 11,000 other links. (I just checked that 3 times, as I couldn't believe it, but it *seems* to be right :-(). Removing duplicates, 337 unique links. Are you suggesting pip presents all of those as possible external indexes?
No, i effectively suggest that PyPI would present just 2 index links, those which currently are attributed as rel={download,homepage}. Those two index links would be put into the new "extra indexes field" on pypi with a note like "the following indexes were extracted from old release data" which newer pip versions could present to the user. For older pip/easy_installs things would just continue to work but they'd get a shorter setuptools simple page.
I am not opposed to moving the rel={download,homepage} links automatically from link to meta tag. What I am opposed to is leaving them in place in an attempt to prioritize backwards compatibility over safety. The only thing I have against automatically translating the old links to the new external hosting metadata is that it’s going to be a lot of noise for authors who won’t know which links are the correct links to use.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Oct 7, 2014, at 6:09 AM, holger krekel <holger@merlinux.eu> wrote:
On Fri, Oct 03, 2014 at 15:08 -0400, Donald Stufft wrote:
On Oct 3, 2014, at 2:28 PM, holger krekel <holger@merlinux.eu> wrote:
On Sat, Oct 04, 2014 at 00:24 +1000, Nick Coghlan wrote:
On 3 October 2014 22:02, Donald Stufft <donald@stufft.io> wrote:
As far as simplification goes, I don't believe it simplifies the implementation of PyPI at all; it just shuffles things around and creates work on my part in order to get PyPI supporting the new stuff. It does, however, let installers become simpler, and it enables installers to present accurate error information that actually helps determine the root cause of a failure, instead of the current model of silent failure with a confusing error message.
I look forward to your suggestions, but I'm not hopeful. I've been thus far unable to determine a way to improve the current solution in a way that isn't just papering over one problem without solving the fundamental issue.
Donald's perspective here matches my own.
I don't see the "the fundamental issue" that PEP470 tries to solve. The first para of the abstract says it wants to substitute the existing mechanism for registering external indexes with another one. It doesn't say why. And it doesn't say why this can't be done in a backward compatible manner which would be preferable (i hope we agree there).
The fundamental issue is that PyPI is really two things, an index and a repository. Currently these two roles are blurred and that lack of distinction causes problems for both end users and authors and those problems create a certain animosity towards people not wanting to use PyPI as their repository. To this aim end users should be aware when they are installing things from a repository other than PyPI and they should also be aware when doing so is unsafe on the wire.
PEP 438 solves this problem. End users opt in to using a repository other than PyPI. However It is my belief that the pain of doing so has outweighed the benefits of PEP 438.
Well, the main benefit of PEP438 was that it removed random crawling for some 90% of the packages on the package index, speeding up and making installs more reliable. And it did that without breaking backward compatibility. And I think PEP470 could achieve its goals this way too.
Sorry, I mean the main benefit with regards to projects that are hosted externally. PEP 438 had tremendous benefit for cleaning up a ton of projects which were not hosted externally and had links which existed for nothing more than to slow things down and make things unsafe.
Thus PEP 470 attempts to "go back to the drawing board" and questions the mechanism for hosting on an alternative repository altogether.
And because the PEP doesn't precisely say what "fundamental issue" it solves it's a bit hard to present an alternative. If it's about focusing on "multi-repository operations" and simplifying installer UI it could be done with full backward compat:
- add PyPI maintainer UI to add external indexes along with a message
Ok, this is part of PEP 470 too.
- change pip to disallow crawling to an external index it finds but rather present a message that you need to add the index manually to your installer invocation. (pip already finds external crawl URLs and it can also find the "new" ones - no need for any breakage).
I had thought of similar things, and my reasons for not using an <a href> and instead using a meta tag and for removing the old URLs instead of just making this in addition to is:
1. I don’t *want* users of older versions of pip/easy_install to implicitly be fetching these things, they should be able to opt in as well and indeed all the mechanisms exist in pip/easy_install for them to already do so. The only thing that doesn’t exist is the discovery mechanism.
I think it's better to generally avoid deliberately breaking things. Things break enough even when we don't intend them to.
IOW, Pypi should IMO aim to preserve working with as many client side scenarios as possible -- while adding things and improving for newer versions of clients.
And here, I think, is where the crux of our disagreement lies. I think that PyPI should preserve working with as many client side scenarios as possible, except where there is good reason not to. I believe the fact that the vast bulk of the cases we’d be breaking are people who are silently, and often unknowingly, being directed to download some code over unauthenticated channels is a very good reason to break those cases. Especially given the fact that there is a fairly trivial workaround for people who want to restore that behavior. In a way this is similar to switching Python to enforcing TLS verification by default, which afaik Guido has blessed even for 2.7 assuming that there is a sane way to restore the default behavior and configure it.
2. This doesn’t actually prevent breakage, it just links the breakage to the version of pip/easy_install someone is using at the cost that people with older clients are implicitly fetching things, some of which may or may not be safe.
I am not sure i follow here, sorry. There are two things the PEP does:
1. remove "registered verified external links"
2. support recording external indexes for a project
The first could be done without breakage except for the users and maintainers of that feature -- i take it we are still talking about just a few thousand client side uses and 60 project maintainers, right?
The second could be done without breakage altogether i think: at one time all external urls are auto-registered as external indexes and they are presented on the simple page with some meta information that does not confuse older pips/easy_installs. Newer pips/easy_installs can then provide nice error messages. Older pips can continue to use the PEP438 options. And easy install can continue to work.
Overall I think the goal of not breaking things is a good one, however PyPI isn’t a versioned thing where people can limit what version of things they run. It’s important just from a maintenance aspect to be able to deprecate and remove things over time. This will break things for people depending on those things of course, so it’s always a balancing act about deciding *when* exactly to remove something. I think that this is a good time to remove this particular thing because the core functionality of its replacement has existed for a long time, the actual use of the feature is quite low, and leaving it in presents an issue with usability and security.
I agree that removing features and functionality is a good thing. But i maintain PEP470 could do it without breaking things.
It absolutely *could*, but as described above, I think it’s a better idea to break things in this case.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
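To make the discovery idea discussed above concrete, here is a minimal sketch of how an installer might read an external repository hint from a simple-API page and turn it into an actionable error message. The meta tag name `repository`, the helper names, and the example URL are assumptions made for illustration only; the exact markup is whatever the PEP text specifies, and this is not pip's actual implementation.

```python
from html.parser import HTMLParser


class RepositoryMetaParser(HTMLParser):
    """Collect file links and (hypothetical) repository meta tags."""

    def __init__(self):
        super().__init__()
        self.repositories = []  # external repository URLs advertised by the page
        self.links = []         # ordinary <a href> file links

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumed markup: <meta name="repository" content="URL">
        if tag == "meta" and attrs.get("name") == "repository":
            if attrs.get("content"):
                self.repositories.append(attrs["content"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def discover(simple_page_html):
    """Return (repositories, links) found on a simple-API page."""
    parser = RepositoryMetaParser()
    parser.feed(simple_page_html)
    parser.close()
    return parser.repositories, parser.links


def explain_missing(project, simple_page_html):
    """Return None if files are present, else a message for the user."""
    repositories, links = discover(simple_page_html)
    if links:
        return None
    if repositories:
        hints = " ".join("--extra-index-url " + url for url in repositories)
        return "No files for {0} on this index; it is hosted externally. Re-run with: {1}".format(project, hints)
    return "No files found for {0}.".format(project)
```

Because the hint lives in a meta tag rather than an `<a href>`, an older client that only follows links never fetches anything from the external host implicitly, which is the opt-in behavior argued for above.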
On Tue, Oct 07, 2014 at 08:00 -0400, Donald Stufft wrote:
On Oct 7, 2014, at 6:09 AM, holger krekel <holger@merlinux.eu> wrote:
I had thought of similar things, and my reasons for not using an <a href> and instead using a meta tag and for removing the old URLs instead of just making this in addition to is:
1. I don’t *want* users of older versions of pip/easy_install to implicitly be fetching these things, they should be able to opt in as well and indeed all the mechanisms exist in pip/easy_install for them to already do so. The only thing that doesn’t exist is the discovery mechanism.
I think it's better to generally avoid deliberately breaking things. Things break enough even when we don't intend them to.
IOW, Pypi should IMO aim to preserve working with as many client side scenarios as possible -- while adding things and improving for newer versions of clients.
And here, I think, is where the crux of our disagreement lies.
I think that PyPI should preserve working with as many client side scenarios as possible, except where there is good reason not to.
I believe the fact that the vast bulk of the cases we’d be breaking are people who are silently, and often unknowingly, being directed to download some code over unauthenticated channels is a very good reason to break those cases. Especially given the fact that there is a fairly trivial work around for people who want to restore that behavior.
In a way this is similar to switching Python to enforcing TLS verification by default, which afaik Guido has blessed even for 2.7 assuming that there is a sane way to restore the default behavior and configure it.
Are you saying that PEP470's breaking of backward compatibility is deliberate and helps to defend against MITM attacks during installation? That might be true although i note that hacked servers (see also: bash, ssl) are much more common than MITM attacks and a hacked server can do SSL just fine.
In any case, I see two security related downsides of PEP470, one of them severe.
For one, current multi-index operations are riskier than PEP438's validated external release file urls. Because currently you only need to trust pypi.python.org has not been hacked but with PEP470 you need to trust the integrity of the external site as well. IIUC you and Nick think this is acceptable because people deliberately make that choice by supplying an explicit option to use the external index, right? If so i think the PEP should also be clear on the fact that Pip/pypi's external repo support is far inferior to typical linux repos because release files are not signed etc.
Worse security problems loom with current multi-index ops like the --extra-index-url option which is advertised prominently in PEP470. You recommend to use it for private package indexes, but it can trivially compromise user machines: you register a private package name publicly to pypi and add some malware release files, and can then infect all machines which execute an innocent "pip install --extra-index-url ...". I think we conversed about this issue earlier but i don't see the PEP discussing it but rather it recommends using it without a direct call for caution (*). I maintain this attack is more serious than MITM attacks for which you are even ready to break backward compat.
Donald, Nick, i am not against the goals of PEP470 per se but in its current form i see it rather causing damage. When i explained to companies the dangers of pip multi-index operations they were rather alarmed and urged me to do something about it within the devpi context. But PEP470 pretends all is fine and everybody should move to multi-index immediately -- that's premature at least if not outright endangering users even today because they take the advice in the draft PEP470 for granted because it comes from Nick and Donald who usually know what they are talking about.
At the very least we need to have clear discussion in the PEP about it and safer options for pip and PEP470 needs to MANDATE it for pip and maybe even for easy_install -- you could follow the devpi "pypi_whitelist" design to prevent mixed private/public package links and introduce a "--private-index-url" which means that pip would look first there and when it finds links for a name it would not consider other/public indexes unless the name is explicitly whitelisted. I admit i am not happy about the usability of that but it gives a good secure default against public packages infecting private package installs.
best, holger
(*) I saw that PEP470 in a different section says "Installers SHOULD implement some mechanism for removing or otherwise disabling use of the default repository." but that's just a "SHOULD" and even if implemented it will not fix things retroactively for older pip/easy_install users -- but you claim fixing things for them is within the PEP470 scope above.
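The name-shadowing scenario and the proposed "--private-index-url" rule can be illustrated with a toy resolver. This is only a sketch under stated assumptions: the index contents, the project name `acme-billing`, the naive version parsing, and both function names are hypothetical, and real installers rank candidates with far richer rules than "highest version wins".

```python
def parse_version(text):
    """Naive dotted-integer version key; real installers use richer rules."""
    return tuple(int(part) for part in text.split("."))


def pick_merged(name, indexes):
    """Merge candidates from every index and take the highest version.

    `indexes` maps an index label to {project name: [version strings]}.
    Returns (version tuple, index label) or None if nothing is found.
    """
    candidates = [
        (parse_version(version), label)
        for label, projects in indexes.items()
        for version in projects.get(name, [])
    ]
    return max(candidates) if candidates else None


def pick_private_first(name, private, public, whitelist=frozenset()):
    """If the private index knows the name, ignore the public index
    unless the name is explicitly whitelisted."""
    indexes = {"private": private}
    if name not in private or name in whitelist:
        indexes["public"] = public
    return pick_merged(name, indexes)


# Hypothetical data: an attacker registers the private name publicly
# with a very high version number.
PRIVATE = {"acme-billing": ["1.2"]}
PUBLIC = {"acme-billing": ["99.0"], "requests": ["2.4.1"]}

merged = pick_merged("acme-billing", {"private": PRIVATE, "public": PUBLIC})
guarded = pick_private_first("acme-billing", PRIVATE, PUBLIC)
```

With merged indexes the public `99.0` upload wins; with the private-first rule the private `1.2` release is chosen, while names unknown to the private index (such as `requests` here) still resolve from the public one.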
On Oct 8, 2014, at 3:17 AM, holger krekel <holger@merlinux.eu> wrote:
On Tue, Oct 07, 2014 at 08:00 -0400, Donald Stufft wrote:
On Oct 7, 2014, at 6:09 AM, holger krekel <holger@merlinux.eu> wrote:
I had thought of similar things, and my reasons for not using an <a href> and instead using a meta tag and for removing the old URLs instead of just making this in addition to is:
1. I don’t *want* users of older versions of pip/easy_install to implicitly be fetching these things, they should be able to opt in as well and indeed all the mechanisms exist in pip/easy_install for them to already do so. The only thing that doesn’t exist is the discovery mechanism.
I think it's better to generally avoid deliberately breaking things. Things break enough even when we don't intend them to.
IOW, Pypi should IMO aim to preserve working with as many client side scenarios as possible -- while adding things and improving for newer versions of clients.
And here, I think, is where the crux of our disagreement lies.
I think that PyPI should preserve working with as many client side scenarios as possible, except where there is good reason not to.
I believe the fact that the vast bulk of the cases we’d be breaking are people who are silently, and often unknowingly, being directed to download some code over unauthenticated channels is a very good reason to break those cases. Especially given the fact that there is a fairly trivial work around for people who want to restore that behavior.
In a way this is similar to switching Python to enforcing TLS verification by default, which afaik Guido has blessed even for 2.7 assuming that there is a sane way to restore the default behavior and configure it.
Are you saying that PEP470's breaking of backward compatibility is deliberate and helps to defend against MITM attacks during installation? That might be true although i note that hacked servers (see also: bash, ssl) are much more common than MITM attacks and a hacked server can do SSL just fine.
Yes.
In any case, I see two security related downsides of PEP470, one of them severe.
For one, current multi-index operations are riskier than PEP438's validated external release file urls. Because currently you only need to trust pypi.python.org has not been hacked but with PEP470 you need to trust the integrity of the external site as well. IIUC you and Nick think this is acceptable because people deliberately make that choice by supplying an explicit option to use the external index, right? If so i think the PEP should also be clear on the fact that Pip/pypi's external repo support is far inferior to typical linux repos because release files are not signed etc.
I might agree with you, except an important consideration of a security feature is, “Does anybody even use this?”. Looking at adoption rates it’s clear that practically nobody *does* use it. If it’s the most secure thing in the world but 95+% of the traffic is using the insecure option, does it really even matter if it’s secure? To be honest, it’s *not* inferior to typical linux repos because in both cases there is an online key you can compromise. If you compromise the debian build fleet you can sign any release files you want, just like if you compromise the Fastly servers and get the PyPI TLS key. You generally do *not* get end to end verification on any Linux repo. The big benefit of the linux model is that it enables untrusted mirrors whereas our current model does not.
Worse security problems loom with current multi-index ops like the --extra-index-url option which is advertised prominently in PEP470. You recommend to use it for private package indexes, but it can trivially compromise user machines: you register a private package name publicly to pypi and add some malware release files, and can then infect all machines which execute an innocent "pip install --extra-index-url ...". I think we conversed about this issue earlier but i don't see the PEP discussing it but rather it recommends using it without a direct call for caution (*). I maintain this attack is more serious than MITM attacks for which you are even ready to break backward compat.
In the context of PEP 470 it’s giving another way for someone who has registered a project on PyPI to host off of PyPI. In this sense there is zero ability for someone else to come along and “override” the package name. The ability to do this for private projects is really only relevant in that by reusing that mechanism we have a single concept that users need to learn instead of multiple concepts. “There should be one way to do it”.
Donald, Nick, i am not against the goals of PEP470 per se but in its current form i see it rather causing damage. When i explained to companies the dangers of pip multi-index operations they were rather alarmed and urged me to do something about it within the devpi context. But PEP470 pretends all is fine and everybody should move to multi-index immediately -- that's premature at least if not outright endangering users even today because they take the advice in the draft PEP470 for granted because it comes from Nick and Donald who usually know what they are talking about.
This is really FUDish. Multi repository support *is* fine. If you have a private project then you should likely claim the name on PyPI because even without multi repository support all it would take is someone running pip on their machine and forgetting to switch to your internal index to attack you too. Can there be more improvements? Absolutely. However this particular problem is an inherent issue with a central repository that anyone can upload to. There are things we can do to make it less of a problem but it’s impossible to ever completely solve it.
At the very least we need to have clear discussion in the PEP about it and safer options for pip and PEP470 needs to MANDATE it for pip and maybe even for easy_install -- you could follow the devpi "pypi_whitelist" design to prevent mixed private/public package links and introduce a "--private-index-url" which means that pip would look first there and when it finds links for a name it would not consider other/public indexes unless the name is explicitly whitelisted. I admit i am not happy about the usability of that but it gives a good secure default against public packages infecting private package installs.
best, holger
(*) I saw that PEP470 in a different section says "Installers SHOULD implement some mechanism for removing or otherwise disabling use of the default repository." but that's just a "SHOULD" and even if implemented it will not fix things retroactively for older pip/easy_install users -- but you claim fixing things for them is within the PEP470 scope above.
A PEP can’t really mandate anything to an installer and with PEP 438 I think we found that mandating how things are implemented from on top easily ends up being something that turns out worse in the long run.
Pip has no means to improve upon the UX of PEP 438 except by deciding we’re not going to follow the PEP. We’d (I’d?) rather not just throw out what the PEPs say so we generally want to follow things.
I have plans (and even a branch!) started to further enhance the multiple repository support in pip. A lot of that is modeled after what yum and apt-get have as far as options go. I am completely and unequivocally against things which mandate much at all to what UX pip presents for these things because I think we can better serve our users by being able to make our own UX decisions. After my experiences with a mandated UX from a PEP I’m at the point where personally I’ll ignore any such mandate in the future where I think there is a better option for pip.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Wed, Oct 08, 2014 at 03:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 3:17 AM, holger krekel <holger@merlinux.eu> wrote:
Worse security problems loom with current multi-index ops like the --extra-index-url option which is advertised prominently in PEP470. You recommend to use it for private package indexes, but it can trivially compromise user machines: you register a private package name publicly to pypi and add some malware release files, and can then infect all machines which execute an innocent "pip install --extra-index-url ...". I think we conversed about this issue earlier but i don't see the PEP discussing it but rather it recommends using it without a direct call for caution (*). I maintain this attack is more serious than MITM attacks for which you are even ready to break backward compat.
In the context of PEP 470 it’s giving another way for someone who has registered a project on PyPI to host off of PyPI. In this sense there is zero ability for someone else to come along and “override” the package name. The ability to do this for private projects is really only relevant in that by reusing that mechanism we have a single concept that users need to learn instead of multiple concepts. “There should be one way to do it”.
Donald, Nick, i am not against the goals of PEP470 per se but in its current form i see it rather causing damage. When i explained to companies the dangers of pip multi-index operations they were rather alarmed and urged me to do something about it within the devpi context. But PEP470 pretends all is fine and everybody should move to multi-index immediately -- that's premature at least if not outright endangering users even today because they take the advice in the draft PEP470 for granted because it comes from Nick and Donald who usually know what they are talking about.
This is really FUDish. Multi repository support *is* fine. If you have a private project then you should likely claim the name on PyPI because even without multi repository support all it would take is someone running pip on their machine and forgetting to switch to your internal index to attack you too.
I am sorry if raising the issue of private/public compromises sounds like FUD to you. From my experience it's a real attack vector. I talked about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people got back to me afterwards, surprised.
And I don't think you can successfully ask people in companies around the world to register private package names publicly (let alone the issue of clashes etc.). Admit it, that's even more unlikely than people using some PEP438 features :)
And yes, if someone forgets to set the private index he could still pull in malicious public links even with devpi or new pip options.
Can there be more improvements? Absolutely. However this particular problem is an inherent issue with a central repository that anyone can upload to. There are things we can do to make it less of a problem but it’s impossible to ever completely solve it.
Linux repos are totally different: their main index is a curated index and pypi's is a wiki. Thus merging links from a private index and the pypi wiki can trivially wreak havoc while putting malware into the central Debian or Redhat repo is very hard.
At the very least we need to have clear discussion in the PEP about it and safer options for pip and PEP470 needs to MANDATE it for pip and maybe even for easy_install -- you could follow the devpi "pypi_whitelist" design to prevent mixed private/public package links and introduce a "--private-index-url" which means that pip would look first there and when it finds links for a name it would not consider other/public indexes unless the name is explicitly whitelisted. I admit i am not happy about the usability of that but it gives a good secure default against public packages infecting private package installs.
best, holger
(*) I saw that PEP470 in a different section says "Installers SHOULD implement some mechanism for removing or otherwise disabling use of the default repository." but that's just a "SHOULD" and even if implemented it will not fix things retroactively for older pip/easy_install users -- but you claim fixing things for them is within the PEP470 scope above.
A PEP can’t really mandate anything to an installer and with PEP 438 I think we found that mandating how things are implemented from on top easily ends up being something that turns out worse in the long run.
UI design is a delicate thing -- but i am sure you remember that you were involved in PEP438 and actually pushed for some UI that you are now criticising. I am a bit irritated but i understand that you probably all along wanted to push the processes towards the "multi-repo" idea. Please note that i am not against this in principle.
Pip has no means to improve upon the UX of PEP 438 except by deciding we’re not going to follow the PEP. We’d (I’d?) rather not just throw out what the PEPs say so we generally want to follow things.
And that's a good thing, thanks! Given the importance of PyPI today in the python community, I think the way how PyPI interacts with tools and installers deserves PEPs.
I have plans (and even a branch!) started to further enhance the multiple repository support in pip. A lot of that is modeled after what yum and apt-get have as far as options go. I am completely and unequivocally against things which mandate much at all to what UX pip presents for these things because I think we can better serve our users by being able to make our own UX decisions. After my experiences with a mandated UX from a PEP I’m at the point where personally I’ll ignore any such mandate in the future where I think there is a better option for pip.
PEPs are a form of helping collaboration and growth in a community but certainly not the only way and, if done badly, can do more damage than good.
best, holger
On Oct 8, 2014, at 4:44 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 03:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 3:17 AM, holger krekel <holger@merlinux.eu> wrote:
Worse security problems loom with current multi-index ops like the --extra-index-url option which is advertised prominently in PEP470. You recommend to use it for private package indexes, but it can trivially compromise user machines: you register a private package name publicly to pypi and add some malware release files, and can then infect all machines which execute an innocent "pip install --extra-index-url ...". I think we conversed about this issue earlier but i don't see the PEP discussing it but rather it recommends using it without a direct call for caution (*). I maintain this attack is more serious than MITM attacks for which you are even ready to break backward compat.
In the context of PEP 470 it’s giving another way for someone who has registered a project on PyPI to host off of PyPI. In this sense there is zero ability for someone else to come along and “override” the package name. The ability to do this for private projects is really only relevant in that by reusing that mechanism we have a single concept that users need to learn instead of multiple concepts. “There should be one way to do it”.
Donald, Nick, i am not against the goals of PEP470 per se but in its current form i see it rather causing damage. When i explained to companies the dangers of pip multi-index operations they were rather alarmed and urged me to do something about it within the devpi context. But PEP470 pretends all is fine and everybody should move to multi-index immediately -- that's premature at least if not outright endangering users even today because they take the advice in the draft PEP470 for granted because it comes from Nick and Donald who usually know what they are talking about.
This is really FUDish. Multi repository support *is* fine. If you have a private project then you should likely claim the name on PyPI because even without multi repository support all it would take is someone running pip on their machine and forgetting to switch to your internal index to attack you too.
I am sorry if raising the issue of private/public compromises sounds like FUD to you. From my experience it's a real attack vector. I talked about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people got back to me afterwards, surprised.
And I don't think you can successfully ask people in companies around the world to register private package names publicly (let alone the issue of clashes etc.). Admit it, that's even more unlikely than people using some PEP438 features :)
And yes, if someone forgets to set the private index he could still pull in malicious public links even with devpi or new pip options.
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI. The attack vector you’re describing isn’t possible at all for any project that is affected by PEP 470, which are projects which wish to register themselves in the PyPI index without using PyPI as their repository.
The *only* reason using the multiple repository support for private hosting is relevant is because it’s the option for allowing people to have private repositories at all, and we can re-use that behavior with this. It’s not particularly relevant to PEP 470 unless you have a suggestion for another, better mechanism that can satisfy all of these use cases (and possibly more?).
That being said, the things I have sketched out for pip include the ability to have both a whitelist and a blacklist for each repository. I don’t however think that it’s the PEP’s place to dictate how that looks (or even if it exists). I’m also not against adding another *SHOULD* saying that installers should implement some mechanism that allows for whitelisting or blacklisting which repository particular projects come from.
Can there be more improvements? Absolutely. However this particular problem is an inherent issue with a central repository that anyone can upload to. There are things we can do to make it less of a problem but it’s impossible to ever completely solve it.
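A per-repository whitelist and blacklist of the kind mentioned above might look roughly like the following. This is a sketch only: the `Repository` class, its field names, and the example URLs are invented for illustration and do not describe any existing pip feature or configuration format.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Repository:
    """Hypothetical per-repository filter for project names."""
    url: str
    whitelist: Optional[FrozenSet[str]] = None  # None: every project allowed
    blacklist: FrozenSet[str] = frozenset()

    def may_serve(self, project: str) -> bool:
        # The blacklist always wins; a whitelist, when set, is exclusive.
        if project in self.blacklist:
            return False
        return self.whitelist is None or project in self.whitelist


def repositories_for(project, repositories):
    """Return the URLs of the repositories allowed to serve `project`."""
    return [repo.url for repo in repositories if repo.may_serve(project)]


# Hypothetical configuration: the private project may only come from the
# private repository, and the private repository serves nothing else.
PUBLIC_REPO = Repository(
    "https://pypi.example.invalid/simple/",
    blacklist=frozenset({"acme-billing"}),
)
PRIVATE_REPO = Repository(
    "https://private.example.invalid/simple/",
    whitelist=frozenset({"acme-billing"}),
)
```

Keeping the filter on the repository rather than on the project means an installer could apply it before contacting any server at all.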
Linux repos are totally different: their main index is a curated index and pypi's is a wiki. Thus merging links from a private index and the pypi wiki can trivially wreak havoc while putting malware into the central Debian or Redhat repo is very hard.
At the very least we need to have clear discussion in the PEP about it and safer options for pip and PEP470 needs to MANDATE it for pip and maybe even for easy_install -- you could follow the devpi "pypi_whitelist" design to prevent mixed private/public package links and introduce a "--private-index-url" which means that pip would look first there and when it finds links for a name it would not consider other/public indexes unless the name is explicitly whitelisted. I admit i am not happy about the usability of that but it gives a good secure default against public packages infecting private package installs.
best, holger
(*) I saw that PEP470 in a different section says "Installers SHOULD implement some mechanism for removing or otherwise disabling use of the default repository." but that's just a "SHOULD" and even if implemented it will not fix things retroactively for older pip/easy_install users -- but you claim fixing things for them is within the PEP470 scope above.
A PEP can’t really mandate anything to an installer and with PEP 438 I think we found that mandating how things are implemented from on top easily ends up being something that turns out worse in the long run.
UI design is a delicate thing -- but i am sure you remember that you were involved in PEP438 and actually pushed for some UI that you are now criticising. I am a bit irritated but i understand that you probably all along wanted to push the processes towards the "multi-repo" idea. Please note that i am not against this in principle.
Absolutely. I don’t think that it was clear at the time that the PEP 438 UX would be as bad as it turned out to be. My takeaway from that isn’t so much that the people involved were bad at UX but that codifying a UX into a PEP is a bad idea in general. With PEP 470 you can see this because even on the PyPI side I don’t dictate a UX in it. I spell out the API that I expect PyPI to adopt but I don’t mention what the UX looks like so that we can easily adjust it. I also don’t spell out what the UXs on the installers look like, again because it’s my belief now that dictating UX is a generally bad idea; instead I spell out what features an installer *should* have.
This is all just lessons learned from trying to spell out a UX inside of a PEP: we did it, it didn’t work, and when it didn’t work the fact it was in a PEP put us in a crappy situation of having to either write a whole new PEP (and possibly recreate new UX issues) or start ignoring PEPs.
For the record, both pip and easy_install already have mechanisms for disabling the default repository. In pip this is ``--no-index`` and in easy_install it’s not as easy but you can either override it with ``--index-url`` or use the ``--allow-hosts`` option to disallow PyPI.
Pip has no means to improve upon the UX of PEP 438 except by deciding we’re not going to follow the PEP. We’d (I’d?) rather not just throw out what the PEPs say so we generally want to follow things.
And that's a good thing, thanks! Given the importance of PyPI today in the python community, I think the way how PyPI interacts with tools and installers deserves PEPs.
Yea I agree, which is why I’m trying to figure out how to do PEPs without making them feel more like handcuffs than useful tools :)
I have plans (and even a branch!) started to further enhance the multiple repository support in pip. A lot of that is modeled after what yum and apt-get have as far as options go. I am completely and unequivocally against things which mandate much at all to what UX pip presents for these things because I think we can better serve our users by being able to make our own UX decisions. After my experiences with a mandated UX from a PEP I’m at the point where personally I’ll ignore any such mandate in the future where I think there is a better option for pip.
PEPs are a form of helping collaboration and growth in a community but certainly not the only way and, if done badly, can do more damage than good.
best, holger
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Wed, Oct 08, 2014 at 05:44 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 4:44 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 03:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 3:17 AM, holger krekel <holger@merlinux.eu> wrote: Worse security problems loom with current multi-index ops like the --extra-index-url option which is advertised prominently in PEP470. You recommend using it for private package indexes, but it can trivially compromise user machines: an attacker registers a private package's name publicly on pypi, adds some malware release files, and can then infect all machines which execute an innocent "pip install --extra-index-url ...". I think we conversed about this issue earlier but i don't see the PEP discussing it but rather it recommends using it without a direct call for caution (*). I maintain this attack is more serious than MITM attacks for which you are even ready to break backward compat.
In the context of PEP 470 it’s giving another way for someone who has registered a project on PyPI to host off of PyPI. In this sense there is zero ability for someone else to come along and “override” the package name. The ability to do this for private projects is really only relevant in that by reusing that mechanism we have a single concept that users need to learn instead of multiple concepts. “There should be one way to do it”.
Donald, Nick, i am not against the goals of PEP470 per se but in its current form i see it rather causing damage. When i explained to companies the dangers of pip multi-index operations they were rather alarmed and urged me to do something about it within the devpi context. But PEP470 pretends all is fine and everybody should move to multi-index immediately -- that's premature at least if not outright endangering users even today because they take the advice in the draft PEP470 for granted because it comes from Nick and Donald who usually know what they are talking about.
This is really FUDish. Multi repository support *is* fine. If you have a private project then you should likely claim the name on PyPI because even without multi repository support all it would take is someone running pip on their machine and forgetting to switch to your internal index to attack you too.
I am sorry if raising the issue of private/public compromises sounds like FUD to you. From my experience it's a real attack vector. I talked about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people got back to me afterwards, surprised.
And I don't think you can successfully ask people in companies around the world to register private package names publicly (let alone the issue of clashes etc.). Admit it, that's even more unlikely than people using some PEP438 features :)
And yes, if someone forgets to set the private index he could still pull in malicious public links even with devpi or new pip options.
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI.
Well, the PEP has two central paragraphs motivating multi-index operations:
The two common installer tools, pip and easy_install/setuptools, both support the concept of additional locations to search for files to satisfy the installation requirements and have done so for many years. This means that there is no need to "phase" in a new flag or concept and the solution to installing a project from a repository other than PyPI will function regardless of how old (within reason) the end user's installer is. Not only has this concept existed in the Python tooling for some time, but it is a concept that exists across languages and even extending to the OS level with OS package tools almost universally using multiple repository support making it extremely likely that someone is already familiar with the concept.
Additionally, the multiple repository approach is a concept that is useful outside of the narrow scope of allowing projects which wish to be included on the index portion of PyPI but do not wish to utilize the repository portion of PyPI. This includes places where a company may wish to host a repository that contains their internal packages or where a project may wish to have multiple "channels" of releases, such as alpha, beta, release candidate, and final release.
and then it concretely suggests "--extra-index-url" and gives an example. It does not say that this is only good if you are using private projects that have a presence on PyPI. It rather suggests multi-index is the thing to go for today, generally, does it not?
Given that PyPI is a wiki and Linux Distros are a curated index, i insist it's dangerous to recommend to mix multiple indexes with pip if you don't know quite exactly what you are doing. Do you really disagree on this?
best, holger
The attack vector you’re describing isn’t possible at all for any project that is affected by PEP 470, which are projects which wish to register themselves in the PyPI index without using PyPI as their repository.
The *only* reason using the multiple repository support for private hosting is relevant is because it’s the option for allowing people to have private repositories at all, and we can re-use that behavior with this. It’s not particularly relevant to PEP 470 unless you have a suggestion for another, better mechanism that can satisfy all of these use cases (and possibly more?).
That being said, the things I have sketched out for pip includes the ability to have both a whitelist and a black list for each repository. I don’t however think that it’s the PEPs place to dictate how that looks (or even if it exists).
I’m also not against adding another *SHOULD* saying that installers should implement some mechanism that allows for whitelisting or blacklisting which repository particular projects come from.
Can there be more improvements? Absolutely. However this particular problem is an inherent issue with a central repository that anyone can upload to. There are things we can do to make it less of a problem but it’s impossible to ever completely solve it.
Linux repos are totally different: their main index is a curated index and pypi's is a wiki. Thus merging links from a private index and the pypi wiki can trivially wreak havoc while putting malware into the central Debian or Redhat repo is very hard.
At the very least we need to have clear discussion in the PEP about it and safer options for pip and PEP470 needs to MANDATE it for pip and maybe even for easy_install -- you could follow the devpi "pypi_whitelist" design to prevent mixed private/public package links and introduce a "--private-index-url" which means that pip would look first there and when it finds links for a name it would not consider other/public indexes unless the name is explicitly whitelisted. I admit i am not happy about the usability of that but it gives a good secure default against public packages infecting private package installs.
best, holger
(*) I saw that PEP470 in a different section says "Installers SHOULD implement some mechanism for removing or otherwise disabling use of the default repository." but that's just a "SHOULD" and even if implemented it will not fix things retroactively for older pip/easy_install users -- but you claim fixing things for them is within the PEP470 scope above.
A PEP can’t really mandate anything to an installer and with PEP 438 I think we found that mandating how things are implemented from on top easily ends up being something that turns out worse in the long run.
UI design is a delicate thing -- but i am sure you remember that you were involved in PEP438 and actually pushed for some UI that you are now criticising. I am a bit irritated but i understand that you probably all along wanted to push the processes towards the "multi-repo" idea. Please note that i am not against this in principle.
Absolutely. I don’t think that it was clear at the time that the PEP 438 UX would be as bad as it turned out to be. My takeaway from that isn’t so much that the people involved were bad at UX but that codifying a UX into a PEP is a bad idea in general. With PEP 470 you can see this because even on the PyPI side I don’t dictate a UX in it. I spell out the API that I expect PyPI to adopt but I don’t mention what the UX looks like so that we can easily adjust it. I also don’t spell out what the UXs on the installers look like, again because it’s my belief now that dictating UX is a generally bad idea; instead I spell out what features an installer *should* have.
This is all just lessons learned from trying to spell out a UX inside of a PEP: we did it, it didn’t work, and when it didn’t work the fact it was in a PEP put us in a crappy situation of having to either write a whole new PEP (and possibly recreate new UX issues) or start ignoring PEPs.
For the record, both pip and easy_install already have mechanisms for disabling the default repository. In pip this is ``--no-index`` and in easy_install it’s not as easy but you can either override it with ``--index-url`` or use the ``--allow-hosts`` option to disallow PyPI.
Pip has no means to improve upon the UX of PEP 438 except by deciding we’re not going to follow the PEP. We’d (I’d?) rather not just throw out what the PEPs say so we generally want to follow things.
And that's a good thing, thanks! Given the importance of PyPI today in the python community, I think the way how PyPI interacts with tools and installers deserves PEPs.
Yea I agree, which is why I’m trying to figure out how to do PEPs without making them feel more like handcuffs than useful tools :)
I have plans (and even a branch!) started to further enhance the multiple repository support in pip. A lot of that is modeled after what yum and apt-get have as far as options go. I am completely and unequivocally against things which mandate much at all to what UX pip presents for these things because I think we can better serve our users by being able to make our own UX decisions. After my experiences with a mandated UX from a PEP I’m at the point where personally I’ll ignore any such mandate in the future where I think there is a better option for pip.
PEPs are a form of helping collaboration and growth in a community but certainly not the only way and, if done badly, can do more damage than good.
best, holger
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Oct 8, 2014, at 6:06 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 05:44 -0400, Donald Stufft wrote:
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI.
Well, the PEP has two central paragraphs motivating multi-index operations:
The two common installer tools, pip and easy_install/setuptools, both support the concept of additional locations to search for files to satisfy the installation requirements and have done so for many years. This means that there is no need to "phase" in a new flag or concept and the solution to installing a project from a repository other than PyPI will function regardless of how old (within reason) the end user's installer is. Not only has this concept existed in the Python tooling for some time, but it is a concept that exists across languages and even extending to the OS level with OS package tools almost universally using multiple repository support making it extremely likely that someone is already familiar with the concept.
Additionally, the multiple repository approach is a concept that is useful outside of the narrow scope of allowing projects which wish to be included on the index portion of PyPI but do not wish to utilize the repository portion of PyPI. This includes places where a company may wish to host a repository that contains their internal packages or where a project may wish to have multiple "channels" of releases, such as alpha, beta, release candidate, and final release.
and then it concretely suggests "--extra-index-url" and gives an example. It does not say that this is only good if you are using private projects that have a presence on PyPI. It rather suggests multi-index is the thing to go for today, generally, does it not?
Given that PyPI is a wiki and Linux Distros are a curated index, i insist it's dangerous to recommend to mix multiple indexes with pip if you don't know quite exactly what you are doing. Do you really disagree on this?
It is not dangerous to mix multiple indexes in the case that PEP 470 is specifying, which is when you want to have files for a project listed on the PyPI index hosted on a different repository. The use of --extra-index-url in PEP 470 is to show how someone would add one of the extra repositories for a project that is indexed on PyPI, which is again roughly as safe as installing from PyPI at all. If you use the multiple repository support to install things which are not claimed on PyPI and you do not disable the PyPI index, then yes that is dangerous. It also has nothing to do with whether it's safe for someone to add an additional repository that points to the repository that PIL is located at. I've also never suggested to anyone that their company should rely on PyPI and instead I point them towards either making their own repository with Apache/Index/Twisted Web or using devpi. My goal is to make PyPI as safe as possible for people who don't do that, but there are limits to what is possible. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Wed, Oct 08, 2014 at 06:24 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 6:06 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 05:44 -0400, Donald Stufft wrote:
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI.
Well, the PEP has two central paragraphs motivating multi-index operations:
The two common installer tools, pip and easy_install/setuptools, both support the concept of additional locations to search for files to satisfy the installation requirements and have done so for many years. This means that there is no need to "phase" in a new flag or concept and the solution to installing a project from a repository other than PyPI will function regardless of how old (within reason) the end user's installer is. Not only has this concept existed in the Python tooling for some time, but it is a concept that exists across languages and even extending to the OS level with OS package tools almost universally using multiple repository support making it extremely likely that someone is already familiar with the concept.
Additionally, the multiple repository approach is a concept that is useful outside of the narrow scope of allowing projects which wish to be included on the index portion of PyPI but do not wish to utilize the repository portion of PyPI. This includes places where a company may wish to host a repository that contains their internal packages or where a project may wish to have multiple "channels" of releases, such as alpha, beta, release candidate, and final release.
and then it concretely suggests "--extra-index-url" and gives an example. It does not say that this is only good if you are using private projects that have a presence on PyPI. It rather suggests multi-index is the thing to go for today, generally, does it not?
Given that PyPI is a wiki and Linux Distros are a curated index, i insist it's dangerous to recommend to mix multiple indexes with pip if you don't know quite exactly what you are doing. Do you really disagree on this?
It is not dangerous to mix multiple indexes in the case that PEP 470 is specifying, which is when you want to have files for a project listed on the PyPI index hosted on a different repository.
Yes, that case is not more dangerous than today.
The use of --extra-index-url in PEP 470 is to show how someone would add one of the extra repositories for a project that is indexed on PyPI, which is again roughly as safe as installing from PyPI at all.
Then we are reading the sections i cite above very differently -- IMO you and the PEP generally push for multi-index ops without explaining the risks. Maybe someone else can chime in. best, holger
If you use the multiple repository support to install things which are not claimed on PyPI and you do not disable the PyPI index, then yes that is dangerous. It also has nothing to do with whether it's safe for someone to add an additional repository that points to the repository that PIL is located at.
I've also never suggested to anyone that their company should rely on PyPI and instead I point them towards either making their own repository with Apache/Index/Twisted Web or using devpi. My goal is to make PyPI as safe as possible for people who don't do that, but there are limits to what is possible.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 8 October 2014 20:33, holger krekel <holger@merlinux.eu> wrote:
Then we are reading the sections i cite above very differently -- IMO you and the PEP generally push for multi-index ops without explaining the risks.
Note that this explanation is present in the PEP: Currently both pip and setuptools implement multiple repository support by using the best installation candidate it can find from either repository, essentially treating it as if it were one large repository. Is it mainly that you would like the consequences of that in terms of any listed index being able to provide any requested package to be spelled out more clearly? Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
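The behaviour quoted from the PEP (pooling candidates from every configured repository and taking the best one) can be illustrated with a small model. This is only a sketch under stated assumptions: the `Candidate` class, the `best_candidate` helper, the index labels, and the version numbers are all hypothetical, and real installers apply far more elaborate version parsing and compatibility checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # simplified, e.g. (1, 0); real tools parse version strings
    index: str      # label of the repository that offered this file


def best_candidate(name, indexes):
    """Pool candidates from every index and return the highest version.

    This models "treating it as if it were one large repository": the
    index a file came from plays no part in the choice.
    """
    pool = [c for cands in indexes.values() for c in cands if c.name == name]
    if not pool:
        return None
    return max(pool, key=lambda c: c.version)


# Hypothetical data: a private package plus a same-named public upload.
indexes = {
    "private": [Candidate("mypackage", (1, 0), "private")],
    "public": [Candidate("mypackage", (99, 0), "public")],
}

chosen = best_candidate("mypackage", indexes)
print(chosen.index, chosen.version)
```

In this model the public upload wins simply because its version is higher, which is the consequence being debated in the thread.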
On 8 October 2014 11:33, holger krekel <holger@merlinux.eu> wrote:
The use of --extra-index-url in PEP 470 is to show how someone would add one of the extra repositories for a project that is indexed on PyPI, which is again roughly as safe as installing from PyPI at all.
Then we are reading the sections i cite above very differently -- IMO you and the PEP generally push for multi-index ops without explaining the risks.
Maybe someone else can chime in.
Chiming in because you asked for other opinions, although I've not yet read to the end of the thread... I read this section, and indeed the whole of the PEP, as basically saying: 1. We have a problem because PEP 438 didn't turn out so well in practice. 2. We have an existing mechanism (multi-index support). 3. The existing mechanism can be used as follows to better solve the problem PEP 438 tried to solve. I don't see any "encouragement" to use multi-index support, other than in the specific case PEP 438 was aimed at. Obviously PEP 470 raises the profile of multi-index support, which might cause people to use it ill-advisedly in inappropriate situations, but that's not the fault of PEP 470, and I don't want to see PEP 470 filled with warnings about how *other* uses of multi-index support might be inappropriate, because that will distract from the core message that is "we can fix the external hosting issue without needing clients to add a new mechanism". Paul
On Oct 8, 2014, at 7:03 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 8 October 2014 11:33, holger krekel <holger@merlinux.eu> wrote:
The use of --extra-index-url in PEP 470 is to show how someone would add one of the extra repositories for a project that is indexed on PyPI, which is again roughly as safe as installing from PyPI at all.
Then we are reading the sections i cite above very differently -- IMO you and the PEP generally push for multi-index ops without explaining the risks.
Maybe someone else can chime in.
Chiming in because you asked for other opinions, although I've not yet read to the end of the thread...
I read this section, and indeed the whole of the PEP, as basically saying:
1. We have a problem because PEP 438 didn't turn out so well in practice. 2. We have an existing mechanism (multi-index support). 3. The existing mechanism can be used as follows to better solve the problem PEP 438 tried to solve.
I don't see any "encouragement" to use multi-index support, other than in the specific case PEP 438 was aimed at. Obviously PEP 470 raises the profile of multi-index support, which might cause people to use it ill-advisedly in inappropriate situations, but that's not the fault of PEP 470, and I don't want to see PEP 470 filled with warnings about how *other* uses of multi-index support might be inappropriate, because that will distract from the core message that is "we can fix the external hosting issue without needing clients to add a new mechanism".
Paul
This is more or less exactly what I intend the PEP to say (and what I think it does say). --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 8 October 2014 20:06, holger krekel <holger@merlinux.eu> wrote:
Given that PyPI is a wiki and Linux Distros are a curated index, i insist it's dangerous to recommend to mix multiple indexes with pip if you don't know quite exactly what you are doing. Do you really disagree on this?
Hence this line in the PEP: End users wishing to limit what files they pull from which repository can simply use devpi to whitelist projects from PyPI or another repository. Anyone running a private PyPI mirror without disabling the use of upstream indexes entirely is already running their infrastructure in a dangerously insecure configuration. That has nothing to do with PEP 470. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
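As a rough illustration of what per-project whitelisting means, here is a sketch of the "private first" rule proposed earlier in the thread: if the private index has links for a name, public indexes are ignored unless that name is explicitly whitelisted. The function name, arguments, and index labels are hypothetical, and this is not a description of how devpi or pip actually implement anything.

```python
def select_indexes(name, private_names, public_names, whitelist=frozenset()):
    """Return which indexes may serve ``name`` under a private-first rule.

    Assumed rule (a sketch of the proposal, not a real tool's behaviour):
    a name present on the private index is served only from there, unless
    the name is whitelisted, in which case public links are merged in.
    """
    if name in private_names:
        if name in whitelist and name in public_names:
            return ["private", "public"]
        return ["private"]
    if name in public_names:
        return ["public"]
    return []


# Hypothetical index contents.
private = {"mypackage", "requests"}
public = {"mypackage", "requests", "django"}

print(select_indexes("mypackage", private, public))
print(select_indexes("requests", private, public, whitelist={"requests"}))
print(select_indexes("django", private, public))
```

With this rule a public upload named "mypackage" is never considered, whatever its version, while "requests" is merged from both sources only because it was whitelisted.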
On 8 October 2014 19:44, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 4:44 AM, holger krekel <holger@merlinux.eu> wrote: I am sorry if raising the issue of private/public compromises sounds like FUD to you. From my experience it's a real attack vector. I talked about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people got back to me afterwards, surprised.
And I don't think you can successfully ask people in companies around the world to register private package names publicly (let alone the issue of clashes etc.). Admit it, that's even more unlikely than people using some PEP438 features :)
And yes, if someone forgets to set the private index he could still pull in malicious public links even with devpi or new pip options.
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI. The attack vector you’re describing isn’t possible at all for any project that is affected by PEP 470, which are projects which wish to register themselves in the PyPI index without using PyPI as their repository.
From my perspective, there's also the question of "relative risk". As soon as anyone is installing anything at all directly from PyPI, their perceived threat level should already be through the roof. Why? Because of .pth files.
.pth files run automatically at every interpreter startup, so if you're installing directly from PyPI, then *every single package you install* has the power to completely subvert your application (if you're installing into an application specific virtual environment) or your entire Python installation (if you're installing into a shared Python instance). Even if they don't install malicious .pth files, then many of them are going to get imported at some point anyway, so they're going to be able to reach out and do whatever they want to the Python level internal process state.
"pip install python-nation" helps illustrate the degree to which we're generally trusting folks uploading stuff to PyPI to not be evil, and that level of trust also extends to folks providing external repositories rather than hosting directly on PyPI.
If folks are more worried about the risk of PyPI or a third party repo shadowing their private packages than they are about malicious .pth files or generally malicious runtime behaviour in dependencies, then I strongly believe their threat meters need recalibrating. We focus on MITM attacks in the upstream infrastructure, because if *developers* are actively malicious, then you're already hosed - they don't need to do anything clever, they can just decide to own your system as a side effect of running their code. (Most of them won't, which is why the risk is low in practice. But as far as theoretical attacks go, this is near the top of my personal threat model, just behind third party MITM attacks)
If folks are using Python in a context where these risks are unacceptable to them, then they should either be getting their packages via a trusted third party (like a community or commercial Linux distribution, or a commercial Python redistributor), or at least using a PyPI caching proxy with whitelisting support (which is why PEP 470 recommends the use of devpi in conjunction with turning off the default index).
Establishing trustworthiness is expensive and relatively slow, which is why PyPI, like a lot of language specific distribution systems, doesn't currently offer it as a feature. The generally more limited and older selection of packages in the redistributor channels reflects some of the impact of those additional costs. Bringing those costs down is something we're going to have to fix on the redistributor side - upstream initiatives like PEP 426 may help, but a lot of it is going to be a matter of Linux distros reassessing what services we're able to provide to the wider open source community. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
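The point about .pth files can be demonstrated harmlessly. The sketch below writes a throwaway .pth file into a temporary directory and asks the standard `site` module to process it, which is the same handling site-packages directories get at interpreter startup. The marker variable name and helper function are hypothetical; the only side effect here is setting an environment variable, but an installed package could place arbitrary code on such a line.

```python
import os
import site
import sys
import tempfile

MARKER = "PTH_DEMO_RAN"  # hypothetical, harmless marker variable


def demo_pth_execution():
    """Create a .pth file and let ``site`` process it; return the marker."""
    os.environ.pop(MARKER, None)
    original_path = list(sys.path)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            pth_path = os.path.join(tmp, "demo.pth")
            with open(pth_path, "w") as fh:
                # Lines in a .pth file that begin with "import" are executed.
                fh.write(f"import os; os.environ[{MARKER!r}] = '1'\n")
            # addsitedir() scans the directory for .pth files and handles
            # them the way site-packages is handled at startup.
            site.addsitedir(tmp)
    finally:
        # Undo the sys.path entry that addsitedir() appended.
        sys.path[:] = original_path
    return os.environ.get(MARKER)


result = demo_pth_execution()
print(result)
```

Nothing in the script imports or calls the code in `demo.pth` explicitly; it runs purely because the file sits in a directory that `site` processes.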
On Wed, Oct 08, 2014 at 20:27 +1000, Nick Coghlan wrote:
On 8 October 2014 19:44, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 4:44 AM, holger krekel <holger@merlinux.eu> wrote: I am sorry if raising the issue of private/public compromises sounds like FUD to you. From my experience it's a real attack vector. I talked about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people got back to me afterwards, surprised.
And I don't think you can successfully ask people in companies around the world to register private package names publicly (let alone the issue of clashes etc.). Admit it, that's even more unlikely than people using some PEP438 features :)
And yes, if someone forgets to set the private index he could still pull in malicious public links even with devpi or new pip options.
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI. The attack vector you’re describing isn’t possible at all for any project that is affected by PEP 470, which are projects which wish to register themselves in the PyPI index without using PyPI as their repository.
From my perspective, there's also the question of "relative risk". As soon as anyone is installing anything at all directly from PyPI, their perceived threat level should already be through the roof. Why? Because of .pth files.
Well, for installing NAME from pypi you need to trust that the people who registered and maintain NAME are not doing something bad (and the machine is not compromised but in that case all bets are off obviously). And i can make a choice to trust "django", "flask", "warehouse" and other pypi names. I am exposing myself to whatever the maintainers published but it's my choice. This is a very different thing compared to:
pip install --extra-index http://private.repo mypackage
I may think i am trusting just "mypackage" from my private repo. But in fact i am betting on nobody uploading "mypackage" to the pypi wiki. I don't think this is very obvious to many -- it certainly wasn't at EuroPython2014.
best, holger
.pth files run automatically at every interpreter startup, so if you're installing directly from PyPI, then *every single package you install* has the power to completely subvert your application (if you're installing into an application specific virtual environment) or your entire Python installation (if you're installing into a shared Python instance). Even if they don't install malicious .pth files, then many of them are going to get imported at some point anyway, so they're going to be able to reach out and do whatever they want to the Python level internal process state.
"pip install python-nation" helps illustrate the degree to which we're generally trusting folks uploading stuff to PyPI to not be evil, and that level of trust also extends to folks providing external repositories rather than hosting directly on PyPI.
If folks are more worried about the risk of PyPI or a third party repo shadowing their private packages than they are about malicious .pth files or generally malicious runtime behaviour in dependencies, then I strongly believe their threat meters need recalibrating. We focus on MITM attacks in the upstream infrastructure, because if *developers* are actively malicious, then you're already hosed - they don't need to do anything clever, they can just decide to own your system as a side effect of running their code. (Most of them won't, which is why the risk is low in practice. But as far as theoretical attacks go, this is near the top of my personal threat model, just behind third party MITM attacks)
If folks are using Python in a context where these risks are unacceptable to them, then they should either be getting their packages via a trusted third party (like a community or commercial Linux distribution, or a commercial Python redistributor), or at least using a PyPI caching proxy with whitelisting support (which is why PEP 470 recommends the use of devpi in conjunction with turning off the default index).
Establishing trustworthiness is expensive and relatively slow, which is why PyPI, like a lot of language specific distribution systems, doesn't currently offer it as a feature. The generally more limited and older selection of packages in the redistributor channels reflects some of the impact of those additional costs. Bringing those costs down is something we're going to have to fix on the redistributor side - upstream initiatives like PEP 426 may help, but a lot of it is going to be a matter of Linux distros reassessing what services we're able to provide to the wider open source community.
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 8 October 2014 20:57, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 20:27 +1000, Nick Coghlan wrote: Well, for installing NAME from pypi you need to trust that the people who registered and maintain NAME are not doing something bad (and the machine is not compromised but in that case all bets are off obviously). And i can make a choice to trust "django", "flask", "warehouse" and other pypi names. I am exposing myself to whatever the maintainers published but it's my choice. This is a very different thing compared to:
pip install --extra-index http://private.repo mypackage
I may think i am trusting just "mypackage" from my private repo. But in fact i am betting on nobody uploading "mypackage" to the pypi wiki. I don't think this is very obvious to many -- it certainly wasn't at EuroPython2014.
So your concern is specifically with the fact that some users are not currently aware that "--extra-index" adds an *extra* index (which can then supply *any* package, as can the default index), and not a *replacement* index, and that they need to use --index-url in order to completely override the default index?
Would you be more comfortable if the existing admonition in PEP 470 to use a private devpi instance with whitelisting in situations with a low security risk tolerance was accompanied by a concrete example that noted the appropriate option to use for private index URLs?:
pip install --index-url private-repo.example.com mypackage
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Oct 08, 2014 at 21:22 +1000, Nick Coghlan wrote:
On 8 October 2014 20:57, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 20:27 +1000, Nick Coghlan wrote: Well, for installing NAME from pypi you need to trust that the people who registered and maintain NAME are not doing something bad (and the machine is not compromised but in that case all bets are off obviously). And i can make a choice to trust "django", "flask", "warehouse" and other pypi names. I am exposing myself to whatever the maintainers published but it's my choice. This is a very different thing compared to:
pip install --extra-index http://private.repo mypackage
I may think i am trusting just "mypackage" from my private repo. But in fact i am betting on nobody uploading "mypackage" to the pypi wiki. I don't think this is very obvious to many -- it certainly wasn't at EuroPython2014.
So your concern is specifically with the fact that some users are not currently aware that "--extra-index" adds an *extra* index (which can then supply *any* package, as can the default index), and not a *replacement* index, and that they need to use --index-url in order to completely override the default index?
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index. I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
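The merging concern can be illustrated with a deliberately simplified model. This is a sketch of the behaviour described in this thread, not pip's actual implementation: the index contents, the `pick` helper, and the naive dotted-integer version parsing are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Index = Dict[str, List[str]]

# Hypothetical index contents, invented for this illustration.
PUBLIC_INDEX: Index = {"mypackage": ["99.0"], "flask": ["0.10.1"]}
PRIVATE_INDEX: Index = {"mypackage": ["1.2"]}


def parse(version: str) -> Tuple[int, ...]:
    """Naive version parsing: dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def pick(name: str, indexes: List[Tuple[str, Index]]) -> Tuple[str, str]:
    """Merge candidates from every index and return (version, source) of the newest."""
    candidates = [
        (parse(version), version, label)
        for label, index in indexes
        for version in index.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"no candidates found for {name!r}")
    _, version, label = max(candidates, key=lambda c: c[0])
    return version, label


# Extra index: public and private candidates are merged, so a higher
# version uploaded publicly shadows the private package.
print(pick("mypackage", [("public", PUBLIC_INDEX), ("private", PRIVATE_INDEX)]))

# Replacement index: only the private index is consulted.
print(pick("mypackage", [("private", PRIVATE_INDEX)]))
```

In this model the first call selects the public `99.0` release even though the user only meant to trust the private index for `mypackage`, while the second call can only ever return the private `1.2`.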
Would you be more comfortable if the existing admonition in PEP 470 to use a private devpi instance with whitelisting in situations with a low security risk tolerance was accompanied by a concrete example that noted the appropriate option to use for private index URLs?:
pip install --index-url private-repo.example.com mypackage
I rather think the whole rationale "Why additional repositories?" section of the PEP needs a re-work and specifically not recommend --extra-index-url. Contrary to what Donald and Paul claim i don't see it discussing just the particular issue of using extra indexes for publicly registered packages: http://legacy.python.org/dev/peps/pep-0470/#why-additional-repositories best, holger
On 8 October 2014 12:40, holger krekel <holger@merlinux.eu> wrote:
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
Bluntly, that's irrelevant. That's how pip works. Maybe it's not the best way, maybe a feature request for pip would be worth pursuing, maybe you could even argue that it's a security issue with pip. But it's not relevant to this PEP, which simply says that "for this *specific* problem, multi-index support is a viable solution". Asking for a change in behaviour from pip in this specific case is not what the PEP is about. Actually, pip's behaviour in general is not subject to the PEP process (as Donald pointed out, trying to make it be is what got PEP 438 in trouble). Paul
On Wed, Oct 08, 2014 at 13:05 +0100, Paul Moore wrote:
On 8 October 2014 12:40, holger krekel <holger@merlinux.eu> wrote:
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
Bluntly, that's irrelevant.
I disagree. The PEP uses merging of public and private links in the main rationale section which comes before discussing migration strategies. It's used as motivation aka "look how easy it is to use additional/multi indexes" and not as a particular migration strategy that shouldn't be used otherwise.
That's how pip works. Maybe it's not the best way, maybe a feature request for pip would be worth pursuing, maybe you could even argue that it's a security issue with pip. But it's not relevant to this PEP, which simply says that "for this *specific* problem, multi-index support is a viable solution". Asking for a change in behaviour from pip in this specific case is not what the PEP is about. Actually, pip's behaviour in general is not subject to the PEP process (as Donald pointed out, trying to make it be is what got PEP 438 in trouble).
Well, for one i think "--extra-index-url" is indeed broken UI exposing people to compromise without any warning. Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python. best, holger
On Oct 8, 2014, at 8:17 AM, holger krekel <holger@merlinux.eu> wrote:
Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python.
We’re not “putting ourselves outside of PEP reach”. We are an external project and we are not bound by the PEP process. Devpi, py.test, Django, requests, etc are also not bound by the PEP process. I was worried this might be used to try and force pip to adhere to PEPs which is why PEP 453 explicitly mentions this fact. http://legacy.python.org/dev/peps/pep-0453/#policies-governance “The maintainers of the bootstrapped software and the CPython core team will work together in order to address the needs of both. The bootstrapped software will still remain external to CPython and this PEP does not include CPython subsuming the development responsibilities or design decisions of the bootstrapped software. This PEP aims to decrease the burden on end users wanting to use third-party packages and the decisions inside it are pragmatic ones that represent the trust that the Python community has already placed in the Python Packaging Authority as the authors and maintainers of pip, setuptools, PyPI, virtualenv and other related projects.” --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 8 October 2014 22:22, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 8:17 AM, holger krekel <holger@merlinux.eu> wrote:
Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python.
We’re not “putting ourselves outside of PEP reach”. We are an external project and we are not bound by the PEP process. Devpi, py.test, Django, requests, etc are also not bound by the PEP process.
Note also that even for CPython itself, it is *up to us as core developers* to decide when something needs to be escalated through the PEP process. The vast majority of CPython changes are handled directly through the issue tracker, and there's still the occasional change that doesn't even make it that far (e.g. if we notice a problem while working on something else, we have the option of just committing the fix directly). PEPs are primarily for changes which have broad ecosystem implications where the additional overhead is justified. We don't write PEPs for every change to the CPython command line interface (e.g. there's no PEP for isolated mode), and the same kind of assessment of external impact applies to pip and the PyPA in general when deciding whether a change can be handled within the scope of an individual project, or if it needs to be escalated for broader discussion. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 08.10.2014 14:30, Nick Coghlan wrote:
On 8 October 2014 22:22, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 8:17 AM, holger krekel <holger@merlinux.eu> wrote:
Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python.
We’re not “putting ourselves outside of PEP reach”. We are an external project and we are not bound by the PEP process. Devpi, py.test, Django, requests, etc are also not bound by the PEP process.
Note also that even for CPython itself, it is *up to us as core developers* to decide when something needs to be escalated through the PEP process. The vast majority of CPython changes are handled directly through the issue tracker, and there's still the occasional change that doesn't even make it that far (e.g. if we notice a problem while working on something else, we have the option of just committing the fix directly).
PEPs are primarily for changes which have broad ecosystem implications where the additional overhead is justified. We don't write PEPs for every change to the CPython command line interface (e.g. there's no PEP for isolated mode), and the same kind of assessment of external impact applies to pip and the PyPA in general when deciding whether a change can be handled within the scope of an individual project, or if it needs to be escalated for broader discussion.
I don't follow Donald's reasoning and I'm not sure I understand whether your comments are meant as clarification of pip being subject to the PEP process or support for Donald's reasoning :-) Changes to pip and PyPI *do* have a global effect on the Python ecosystem and thus need to be covered by the PEP process. If pip decides to go with a strategy that ignores this, I think we have a problem. The core developers put trust into pip when allowing it to (effectively) get distributed with Python and making it the default Python packaging manager. Please use that trust with the appropriate care and respect. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
On Oct 8, 2014, at 8:55 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 08.10.2014 14:30, Nick Coghlan wrote:
On 8 October 2014 22:22, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 8:17 AM, holger krekel <holger@merlinux.eu> wrote:
Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python.
We’re not “putting ourselves outside of PEP reach”. We are an external project and we are not bound by the PEP process. Devpi, py.test, Django, requests, etc are also not bound by the PEP process.
Note also that even for CPython itself, it is *up to us as core developers* to decide when something needs to be escalated through the PEP process. The vast majority of CPython changes are handled directly through the issue tracker, and there's still the occasional change that doesn't even make it that far (e.g. if we notice a problem while working on something else, we have the option of just committing the fix directly).
PEPs are primarily for changes which have broad ecosystem implications where the additional overhead is justified. We don't write PEPs for every change to the CPython command line interface (e.g. there's no PEP for isolated mode), and the same kind of assessment of external impact applies to pip and the PyPA in general when deciding whether a change can be handled within the scope of an individual project, or if it needs to be escalated for broader discussion.
I don't follow Donald's reasoning and I'm not sure I understand whether your comments are meant as clarification of pip being subject to the PEP process or support for Donald's reasoning :-)
Changes to pip and PyPI *do* have a global effect on the Python ecosystem and thus need to be covered by the PEP process.
If pip decides to go with a strategy that ignores this, I think we have a problem. The core developers put trust into pip when allowing it to (effectively) get distributed with Python and making it the default Python packaging manager. Please use that trust with the appropriate care and respect.
I don’t think we’ve *ever* not used that trust with care and respect and we’ve been trusted by the Python community for far longer than PEP 453 has existed. We attempt to follow PEPs where we can and where they make good sense. Nobody on the pip team is saying we’re going to flat out ignore PEPs or whatever. We are (or at least I am) saying that dictating UX via the PEP process has been shown to us *not* to work and that we are not obligated to implement or listen to a PEP. It was explicitly spelled out in PEP 453 that we remain an external project even though we’re now bundled with Python. This does not mean we won’t generally try to use the PEP process where our changes have cross cutting concerns between different projects, but it does mean that we implement or follow PEPs at our discretion. This isn’t up for debate; it was an explicit inclusion in PEP 453, and if there was a problem with pip maintaining its own project the time to bring that up was a year ago. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 08.10.2014 15:05, Donald Stufft wrote:
On Oct 8, 2014, at 8:55 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 08.10.2014 14:30, Nick Coghlan wrote:
On 8 October 2014 22:22, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 8:17 AM, holger krekel <holger@merlinux.eu> wrote:
Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python.
We’re not “putting ourselves outside of PEP reach”. We are an external project and we are not bound by the PEP process. Devpi, py.test, Django, requests, etc are also not bound by the PEP process.
Note also that even for CPython itself, it is *up to us as core developers* to decide when something needs to be escalated through the PEP process. The vast majority of CPython changes are handled directly through the issue tracker, and there's still the occasional change that doesn't even make it that far (e.g. if we notice a problem while working on something else, we have the option of just committing the fix directly).
PEPs are primarily for changes which have broad ecosystem implications where the additional overhead is justified. We don't write PEPs for every change to the CPython command line interface (e.g. there's no PEP for isolated mode), and the same kind of assessment of external impact applies to pip and the PyPA in general when deciding whether a change can be handled within the scope of an individual project, or if it needs to be escalated for broader discussion.
I don't follow Donald's reasoning and I'm not sure I understand whether your comments are meant as clarification of pip being subject to the PEP process or support for Donald's reasoning :-)
Changes to pip and PyPI *do* have a global effect on the Python ecosystem and thus need to be covered by the PEP process.
If pip decides to go with a strategy that ignores this, I think we have a problem. The core developers put trust into pip when allowing it to (effectively) get distributed with Python and making it the default Python packaging manager. Please use that trust with the appropriate care and respect.
I don’t think we’ve *ever* not used that trust with care and respect and we’ve been trusted by the Python community for far longer than PEP 453 has existed. We attempt to follow PEPs where we can and where they make good sense. Nobody on the pip team is saying we’re going to flat out ignore PEPs or whatever.
We are (or at least I am) saying that dictating UX via the PEP process has been shown to us *not* to work and that we are not obligated to implement or listen to a PEP. It was explicitly spelled out in PEP 453 that we remain an external project even though we’re now bundled with Python. This does not mean we won’t generally try to use the PEP process where our changes have cross cutting concerns between different projects, but it does mean that we implement or follow PEPs at our discretion. This isn’t up for debate; it was an explicit inclusion in PEP 453, and if there was a problem with pip maintaining its own project the time to bring that up was a year ago.
The intention of PEP 453 was to enable pip to evolve independent of the Python release process, which is a good thing. However, your comment that "We are an external project and we are not bound by the PEP process." doesn't really pan out in the light of the PEP's requirement that "The maintainers of the bootstrapped software and the CPython core team will work together in order to address the needs of both." If pip maintainers don't feel they are bound by PEPs, you could argue that you are also not bound by PEP 453, which would result in a rather pointless cooperation setup :-) Note that I'm not trying to say that you are actually not respecting the PEP process. I'm just concerned about comments like the above causing unnecessary heat in discussions. I'd also like to request that you take Holger's concerns more seriously, perhaps add him as PEP author and let him participate in clarifying it (if he still feels like investing time in this). PEPs are never perfect and there's always room for improvement. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source
On 8 Oct 2014 23:40, "M.-A. Lemburg" <mal@egenix.com> wrote:
The intention of PEP 453 was to enable pip to evolve independent of the Python release process, which is a good thing.
However, your comment that "We are an external project and we are not bound by the PEP process." doesn't really pan out in the light of the PEP's requirement that "The maintainers of the bootstrapped software and the CPython core team will work together in order to address the needs of both."
If pip maintainers don't feel they are bound by PEPs, you could argue that you are also not bound by PEP 453, which would result in a rather pointless cooperation setup :-)
Note that I'm not trying to say that you are actually not respecting the PEP process. I'm just concerned about comments like the above causing unnecessary heat in discussions.
pip's UX decisions aren't likely to ever be put through the PEP process again - the PEP 426 (and now PEP 470) model of providing functional requirements and recommendations in the form of MUST and SHOULD statements is a cleaner process, since they provide guidance for all clients, not just pip, and leave the *details* of the UX to the normal pip development cycle (so if user feedback indicates a problem with the specifics of the initial approach, they can address that while remaining compliant with the specification). Decoupling functional specifications from UX details of individual tools is a good idea in general, this is just applying that model to pip and the PEP process in particular. PyPI needs to be covered in more detail, however, as these PEPs also serve as the *interface* specification for both clients and servers, and those need concrete API definitions to work with. PEP 438 was the main case so far where the PEP included specific UX design details for pip, and that's the aspect that *won't* be repeated. Regards, Nick.
I'd also like to request that you take Holger's concerns more seriously, perhaps add him as PEP author and let him participate in clarifying it (if he still feels like investing time in this).
PEPs are never perfect and there's always room for improvement.
Thanks, -- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source
On 08.10.2014 15:59, Nick Coghlan wrote:
On 8 Oct 2014 23:40, "M.-A. Lemburg" <mal@egenix.com> wrote:
The intention of PEP 453 was to enable pip to evolve independent of the Python release process, which is a good thing.
However, your comment that "We are an external project and we are not bound by the PEP process." doesn't really pan out in the light of the PEP's requirement that "The maintainers of the bootstrapped software and the CPython core team will work together in order to address the needs of both."
If pip maintainers don't feel they are bound by PEPs, you could argue that you are also not bound by PEP 453, which would result in a rather pointless cooperation setup :-)
Note that I'm not trying to say that you are actually not respecting the PEP process. I'm just concerned about comments like the above causing unnecessary heat in discussions.
pip's UX decisions aren't likely to ever be put through the PEP process again - the PEP 426 (and now PEP 470) model of providing functional requirements and recommendations in the form of MUST and SHOULD statements is a cleaner process, since they provide guidance for all clients, not just pip, and leave the *details* of the UX to the normal pip development cycle (so if user feedback indicates a problem with the specifics of the initial approach, they can address that while remaining compliant with the specification).
Decoupling functional specifications from UX details of individual tools is a good idea in general, this is just applying that model to pip and the PEP process in particular.
IMO, specific user interface questions are PEP relevant if they affect the way users interact with the Python ecosystem. This doesn't mean mandating specific option names, but e.g. --using-silly-long-options-that-scare-away-users does have PEP relevance. A PEP would have to address such user interface designs by defining whether to encourage or discourage certain uses. And, of course, pip as the officially sanctioned Python installer would need to implement these requirements.
PyPI needs to be covered in more detail, however, as these PEPs also serve as the *interface* specification for both clients and servers, and those need concrete API definitions to work with.
PEP 438 was the main case so far where the PEP included specific UX design details for pip, and that's the aspect that *won't* be repeated.
PEPs are not set in stone. They can be updated and replaced with new ones. That's why they are called "Python Enhancement *Proposals*" :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source
On Oct 8, 2014, at 9:40 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 08.10.2014 15:05, Donald Stufft wrote:
On Oct 8, 2014, at 8:55 AM, M.-A. Lemburg <mal@egenix.com> wrote:
On 08.10.2014 14:30, Nick Coghlan wrote:
On 8 October 2014 22:22, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 8:17 AM, holger krekel <holger@merlinux.eu> wrote:
Also, i am worried on principle grounds if pip maintainers are putting themselves outside PEP reach, yet pip is distributed along with Python.
We’re not “putting ourselves outside of PEP reach”. We are an external project and we are not bound by the PEP process. Devpi, py.test, Django, requests, etc are also not bound by the PEP process.
Note also that even for CPython itself, it is *up to us as core developers* to decide when something needs to be escalated through the PEP process. The vast majority of CPython changes are handled directly through the issue tracker, and there's still the occasional change that doesn't even make it that far (e.g. if we notice a problem while working on something else, we have the option of just committing the fix directly).
PEPs are primarily for changes which have broad ecosystem implications where the additional overhead is justified. We don't write PEPs for every change to the CPython command line interface (e.g. there's no PEP for isolated mode), and the same kind of assessment of external impact applies to pip and the PyPA in general when deciding whether a change can be handled within the scope of an individual project, or if it needs to be escalated for broader discussion.
I don't follow Donald's reasoning and I'm not sure I understand whether your comments are meant as clarification of pip being subject to the PEP process or support for Donald's reasoning :-)
Changes to pip and PyPI *do* have a global effect on the Python ecosystem and thus need to be covered by the PEP process.
If pip decides to go with a strategy that ignores this, I think we have a problem. The core developers put trust into pip when allowing it to (effectively) get distributed with Python and making it the default Python packaging manager. Please use that trust with the appropriate care and respect.
I don’t think we’ve *ever* not used that trust with care and respect and we’ve been trusted by the Python community for far longer than PEP 453 has existed. We attempt to follow PEPs where we can and where they make good sense. Nobody on the pip team is saying we’re going to flat out ignore PEPs or whatever.
We are (or at least I am) saying that dictating UX via the PEP process has been shown to us *not* to work and that we are not obligated to implement or listen to a PEP. It was explicitly spelled out in PEP 453 that we remain an external project even though we’re now bundled with Python. This does not mean we won’t generally try to use the PEP process where our changes have cross cutting concerns between different projects, but it does mean that we implement or follow PEPs at our discretion. This isn’t up for debate; it was an explicit inclusion in PEP 453, and if there was a problem with pip maintaining its own project the time to bring that up was a year ago.
The intention of PEP 453 was to enable pip to evolve independent of the Python release process, which is a good thing.
However, your comment that "We are an external project and we are not bound by the PEP process." doesn't really pan out in the light of the PEP's requirement that "The maintainers of the bootstrapped software and the CPython core team will work together in order to address the needs of both."
If pip maintainers don't feel they are bound by PEPs, you could argue that you are also not bound by PEP 453, which would result in a rather pointless cooperation setup :-)
Note that I'm not trying to say that you are actually not respecting the PEP process. I'm just concerned about comments like the above causing unnecessary heat in discussions.
I feel like this whole “Is pip subject to PEPs” thing went way off the rails somewhere. Originally it was just “A PEP can’t mandate to an installer”, which is true; pip is the only installer bundled with Python and I try to write my PEPs to be installer agnostic. I think it also got bound up in the fact that I/we feel pretty strongly that dictating a UX in a PEP to pip doesn’t work (discovered through experience with PEP 438), and thus we’re unlikely to listen again to a PEP that dictates a UX that we feel is bad, because it’s turned out to be a bad idea for us (although it’s more likely that such a PEP wouldn’t get accepted to begin with, I think). Somewhere that morphed into “pip is not subject to PEPs”, I think with the suggestion that the changes Holger is asking for are specific to pip functionality and not to PEP 470, so he should raise those issues on the pip tracker (as they aren’t part of PEP 470 and pip doesn’t generally use the PEP process for our behavior). From there it snowballed into this argument, which I think is likely to just be circular and largely pointless, as I don’t think a situation where pip flat out refuses to follow a reasonable PEP (or even a request from python-dev) is likely to come up, so the difference is academic.
I'd also like to request that you take Holger's concerns more seriously, perhaps add him as PEP author and let him participate in clarifying it (if he still feels like investing time in this).
I take all concerns and feedback seriously else I wouldn’t spend the many hours I’ve spent just this morning responding to them. I don’t grok what Holger’s actual concern is so it’s hard to turn those concerns into anything actionable I can actually do on the PEP. I’ve asked for him (if he desires!) to give an actual example of something he’d change in the PEP to see if maybe that would make it clearer to me what he’s actually concerned about in relation to the PEP. I’m going to remove the one half a sentence that mentions a private repository in any capacity but I have a hard time believing that using it as one example in a list is what Holger’s concern is so I feel like that probably won’t fully address it.
PEPs are never perfect and there's always room for improvement.
Thanks, -- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try our new mxODBC.Connect Python Database Interface for free ! ::::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 08.10.2014 16:04, Donald Stufft wrote:
I'd also like to request that you take Holger's concerns more seriously, perhaps add him as PEP author and let him participate in clarifying it (if he still feels like investing time in this).
I take all concerns and feedback seriously else I wouldn’t spend the many hours I’ve spent just this morning responding to them. I don’t grok what Holger’s actual concern is so it’s hard to turn those concerns into anything actionable I can actually do on the PEP.
Holger has made his points very clear in his emails. If you don't follow/grok his reasoning it may indeed be better to have him edit the PEP to add his improvements/changes.
I share his view that it is not necessary to break existing setups to add multi-index support. This can be implemented as a simple extension to what we already have:
""" Simply add the possibility for authors to register external indexes, have pip, setuptools, et al. crawl these in addition to what's up on the PyPI package page (using the logic that has existed in these tools for years) and then let the author decide whether they want to remove existing downloads from PyPI or not. This allows for older installations to continue working, while also (optionally) supporting a setup which does not use PyPI for hosting at all. """
BTW: For eGenix we've chosen to use a different approach, one that is based on a Python web installer. I gave a talk about this at PyCon UK, in case you're interested: https://downloads.egenix.com/python/PyCon-UK-2014-Python-Web-Installer-Talk.... (talk video here: http://www.egenix.com/library/presentations/PyCon-UK-2014-Python-Web-Install...)
This solves the issues with the pip user experience for our packages, solves the download selection issues for the binaries, works with all Python versions we support and assures that the downloads are safe. It's still work in progress, but already quite usable.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source
On Oct 8, 2014, at 2:35 PM, M.-A. Lemburg <mal@egenix.com> wrote:
On 08.10.2014 16:04, Donald Stufft wrote:
I'd also like to request that you take Holger's concerns more seriously, perhaps add him as PEP author and let him participate in clarifying it (if he still feels like investing time in this).
I take all concerns and feedback seriously else I wouldn’t spend the many hours I’ve spent just this morning responding to them. I don’t grok what Holger’s actual concern is so it’s hard to turn those concerns into anything actionable I can actually do on the PEP.
Holger has made his points very clear in his emails.
If you don't follow/grok his reasoning it may indeed be better to have him edit the PEP to add his improvements/changes.
I share his view that it is not necessary to break existing setups to add multi-index support. This can be implemented as a simple extension to what we already have:
""" Simply add the possibility for authors to register external indexes, have pip, setuptools, et al. crawl these in addition to what's up on the PyPI package page (using the logic that has existed in these tools for years) and then let the author decide whether they want to remove existing downloads from PyPI or not.
This allows for older installations to continue working, while also (optionally) supporting a setup which does not use PyPI for hosting at all. """
His backwards compatibility point I understood completely, and I responded that I believe that maintaining backwards compatibility for a minority of projects, where that backwards compatibility is almost entirely unsafe, is not more important than making ``pip install`` safe, and that it is actively preventing us from making the pip repository code better. The PEP reflects that view. Holger and you may not agree with it and that's fine, but I'm not going to compromise that by maintaining compat for a tiny fraction of people/projects.
The thing I don't understand is Holger's worry about using the multi repository support for private projects. The PEP is entirely about using an existing feature for public projects. Private projects are mentioned a grand total of once in the entire PEP, and that's just saying that the ability to specify other repositories is also useful for... and then lists a couple of items, one of which is internal company repositories. So I'm not sure what I'm supposed to do with Holger's concerns about --extra-index-url, which don't apply to the PEP at all as far as I can tell, which is where I'm looking for some clarification. How does Holger want me to address the use of this feature, in a fashion mostly unrelated to the PEP, in this PEP?
BTW: For eGenix we've chosen to use a different approach, one that is based on a Python web installer. I gave a talk about this at PyCon UK, in case you're interested: https://downloads.egenix.com/python/PyCon-UK-2014-Python-Web-Installer-Talk.... (talk video here: http://www.egenix.com/library/presentations/PyCon-UK-2014-Python-Web-Install...) This solves the issues with the pip user experience for our packages, solves the download selection issues for the binaries, works with all Python versions we support and assures that the downloads are safe. It's still work in progress, but already quite usable.
I’ve been pointed at your web installer and poked at it a little bit. I don’t have any specific points about it since I only really skimmed it, hopefully you’re doing all that needs to be done to secure the downloads, but otherwise I’m glad you found something that gives you more control over the process and that still works well with the newer policies of things. Maybe it’d make sense to also explicitly mention this as an additional option? Is there a tool you’re using to manage all this or is it all one-off and specific to eGenix? --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 8 October 2014 19:35, M.-A. Lemburg <mal@egenix.com> wrote:
On 08.10.2014 16:04, Donald Stufft wrote:
I'd also like to request that you take Holger's concerns more seriously, perhaps add him as PEP author and let him participate in clarifying it (if he still feels like investing time in this).
I take all concerns and feedback seriously else I wouldn’t spend the many hours I’ve spent just this morning responding to them. I don’t grok what Holger’s actual concern is so it’s hard to turn those concerns into anything actionable I can actually do on the PEP.
Holger has made his points very clear in his emails.
If you don't follow/grok his reasoning it may indeed be better to have him edit the PEP to add his improvements/changes.
I share his view that it is not necessary to break existing setups to add multi-index support. This can be implemented as a simple extension to what we already have:
""" Simply add the possibility for authors to register external indexes, have pip, setuptools, et al. crawl these in addition to what's up on the PyPI package page (using the logic that has existed in these tools for years) and then let the author decide whether they want to remove existing downloads from PyPI or not.
This allows for older installations to continue working, while also (optionally) supporting a setup which does not use PyPI for hosting at all. """
OK, thanks for restating/clarifying. This was buried somewhat in the extended debate about security concerns and the implications of using multi-index support in contexts not relevant to the PEP.
For what it's worth, I am -1 on this suggested addition. My reasons are:
1. The additional complexity of crawling extra indexes like this makes it harder to write new tools, or ad hoc utilities (I know, I've tried :-)).
2. As the proposal stands, I don't see any way that I as a user can exercise any choice. Without inspecting the PyPI index page, I cannot know if "pip install foo" will access another website, which may be contrary to my company policy.
3. What if I want to prohibit that external access (maybe by adding a local index containing a verified copy of the package)? Unless I disable PyPI access, which may not be acceptable for other reasons, the crawl will still happen.
4. Dependency handling makes this even worse. What if a package I require, fully hosted on PyPI, depends on another one that is hosted elsewhere? How would I know?
That'll do for now. Maybe Holger has proposals to address these concerns, which would be fine. It's much easier to address specifics rather than debating general, badly understood points.
Once again, thanks for picking out the key point here.
Paul
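To make point 2 concrete, here is a minimal sketch of the kind of inspection a user would otherwise have to do by hand: given the HTML of a project's simple index page, list the hosts that its links point to other than the index itself. The page content, the URLs and the helper names (`LinkCollector`, `external_hosts`) are all made up for illustration; this is not how pip or any other installer actually works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_hosts(page_url, html):
    """Return the sorted hosts linked from the page that differ from the index host."""
    collector = LinkCollector()
    collector.feed(html)
    index_host = urlparse(page_url).netloc
    # Resolve relative links against the page URL before comparing hosts.
    hosts = {urlparse(urljoin(page_url, href)).netloc for href in collector.hrefs}
    return sorted(host for host in hosts if host and host != index_host)


# Made-up page: one file hosted on the index, one link pointing offsite.
SAMPLE_PAGE = """
<html><body>
<a href="../../packages/source/f/foo/foo-1.0.tar.gz">foo-1.0.tar.gz</a>
<a href="https://downloads.example.com/foo/">download page</a>
</body></html>
"""

hosts = external_hosts("https://pypi.example.org/simple/foo/", SAMPLE_PAGE)
print(hosts)
```

Even this only covers the page for one project; it says nothing about the pages of that project's dependencies, which is the problem raised in the last point of the list.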
On 8 October 2014 13:55, M.-A. Lemburg <mal@egenix.com> wrote:
If pip decides to go with a strategy that ignores this, I think we have a problem. The core developers put trust into pip when allowing it to (effectively) get distributed with Python and making it the default Python packaging manager. Please use that trust with the appropriate care and respect.
Just to clarify - the pip team (I hope I speak for all of us) fully understand the implications of being the de facto standard package manager. And we appreciate the trust placed in us by the fact that pip is distributed with Python. But at the same time, that trust was given on the basis that (presumably) we have a track record of doing things right, in an area that is notoriously full of heated discussions and conflicting opinions. So what we'd like to do is to continue handling things in the same way as always, working with the packaging community. In particular, that means that we did not align ourselves to the CPython development model (as it is designed for a very different community and set of problems). But we do want to adopt their good practices where possible and appropriate. One of those is the PEP process - but it's not entirely suitable (see the trail of PEPs from the distribute/packaging/distutils2 era, for why). So we're trying to get things right, and in the process we're learning - for example, the failure of PEP 438 taught us that specifying installer behaviour too closely in a PEP means we can't fix problems that are completely messing up our users. But we still believe in the PEP process (anyone who thinks otherwise hasn't noticed the amount of effort Donald, in particular, is putting into all the PEPs in progress). It doesn't mean that it can be treated as a way of forcing us not to do what we think is right for the pip user base, though. Paul.
On 08.10.2014 15:15, Paul Moore wrote:
On 8 October 2014 13:55, M.-A. Lemburg <mal@egenix.com> wrote:
If pip decides to go with a strategy that ignores this, I think we have a problem. The core developers put trust into pip when allowing it to (effectively) get distributed with Python and making it the default Python packaging manager. Please use that trust with the appropriate care and respect.
Just to clarify - the pip team (I hope I speak for all of us) fully understand the implications of being the de facto standard package manager. And we appreciate the trust placed in us by the fact that pip is distributed with Python. But at the same time, that trust was given on the basis that (presumably) we have a track record of doing things right, in an area that is notoriously full of heated discussions and conflicting opinions. So what we'd like to do is to continue handling things in the same way as always, working with the packaging community.
In particular, that means that we did not align ourselves to the CPython development model (as it is designed for a very different community and set of problems). But we do want to adopt their good practices where possible and appropriate. One of those is the PEP process - but it's not entirely suitable (see the trail of PEPs from the distribute/packaging/distutils2 era, for why). So we're trying to get things right, and in the process we're learning - for example, the failure of PEP 438 taught us that specifying installer behaviour too closely in a PEP means we can't fix problems that are completely messing up our users. But we still believe in the PEP process (anyone who thinks otherwise hasn't noticed the amount of effort Donald, in particular, is putting into all the PEPs in progress). It doesn't mean that it can be treated as a way of forcing us not to do what we think is right for the pip user base, though.
Thanks for your clarification, Paul.
I just want to remind everyone that PEPs can be augmented and mistakes can be fixed by superseding one PEP with another. It's a well working process, one that is accepted in Python land and in line with the core development process.
Since pip now is part of the Python stdlib (even though not bound by its release process), and the pip user base is identical with the CPython user base, the PEP process also applies to pip. That's the consequence of playing the role of an officially sanctioned part of the ecosystem and comes as part of the responsibility resulting from PEP 453.
So far this has worked out well, which is why I'm surprised by some statements in this discussion.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source
On 8 October 2014 19:09, M.-A. Lemburg <mal@egenix.com> wrote:
Thanks for your clarification, Paul.
In the interest of making sure everyone is understanding each other, I'm going to follow up on this. I think there are some perceptions that differ slightly, and some concerns that people have, that make this a sensitive subject. I hope that by being open, I don't misword something and cause offence or concern. Like you, I'm aware that the process with pip has worked out well so far, and I don't want to disrupt that.
I just want to remind everyone that PEPs can be augmented and mistakes can be fixed by superseding one PEP with another. It's a well working process, one that is accepted in Python land and in line with the core development process.
I think the intention with PEP 470 is precisely that, to supersede PEP 438. It's unfortunate that PEP 438 didn't work out so well in practice, but we've learned some lessons, and hopefully we're getting better at how we handle this. Specifically, I think that on the client side, PEP 438 defined too many implementation details and didn't work in terms of functional requirements. There's a tension here in that PEPs have to speak in terms of "installers" and not target pip specifically, as it's important to us that pip competes on an equal footing with other installers, and we don't act as if we have a privileged position. It's also important that the relevant PEPs are actually PEPs about *PyPI* changes. The installer details are simply advice to installers on how to adapt to those changes. But the pip team takes that a stage further and tries to ensure that we "eat our own dogfood" and implement the advice that is included in the PEPs. It's relevant in that context that the most popular "other installer" (easy_install) does not always follow the recommendations in the PEPs, so all of the negative user feedback to UI changes gets directed at pip rather than at "the PEP" or some other general target.
Since pip now is part of the Python stdlib (even though not bound by its release process), and the pip user base is identical with the CPython user base, the PEP process also applies to pip.
My perception is somewhat different (although the practical results are similar, so this is more for context than anything else). It's important to me that pip is *not* part of the stdlib, as the release cycles of the stdlib are too slow for pip. What is part of the stdlib is *ensurepip*, which is a mechanism to install pip into a Python installation. When the subject of pip being included with Python came up, the key concern from the pip side of things was that we are still in a process of dynamic change (wheel support is still under development and improvement, for example) and the constraints of core Python / stdlib would stifle that development. Specifically, the "Policies and Governance" section of PEP 453 explicitly notes that "the bootstrapped software" (i.e. pip) will not come under CPython policies, while still noting the need for co-operation. Once again, it's worth noting that PEPs 438 and 470 are focused on *PyPI*, rather than on pip. Being a good example of an installer by following the recommendations in the PEPs for "installation programs" is part of the pip team's responsibility to work together with the CPython team, not a requirement of PEP 453. I'm completely aware, by the way, that it's pretty naive to speak of pip as merely one of many "installation programs" when in fact there's very few real alternatives. We do so specifically to keep ourselves honest...
That's the consequence of playing the role of an officially sanctioned part of the ecosystem and comes as part of the responsibility resulting from PEP 453.
I hope the above explains why I see pip's responsibilities slightly differently. In practical terms, I think it's unlikely that our differing perspectives will result in any real differences.
So far this has worked out well, which is why I'm surprised by some statements in this discussion.
There has been a fair bit of frustration in this discussion, and that has resulted in some statements that have maybe been a little more black and white than they should have been. But frankly I think everyone has been working really hard to try to understand each other's perspectives, and I hope that will continue. That's basically why I wrote this note - I'm hoping that it's a bit clearer now why the pip team are sensitive to strong statements like "the PEP process applies to pip", which are oversimplifications of a rather complex and still evolving relationship between the core CPython and PyPA teams. Paul
On Oct 8, 2014, at 2:55 PM, Paul Moore <p.f.moore@gmail.com> wrote:
There's a tension here in that PEPs have to speak in terms of "installers" and not target pip specifically, as it's important to us that pip competes on an equal footing with other installers, and we don't act as if we have a privileged position.
This in particular is something I feel fairly strongly about. I do *not* want another situation where everyone has to use X blessed thing (e.g. distutils) or they have to go to terrible lengths (monkey patching etc) and deal with trying to work around the expectation that there would be only one tool (ez_setup.py imports in setup.py). That's why a big focus of mine has been on standardizing things via the PEP process and making them formats and APIs, not implementations. I want someone to be able to come along and say "I really hate/think I can do better than this pip thing" and have well documented standards and the ability to just ignore pip altogether. Now I also hope that pip is good enough that people don't want to do that, but I want it to be entirely possible if they do.
That's why you'll see my PEPs reference what high level features an "installer" should implement but you'll not find much detail beyond that. I do try to include examples and to call out what behaviors are currently in play, though, just so that there's some example and "prior art" sort of information in the PEP.
pip has a privileged position amongst installers (and possible future ones!) and we don't want that privilege to be exacerbated by making all of our PEPs pip specific. We have a further privilege in that I'm both a pip core developer and a PyPI admin/developer, so it would be fairly easy for me to add special APIs just for pip, but again I feel strongly about pip not being special in these situations, so we restrict ourselves.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 8 October 2014 22:17, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 13:05 +0100, Paul Moore wrote:
On 8 October 2014 12:40, holger krekel <holger@merlinux.eu> wrote:
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
Bluntly, that's irrelevant.
I disagree. The PEP uses merging of public and private links in the main rationale section which comes before discussing migration strategies. It's used as motivation aka "look how easy it is to use additional/multi indexes" and not as a particular migration strategy that shouldn't be used otherwise.
OK, I think I understand your concern now - the PEP includes an example of a practice that you don't like and would prefer to see strongly discouraged. We can just delete all references to private indexes from the PEP, as they were merely included as an illustration of one of the reasons the multi-index/alternative-index support already exists. If you find the example distracting from the actual point of the PEP, then the example isn't serving its purpose, and we're better off without it. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Oct 8, 2014, at 8:24 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 8 October 2014 22:17, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 13:05 +0100, Paul Moore wrote:
On 8 October 2014 12:40, holger krekel <holger@merlinux.eu> wrote:
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
Bluntly, that's irrelevant.
I disagree. The PEP uses merging of public and private links in the main rationale section which comes before discussing migration strategies. It's used as motivation aka "look how easy it is to use additional/multi indexes" and not as a particular migration strategy that shouldn't be used otherwise.
OK, I think I understand your concern now - the PEP includes an example of a practice that you don't like and would prefer to see strongly discouraged.
Does it? The only examples in the PEP are showing: A) How can I, as an author of a project who wishes to use this new mechanism do so for my project. B) How can I, as a user of a project who is using this new mechanism tell pip to add this additional *public* repository so that I can install it since they don’t host on PyPI.
We can just delete all references to private indexes from the PEP, as they were merely included as an illustration of one of the reasons the multi-index/alternative-index support already exists. If you find the example distracting from the actual point of the PEP, then the example isn't serving its purpose, and we're better off without it.
There is really only one mention in the entire PEP that I can remember or find in a quick re-skim. That is in:
"Additionally, the multiple repository approach is a concept that is useful outside of the narrow scope of allowing projects which wish to be included on the index portion of PyPI but do not wish to utilize the repository portion of PyPI. This includes places where a company may wish to host a repository that contains their internal packages or where a project may wish to have multiple "channels" of releases, such as alpha, beta, release candidate, and final release."
Which is just saying that “hey this concept of pip works with repositories, not PyPI, PyPI just happens to be the default repository” not only already exists, but is useful in more situations, and one of those situations is internal company repositories.
I can remove the half a dozen total words that constitute the only reference in the PEP to a private anything, but I’m still confused how this somehow correlates to the PEP advocating that everyone switch to using --extra-index-url for their private repositories, when in reality the PEP is giving an example of what someone would need to do, as pip currently stands, to utilize a project that uses this feature.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 8 October 2014 21:40, holger krekel <holger@merlinux.eu> wrote:
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index.
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
That's what a default repository *does*. It's always on, unless you explicitly turn it off. Hence the name *extra index*. The index URL option is the one to use if you want to *replace* the index. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
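The difference between adding an extra index and replacing the index can be shown with a toy model. The sketch below is only an illustration under simplifying assumptions: repositories are plain dictionaries, the project and file names are invented, and `candidate_files` is a hypothetical helper, not pip's actual link-collection code.

```python
# Hypothetical in-memory stand-ins for repositories: project name -> file names.
PYPI = {
    "foo": ["foo-1.0.tar.gz"],
    "internal-tool": ["internal-tool-9.9.tar.gz"],  # public project sharing a name
}
PRIVATE = {
    "internal-tool": ["internal-tool-1.2.tar.gz"],
}


def candidate_files(project, index, extra_indexes=()):
    """Collect candidate files from the primary index plus every extra index."""
    found = []
    for repo in [index, *extra_indexes]:
        found.extend(repo.get(project, []))
    return found


# Extra index: the default repository is still consulted, so its links are
# merged with the ones from the private repository.
merged = candidate_files("internal-tool", PYPI, extra_indexes=[PRIVATE])

# Replaced index: only the named repository is consulted.
replaced = candidate_files("internal-tool", PRIVATE)

print(merged)
print(replaced)
```

In pip's terms, the first call corresponds roughly to passing --extra-index-url and the second to passing --index-url. The merged list is the situation Holger describes: files from the public index show up as candidates alongside the private ones.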
On Wed, Oct 08, 2014 at 22:18 +1000, Nick Coghlan wrote:
On 8 October 2014 21:40, holger krekel <holger@merlinux.eu> wrote:
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index.
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
That's what a default repository *does*. It's always on, unless you explicitly turn it off. Hence the name *extra index*. The index URL option is the one to use if you want to *replace* the index.
Nick, i don't know why you are saying this. Do you think i don't know this? My point is that PyPI makes for a very different default repository than the Debian or Redhat one. Or do you disagree there? holger
On Oct 8, 2014, at 8:43 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 22:18 +1000, Nick Coghlan wrote:
On 8 October 2014 21:40, holger krekel <holger@merlinux.eu> wrote:
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index.
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
That's what a default repository *does*. It's always on, unless you explicitly turn it off. Hence the name *extra index*. The index URL option is the one to use if you want to *replace* the index.
Nick, i don't know why you are saying this. Do you think i don't know this?
My point is that PyPI makes for a very different default repository than the Debian or Redhat one. Or do you disagree there?
If you understand that, then your statements in here don’t make any sense to me. What is it you’re trying to achieve exactly? Do you think the PEP should be rejected? Do you think it needs amended? You’re saying things that I can’t reconcile how they relate to the PEP (and I’m apparently not the only one) nor can I convert them into actionable feedback. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Wed, Oct 08, 2014 at 08:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 8:43 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 22:18 +1000, Nick Coghlan wrote:
On 8 October 2014 21:40, holger krekel <holger@merlinux.eu> wrote:
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index.
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
That's what a default repository *does*. It's always on, unless you explicitly turn it off. Hence the name *extra index*. The index URL option is the one to use if you want to *replace* the index.
Nick, i don't know why you are saying this. Do you think i don't know this?
My point is that PyPI makes for a very different default repository than the Debian or Redhat one. Or do you disagree there?
If you understand that, then your statements in here don’t make any sense to me.
What is it you’re trying to achieve exactly? Do you think the PEP should be rejected? Do you think it needs amended? You’re saying things that I can’t reconcile how they relate to the PEP (and I’m apparently not the only one) nor can I convert them into actionable feedback.
Sorry that it's so unclear to you, Nick and Paul. I tried my best. And i tried to make suggestions what to change, what to avoid, what kind of options pip would need to become safer etc.. That was all meant as useful feedback to get a better PEP and end result. But if you and Nick as authors refuse my suggestions (mainly: backward compat, more careful reasoning about multi-index ops) then i am currently clearly -1 on the PEP because i think it does more harm than good. And i'll let it all rest at that for a bit because i don't want to spend more time on it right now. best, holger
On Oct 8, 2014, at 8:59 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 08:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 8:43 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 22:18 +1000, Nick Coghlan wrote:
On 8 October 2014 21:40, holger krekel <holger@merlinux.eu> wrote:
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index.
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
That's what a default repository *does*. It's always on, unless you explicitly turn it off. Hence the name *extra index*. The index URL option is the one to use if you want to *replace* the index.
Nick, i don't know why you are saying this. Do you think i don't know this?
My point is that PyPI makes for a very different default repository than the Debian or Redhat one. Or do you disagree there?
If you understand that, then your statements in here don’t make any sense to me.
What is it you’re trying to achieve exactly? Do you think the PEP should be rejected? Do you think it needs amended? You’re saying things that I can’t reconcile how they relate to the PEP (and I’m apparently not the only one) nor can I convert them into actionable feedback.
Sorry that it's so unclear to you, Nick and Paul. I tried my best. And i tried to make suggestions what to change, what to avoid, what kind of options pip would need to become safer etc.. That was all meant as useful feedback to get a better PEP and end result.
But if you and Nick as authors refuse my suggestions (mainly: backward compat, more careful reasoning about multi-index ops) then i am currently clearly -1 on the PEP because i think it does more harm than good.
And i'll let it all rest at that for a bit because i don't want to spend more time on it right now.
I think I responded why I had considered and then rejected the backwards compatibility concern. We may just disagree on that point.
I don’t understand what “more careful reasoning about multi-index ops” means. Maybe if you suggest a rewording or point to a specific part of the PEP that you think should be removed/edited/added to?
If you’d rather not do that above that’s fine! Just saying if you care to spend more time on it that maybe an explicit suggestion of what to change in the PEP would be easier to understand.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
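For readers following the exchange above, the distinction Nick draws between the extra index option and the index URL option can be sketched roughly as follows. The repository URL and project name are hypothetical, and the commands are printed rather than run because that URL does not exist.

```shell
# Hypothetical private repository URL, used only for illustration.
PRIVATE_INDEX="https://private.example.com/simple/"

# Additive: the default index (PyPI) stays enabled and the private index is
# consulted as well, so links from both are candidates for each requirement.
additive="pip install --extra-index-url $PRIVATE_INDEX myproject"

# Replacing: the private index is used instead of the default index.
replacing="pip install --index-url $PRIVATE_INDEX myproject"

# Print the two invocations instead of executing them.
printf '%s\n' "$additive" "$replacing"
```

The second form is the one that avoids the merging of public PyPI links that holger describes, at the cost of the private index having to serve every dependency itself.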
I have a suggestion. Holger obviously feels he has something very important to say, and a lot of e-mails have already been sent back and forth.
Is there some way that Donald, Nick, and Holger could perhaps have a conference call or hangout of some sort just for the purpose of understanding and/or confirming exactly what his concern is (and, if possible, coming to agreement on a resolution)? And then the result of that conversation can be summarized for the list?
I think that might be more constructive at this point and courteous to Holger. I know that for me, sometimes "a quick phone call" can do wonders.
--Chris
On Wed, Oct 8, 2014 at 6:13 AM, Donald Stufft <donald@stufft.io> wrote:
On Oct 8, 2014, at 8:59 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 08:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 8:43 AM, holger krekel <holger@merlinux.eu> wrote:
On Wed, Oct 08, 2014 at 22:18 +1000, Nick Coghlan wrote:
On 8 October 2014 21:40, holger krekel <holger@merlinux.eu> wrote:
No, i am not concerned about the extra index supplying whatever packages. After all, the users specifies the option and should trust that index.
I am concerned about the fact that public PyPI links are merged in even for my private packages residing on the extra index.
That's what a default repository *does*. It's always on, unless you explicitly turn it off. Hence the name *extra index*. The index URL option is the one to use if you want to *replace* the index.
Nick, i don't know why you are saying this. Do you think i don't know this?
My point is that PyPI makes for a very different default repository than the Debian or Redhat one. Or do you disagree there?
If you understand that, then your statements in here don’t make any sense to me.
What is it you’re trying to achieve exactly? Do you think the PEP should be rejected? Do you think it needs amended? You’re saying things that I can’t reconcile how they relate to the PEP (and I’m apparently not the only one) nor can I convert them into actionable feedback.
Sorry that it's so unclear to you, Nick and Paul. I tried my best. And i tried to make suggestions what to change, what to avoid, what kind of options pip would need to become safer etc.. That was all meant as useful feedback to get a better PEP and end result.
But if you and Nick as authors refuse my suggestions (mainly: backward compat, more careful reasoning about multi-index ops) then i am currently clearly -1 on the PEP because i think it does more harm than good.
And i'll let it all rest at that for a bit because i don't want to spend more time on it right now.
I think I responded why I had considered and then rejected the backwards compatibility concern. We may just disagree on that point.
I don’t understand what “more careful reasoning about multi-index ops” means. Maybe if you suggest a rewording or point to a specific part of the PEP that you think should be removed/edited/added to?
If you’d rather not do that above that’s fine! Just saying if you care to spend more time on it that maybe an explicit suggestion of what to change in the PEP would be easier to understand.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On 8 October 2014 13:59, holger krekel <holger@merlinux.eu> wrote:
But if you and Nick as authors refuse my suggestions (mainly: backward compat, more careful reasoning about multi-index ops) then i am currently clearly -1 on the PEP because i think it does more harm than good.
Holger, there's been a lot said in this thread, and it's entirely possible I may have missed something crucial. But it seems to me that a lot of the debate has been about wording and rationale. Can I just cross-check with you, before you leave the discussion:
1. Ignoring all of the explanations and rationale, are you -1 on the technical changes being proposed?
2. Do you have an alternative proposal, or is your -1 in effect a vote to do nothing?
Personally, I think that we have to do something about the pip user interface, as the current situation is harming our users. If PEP 470 isn't accepted, we'll need to look again at what we do in relation to PEP 438 support. Frankly, I'd rather not go there, as I think it's clear from the feedback we've received that full support is harmful to our users.
Paul
On Fri, Oct 03, 2014 at 02:05:50AM -0400, Donald Stufft wrote:
Using this additional location within pip is also simple and can be included on a per invocation, per shell, or per user basis. The pip 6.0 will also include the ability to configure this on a per virtual environment or per machine basis as well. This can be as simple as:
::
    $ # As a CLI argument
    $ pip install --extra-index-url https://index.example.com/ myproject
    $ # As an environment variable
    $ PIP_EXTRA_INDEX_URL=https://pypi.example.com/ pip install myproject
    $ # With a configuration file
    $ echo "[global]\nextra-index-url = https://pypi.example.com/" > ~/.pip/pip.conf
    $ pip install myproject
This is where I get a question: what do I do if package X wants an extra repository FOO, and package Y wants an extra repository BAR, and my project relies on both X and Y?
I assume the --extra-index-url=URL argument to pip install can be repeated multiple times. It's less clear what to do about environment variables or config file settings. Do I specify space-separated URLs? Newline separated?
An example would be good.
Marius Gedminas
--
Jim's Three Laws of Engineering:
1. F = ma
2. You can't solve a problem unless you know the answer
3. You can't push a rope
On Fri, Oct 3, 2014 at 9:42 AM, Marius Gedminas <marius@pov.lt> wrote:
On Fri, Oct 03, 2014 at 02:05:50AM -0400, Donald Stufft wrote:
Using this additional location within pip is also simple and can be included on a per invocation, per shell, or per user basis. The pip 6.0 will also include the ability to configure this on a per virtual environment or per machine basis as well. This can be as simple as:
::
    $ # As a CLI argument
    $ pip install --extra-index-url https://index.example.com/ myproject
    $ # As an environment variable
    $ PIP_EXTRA_INDEX_URL=https://pypi.example.com/ pip install myproject
    $ # With a configuration file
    $ echo "[global]\nextra-index-url = https://pypi.example.com/" > ~/.pip/pip.conf
    $ pip install myproject
This is where I get a question: what do I do if package X wants an extra repository FOO, and package Y wants an extra repository BAR, and my project relies on both X and Y?
I assume the --extra-index-url=URL argument to pip install can be repeated multiple times. It's less clear what to do about environment variables or config file settings. Do I specify space-separated URLs? Newline separated?
An example would be good.
Marius Gedminas
I would assume something like
$ PIP_EXTRA_INDEX_URL=https://pypi1.example.com/,https://pypi2.example.com/ pip install myproject
$ pip install --extra-index-url=https://pypi1.example.com/ --extra-index-url=https://pypi2.example.com/ myproject
On 3 October 2014 15:44, Ian Cordasco <graffatcolmingov@gmail.com> wrote:
I would assume something like
$ PIP_EXTRA_INDEX_URL=https://pypi1.example.com/,https://pypi2.example.com/
Separate with spaces. See https://pip.pypa.io/en/latest/user_guide.html#configuration for the details.
Paul
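To make the answer to Marius's question concrete, here is a minimal sketch of listing two extra repositories. The host names are the placeholders used earlier in the thread, and the scratch directory is only for illustration. The multi-line config file layout is an assumption based on the pip user guide rather than anything stated in this thread, so check it against the pip version in use.

```shell
# Placeholder repositories (the example hosts used earlier in the thread).
FOO_INDEX="https://pypi1.example.com/"
BAR_INDEX="https://pypi2.example.com/"

# Environment variable: a single value with the URLs separated by spaces,
# so the value has to be quoted.
export PIP_EXTRA_INDEX_URL="$FOO_INDEX $BAR_INDEX"

# Config file: assumed multi-line form, one URL per indented line.
# Written to a scratch directory here; a real setup would use ~/.pip/pip.conf.
conf_dir="$(mktemp -d)"
printf '[global]\nextra-index-url =\n    %s\n    %s\n' \
    "$FOO_INDEX" "$BAR_INDEX" > "$conf_dir/pip.conf"
cat "$conf_dir/pip.conf"

# With either setting in place the install command itself stays the same:
#   pip install myproject
```

On the command line no separator is needed, since the `--extra-index-url` option can simply be given once per repository.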
participants (9)
- Chris Jerdonek
- Donald Stufft
- holger krekel
- Ian Cordasco
- M.-A. Lemburg
- Marius Gedminas
- Nick Coghlan
- Paul Moore
- Wichert Akkerman