Re: disable building wheel for a package

On 2018-09-14 12:55, Alex Grönholm wrote:
I'm curious: what data does it attempt to install and where? Have you created a ticket for this somewhere?
The OP mentioned absolute paths. However, it really sounds like a bad idea to hard-code an absolute installation path. Let's consider it a feature that wheel doesn't support that. See https://github.com/pypa/wheel/issues/92

Yes, that feels very dangerous to me. From the GitHub thread it sounds like setuptools *does* support it. I would be inclined to deprecate support for that in setuptools if that's the case (though obviously, since setup.py is a regular Python file, you can write your own code to do whatever you want in it, so it's more in the spirit of a deterrent than a protection). On September 14, 2018 11:42:37 AM UTC, Jeroen Demeyer <J.Demeyer@UGent.be> wrote:

On Fri, 14 Sep 2018 at 12:43, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
The OP hasn't said, but I assumed that it was expecting to install something in a "standard" Unix location like /etc. As a Windows user, I think that's a bad idea (and regardless of what OS I use, it's clearly a non-portable idea) but I've no idea if wheels that install stuff to locations like /etc have a sensible meaning on Unix. (They obviously don't handle being installed in --user, or in a virtualenv, so they aren't what I'd call "proper" Python packages, but maybe there's still a reasonable use case here). Regardless, I do consider it a feature that wheel doesn't support installation to absolute paths. Paul

No one wants wheel to be able to install things outside of the virtualenv. What people have repeatedly asked for is the ability to install things somewhere besides $VIRTUAL_ENV/lib/python#.#/site-packages/, places like $VIRTUAL_ENV/etc/ for example. Should all the config files, documentation, data, man pages, licenses, images, go into $VIRTUAL_ENV/lib/python#.#/site-packages/? Or can we do a better job letting people put files in $VIRTUAL_ENV/xyz? On Fri, Sep 14, 2018 at 9:51 AM sashk <b@sashk.xyz> wrote:

It can be hard to predict where data went at runtime. Maybe we could record its path in .dist-info during the install. I think it may also not be clear how to put files into wheel's data directory from setup.py. If we added more categories it would allow the installer to override e.g. just the config directory rather than copying a Unix-like tree under data/ onto the install path. On Fri, Sep 14, 2018, 10:46 Tzu-ping Chung <uranusjr@gmail.com> wrote:
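For reference, the way files currently end up in a wheel's data directory is via distutils-style `data_files` with relative destinations. A hedged sketch of a setup.py fragment (the project name and file paths here are invented for illustration):

```python
# Hypothetical setup.py fragment (project name and file paths invented).
# Relative data_files destinations are what setuptools/wheel map into the
# wheel's .data/data/ directory; at install time they are resolved against
# the active scheme's data root (e.g. $VIRTUAL_ENV in a virtualenv).
setup_kwargs = dict(
    name="example-pkg",
    version="0.1",
    packages=["example_pkg"],
    data_files=[
        # -> <data root>/etc/example/example.conf
        ("etc/example", ["conf/example.conf"]),
        # -> <data root>/share/man/man1/example.1
        ("share/man/man1", ["docs/example.1"]),
    ],
)
# In a real setup.py: from setuptools import setup; setup(**setup_kwargs)
```

Note that absolute destinations (starting with "/") are exactly the case wheel rejects, per the discussion above.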

On Fri, 14 Sep 2018 at 16:03, Daniel Holth <dholth@gmail.com> wrote:
It can be hard to predict where data went at runtime.
I don't think it's "hard to predict". I *do* think it's badly documented/not standardised. See my previous note - pip installs into the install_data location that setuptools/distutils chooses. Ideally: a) Setuptools and/or distutils would document where that is clearly and in an easy to find location. b) There should be a standard defining "schemes" like this so that the choices aren't implementation-defined (by pip, setuptools and distutils in some weird combination). Of course "there should be..." means "someone who cares enough needs to invest time in..." :-(
Maybe we could record its path in .dist-info during the install. I think it may also not be clear how to put files into wheel's data directory from setup.py.
Better would be to have a supported library that exposes the logic pip uses (or as I said above, the standard-defined logic) to determine such paths. See https://github.com/pypa/pip/issues/5191
If we added more categories it would allow the installer to override e.g. just the config directory rather than copying a Unix-like tree under data/ onto the install path.
That's a reasonable but somewhat unrelated suggestion. I don't think it's needed for the OP's use case, but it may well be helpful for others. Paul

A corner case is where the package is importable because it is on $PYTHONPATH, so the $VIRTUAL_ENV at runtime is different from the one at install time. That is why it might be useful to store the data directory per-package. wheel.install.get_install_paths(package_name) shows where things would be installed according to wheel. It comes from a call to distutils. pip has its own implementation. On my machine, data goes into $VIRTUAL_ENV in a virtualenv and into /usr on system Python. On Fri, Sep 14, 2018 at 11:26 AM Paul Moore <p.f.moore@gmail.com> wrote:

On Fri, Sep 14, 2018, at 4:26 PM, Paul Moore wrote:
There is an official standard library API in the sysconfig module to find installation locations: https://docs.python.org/3/library/sysconfig.html#installation-paths

Unfortunately, distutils has a copy of this logic rather than using the sysconfig module, from what I remember. Some Linux distros have patched distutils to put installed files in different locations, but have not necessarily patched sysconfig, presumably because they didn't think about it. Even if sysconfig were patched, distros may have a different location for files installed by the distro package manager and files installed by other means (Debian-based distros use /usr and /usr/local for these). So there's no one data directory where you can find all files related to importable packages. (Of course, we advise against 'sudo pip install', but people still do it anyway.)

This may be somewhat outdated - it's been a while since I looked into this, but I don't think the relevant pieces are changing rapidly. My conclusion at the time was that the only reliable way to have data files findable at runtime was to put them inside the importable package.
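As a concrete illustration of the sysconfig API mentioned above (stdlib only; the exact paths printed will vary by platform, scheme, and any distro patching):

```python
# Stdlib sketch: sysconfig exposes named install "schemes" and their paths.
# These can disagree with distutils on distro-patched Pythons, which is
# exactly the problem discussed in this thread.
import sysconfig

print(sysconfig.get_scheme_names())  # available schemes, e.g. 'posix_prefix'
paths = sysconfig.get_paths()        # paths for this interpreter's default scheme
for key in ("purelib", "platlib", "scripts", "data"):
    print(f"{key:8} -> {paths[key]}")
```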

On Tue, 18 Sep 2018 at 19:54, Thomas Kluyver <thomas@kluyver.me.uk> wrote:
Yes, if it weren't for the stuff mentioned below, that would be the one obvious way of doing things. However...
Unfortunately, distutils has a copy of this logic rather than using the sysconfig module, from what I remember. Some Linux distros have patched distutils to put installed files in different locations, but have not necessarily patched sysconfig, presumably because they didn't think about it.
(Technically, I think the history is that sysconfig was created by pulling the distutils logic out into a standalone module, but no-one modified distutils to use sysconfig - probably the usual "fear of changing anything in distutils" problem :-() Even more unfortunately, pip further layers its own logic on this (https://github.com/pypa/pip/blob/master/src/pip/_internal/locations.py#L136) - basically a "sledgehammer to crack a nut" approach of initialising a distutils install command object, and then introspecting it to find out where it'll install stuff - then hacking some special cases on top of that. I assume it's doing that to patch over the incomplete Linux distro patching.
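The "introspect a distutils install command" trick Paul describes can be sketched roughly like this (this mirrors the general approach, not pip's actual special-cased code; guarded because distutils was removed from the stdlib in Python 3.12):

```python
# A sketch of the introspection trick described above -- roughly what pip's
# locations module does, minus the special cases layered on top.
def distutils_install_paths():
    try:
        from distutils.dist import Distribution
    except ImportError:  # Python 3.12+ without the setuptools shim
        return None
    dist = Distribution({"name": "example"})
    cmd = dist.get_command_obj("install")
    cmd.ensure_finalized()  # computes the install_* locations for this interpreter
    return {
        attr: getattr(cmd, attr)
        for attr in ("install_lib", "install_scripts", "install_data", "install_headers")
    }


install_paths = distutils_install_paths()
print(install_paths)
```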
Even if sysconfig were patched, distros may have a different location for files installed by the distro package manager and files installed by other means (Debian based distros use /usr and /usr/local for these). So there's no one data directory where you can find all files related to importable packages. (Of course, we advise against 'sudo pip install', but people still do it anyway).
Yeah. We don't really have any control or influence over what distros do, we just end up reacting to their changes. That's why I think we need a centralised supported library - unless the distros start working better with the Python stdlib APIs, someone needs to collect together all the workarounds and hacks needed to interact with them, and that's better done in one place than duplicating the work in multiple projects (and the fact that people have asked for pip's locations module to be exposed as an API confirms to me that there *are* multiple projects that need this).
This may be somewhat outdated - it's been a while since I looked into this, but I don't think the relevant pieces are changing rapidly. My conclusion at the time was that the only reliable way to have data files findable at runtime was to put them inside the importable package.
Agreed. There's more engagement with the distros these days (I think) but it hasn't resulted in much change in how they handle things yet. Paul
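The "put the data inside the importable package" conclusion can be sketched with importlib.resources. The package here is fabricated on the fly purely so the example is self-contained; in a real project the data file would live in the package's source tree and be declared via package_data:

```python
# Demonstration of reading a data file shipped inside an importable package.
# We fabricate a tiny package in a temp directory just for this sketch.
import sys
import tempfile
from pathlib import Path

tmp = Path(tempfile.mkdtemp())
pkg = tmp / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "defaults.cfg").write_text("color = blue\n")
sys.path.insert(0, str(tmp))

# importlib.resources (Python 3.9+ for files()) locates the data regardless
# of where the package was installed: site-packages, --user, a virtualenv...
from importlib import resources

text = resources.files("demo_pkg").joinpath("defaults.cfg").read_text()
print(text.strip())  # -> color = blue
```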

Part of the challenge here, from my perspective, is that even though I can envision what the end solution looks like, it's not clear how to actually get there. How do we designate someone to make decisions about what to include? How do we ensure that special-case logic isn't kept in individual project repos as it is now, but instead pushed back up to the main repo? From our perspective (pipenv and the various other projects I maintain, at least), the implementation in pip is the minimum viable standard to which all other tooling _must_ conform, at least from a UX standpoint. Since package installation is step 0 of any developer's work, even minor workflow changes can be jarring and seriously disruptive, which is why we rely on the pip implementations a lot.

It reminds me of the old adage about starting off with one problem: you don't want a new developer to have a clever idea (problem one), go to set up their new environment, and have the first thing that happens be some initial error before they've even managed to install a package (now they have two problems, and one of them has nothing to do with their actual task).

As a consequence, even though there are other libraries that may provide some of this functionality, pip has the reference implementation, and that contains some significant additional logic. I don't imagine that pip is going to simply adopt some new library without significant review. The substantial effort required to actually get people to review the code involved in standardizing the functionality people are 'borrowing' from pip is probably going to be a challenge, and that's before we even consider that it will be difficult getting people to agree on what should be standardized and extracted.
People are basically taking a path-of-least-resistance approach right now, which means everyone is building their own tooling as needed or simply using the tooling in pip, because they know that will always provide the same UX the user is accustomed to (see my first point). I don't see anyone sinking significant effort into a shared library until we agree on what needs to be shared and can be confident that it will land in the various tools at some point. That said, there aren't too many of us who can put all the pieces together, so it shouldn't be that difficult to outline what we need -- anyone who wants to have a look at what most of our projects are using from pip can shoot me a message off-list and I can send you a link to the repository. Dan Ryan gh: @techalchemy // e: dan@danryan.co

Speaking for myself, generally if someone spins functionality out of pip into a dedicated library, and that library is well tested, and has done the diligence to ensure that answers aren't changing if pip switches to that library (or if they do change, they changed purposefully and we can document it and deal with deprecation), then I don't think there's a whole lot of blocker there. Obviously the level of review and testing should be commensurate with the importance of the part that library controls. For instance, when we switched to using packaging's implementation of version handling, I had spent hours compiling what the differences were going to be between PEP 440 versions and the new version handling across all of PyPI. However, for platform detection for the User-Agent we switched to a third-party library with just a cursory glance.

Mainly I think the important thing, as far as pip is concerned, is for someone to identify beforehand what piece they want to carve out of pip into a library and make sure that none of the pip developers have a problem with that, then do the diligence in making sure that the new library matches the old behavior etc., and then submit a PR to pip that swaps us to using it. For a lot of things, moving the reference implementation into pypa/packaging (which is already bundled in pip anyway) is going to be the right answer.

Agreed. Furthermore, if people are of the opinion that pip's implementation is suitable, copying it out into packaging is likely not going to be at all controversial. Of course, it's not going to be any direct advantage to pip if that's done (we get the same functionality, just in a different place), so the people benefiting are those who want a supported API for that functionality, and it seems only reasonable to expect them to do the job of moving the code, rather than expecting the pip developers to do so. In the case of pip's location code, more time has likely been spent discussing the problem than it would actually take to make the change. (Of course, no one person has spent that much time in discussions, but it adds up - coding doesn't work that way sadly). Paul On Tue, 18 Sep 2018 at 22:03, Donald Stufft <donald@stufft.io> wrote:

This is where I think we disagree, and I feel the rhetoric is a bit harmful -- personally I don't benefit much at all; I actually don't think any individual maintainer inside the PyPA benefits much beyond having a new project to maintain, so the 'helps me vs helps you' framing isn't really the point. If it strictly helped me to add a project to my list of things to maintain, I would have done that already. The real issue here is that we all have different implementations and they create non-uniform / disjointed user experiences. Converging on a set of common elements to extract seems like step 1. I am fairly new to the PyPA, and I don't know how any of these processes actually work. But I do know that painting this as "us vs you" when my interest is actually in helping the user of packaging tools is causing a disconnect for me any time we engage on this -- and I'm not asking you to tackle any of this yourself, except possibly to review someone's PR down the road to swap out some internals. Dan Ryan gh: @techalchemy // e: dan@danryan.co

On Wed, 19 Sep 2018 at 00:52, Dan Ryan <dan@danryan.co> wrote:
Apologies. I misread your email, and so I was mostly addressing the issues we've seen posted to pip asking for us to simply expose the internal functions, not your comment about multiple projects implementing the logic. Sorry for that. Agreed if we already have multiple implementations, merging them is a useful thing, but the benefits are diffuse and long term, so it's the sort of thing that tends to remain on the back burner indefinitely. (One of the problems with open source is that unless something is *already* available as a library, we tend to reimplement rather than refactoring existing code out of a different project, because the cost of that interaction is high - which unfortunately I demonstrated above by my comment "people needing an API should do the work" :-(). Paul

Risking thread hijacking, I want to take this chance and ask about one particular multiple-implementation problem I found recently. What is the current situation regarding distlib vs packaging and various pieces in pip? Many parts of distlib seem to have duplicates in either packaging or pip/setuptools internals. I understand this is a historical artifact, but what is the plan going forward, and what strategy, if any, should a person take if they are to make the attempt of merging, or collecting pieces from existing code bases into a workable library? From what I can tell (very limited), distlib seems to contain a good baseline design of a library fulfilling the intended purpose, but is currently missing parts to be fully usable on its own. Would it be a good idea to extend it with picked parts from pip? Should I contribute directly to it, or make a (higher level) wrapper around it with those parts? Should I actually use parts from it, or from other projects (e.g. distlib.version vs packaging.version, distlib.locator or pip’s PackageFinder)? It would be extremely helpful if there is a somewhat general, high-level view of the whole situation. TP

On Wed, 19 Sep 2018 at 09:39, Tzu-ping Chung <uranusjr@gmail.com> wrote:
Risking thread hijacking, I want to take this chance and ask about one particular multiple implementation problem I found recently.
I changed the subject to keep things easier to follow. Hope that's OK.
What is the current situation regarding distlib vs packaging and various pieces in pip? Many parts of distlib seem to have duplicates in either packaging or pip/setuptools internals. I understand this is a historical artifact, but what is the plan going forward, and what strategy, if any, should a person take if they are to make the attempt of merging, or collecting pieces from existing code bases into a workable library?
Note: this is my personal view of the history only; Vinay and Donald would be better able to give definitive answers.
From what I can tell (very limited), distlib seems to contain a good baseline design of a library fulfilling the intended purpose, but is currently missing parts to be fully usable on its own. Would it be a good idea to extend it with picked parts from pip? Should I contribute directly to it, or make a (higher level) wrapper around it with those parts? Should I actually use parts from it, or from other projects (e.g. distlib.version vs packaging.version, distlib.locator or pip’s PackageFinder)? It would be extremely helpful if there is a somewhat general, high-level view to the whole situation.
Distlib was created as a place to experiment with making a library-style interface to various pieces of packaging functionality. At the time it was created, there were not many standardised parts of the packaging ecosystem, so while it followed the standards where they existed, it also implemented a number of pieces of functionality that *weren't* backed by standards (obvious examples being the script creation stuff and the package finder). Packaging, on the other hand, was designed to focus strictly on implementations of agreed standards, providing reference APIs for projects to use.

Pip uses both libraries, but as far as I'm aware, we'd use an API from packaging in preference to distlib. The only distlib API we use is the script maker API. Pretty much everything else in distlib, we already had an internal implementation for by the time distlib was written, so there was no benefit in changing (in contrast, the benefit in switching to packaging is "by design conformance to the relevant standards").

My recommendations would be:

1. Use packaging APIs always where they exist, even if a distlib equivalent exists.
2. Never use pip APIs, they are internal use only (Paul bangs on that old drum again :-))
3. Consider using distlib APIs for things like the locator API, because it's better than writing your own code, but be aware of the risks.

When I say risks here, the things I'd consider are:

* Distlib's APIs aren't used in many projects, so they are likely less well tested (that's a chicken-and-egg issue, people need to use them to make them better).
* Be aware that they may not behave identically to pip outside of the standardised areas.
* In particular, review the details of which locators you use - the default is very different from pip's process (although the results should be the same).

I don't know what the longer term goals are for distlib. It's not yet really become the "toolkit for building packaging tools" that it could be.
Whether there's an intention for it to become that, I don't know. In some ways I'd like to turn the question round - why didn't tools like pipenv and pip-tools use distlib for their core functionality, rather than patching into pip's internals? The answers to that question might clarify better what needs to happen if distlib is to become the obvious place to find packaging functionality. Paul

On Wed, 19 Sep 2018 at 19:19, Paul Moore <p.f.moore@gmail.com> wrote:
When it comes to things like pip-tools and pipenv, my experience is that users are really expecting to get the same results as they get from pip, and get upset when they differ (even if what pip is doing is arbitrary, and what the wrapper tool does is similarly arbitrary, but also different). So, using pip's internals makes more sense than attempting to explain the behavioural differences between pip and distlib. However, pipenv at least is finding that pip's behaviour doesn't necessarily match what pipenv needs (in particular, it needs much better support for working with Python installations other than the one hosting pipenv itself). Given that, and assuming Vinay is amenable to the idea, it would be nice to revisit the concept of the two layer architecture, with packaging as the lower level minimalist strictly standards compliant layer, and distlib as the higher level general purpose toolkit that brings together various other libraries (including packaging itself) under a more comprehensive API. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, 19 Sep 2018 at 10:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd certainly be OK with delegating more of the "common activities" to distlib. But it's a *long* way from being simple to do so, and pip would need to take a lot of care to ensure that doing so didn't result in behavioural differences. Also, we'd need to be careful of dumping too much on distlib without making the sustainability problem even worse - at the moment, as far as I know, Vinay is the sole distlib developer. Paul

My experience with Pipenv matches Nick's. I would also guess another reason is the lack of knowledge—this was certainly true of me before I got involved in Pipenv. There is barely any guide on how I should implement such a thing, and my developer's instinct tells me to look at a known implementation, i.e. pip. This also ties back to the problem that pip barely uses distlib internally—had it used more of it, people might be pointed in the right direction.

Migrating Pipenv's internals from pip to distlib is actually the exact thing I was thinking about when I raised the question. There are, as mentioned, a lot of pieces missing in distlib. For example, distlib knows how to find a distribution, and how to install wheels, but not how a non-wheel distribution can be turned into a wheel. [1] It also has no functionality for uninstallation. If I'm to glue together a working thing, I would likely need to copy/reimplement parts of pip, but where should they live? Do I add yet another layer above distlib to include them, or do I try to include them in distlib? Although distlib provides a nice basis, I feel it is still one layer below what most people want to do, e.g. install a thing by name (or URL). But would a three-layer design be too much, or should distlib have a high-level API as well?

[1]: Also, while I'm having your attention—I'm trying to use the pep517 library as part of the solution to build an sdist into a wheel, but I'm hitting a bug. Could you help review my PR? :p https://github.com/pypa/pep517/pull/15

TP

On Wed, 19 Sep 2018 at 11:34, Tzu-ping Chung <uranusjr@gmail.com> wrote:
That's probably Vinay's call. IMO, whatever layer there is above packaging, it should do stuff like this.
I don't think a 3-layer approach is sensible; two layers is more than enough. Maybe not go all the way to "install something by name". Maybe APIs to:

* find what's available in a set of indexes/locations
* check if a compatible wheel is in that list
* make a wheel from a sdist
* install a wheel

It's hard to make something like this sound like anything other than "put pip into a library and make pip itself a thin shell round that library", though. At which point we come full circle round to the fact that the *reason* we don't support a pip API is that the cost of designing one, restructuring pip to expose (and use!) it, and creating documentation and tests to make sure that API is stable and properly supported, is huge. And we don't have the resources. It's not so much a technical/design or layering issue, it's really a resourcing and sustainability issue.
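To make the shape of that API surface concrete, here is a purely hypothetical sketch. Every name in it is invented for illustration; no such supported library exists, which is precisely the thread's point:

```python
# Hypothetical sketch of the minimal API surface listed above.
from dataclasses import dataclass
from typing import List, Optional, Protocol, Sequence


@dataclass
class Candidate:
    """One installable artifact found in an index or directory."""
    name: str
    version: str
    url: str
    is_wheel: bool


class PackagingToolkit(Protocol):
    def find(self, requirement: str, indexes: Sequence[str]) -> List[Candidate]:
        """Find what's available in a set of indexes/locations."""

    def compatible_wheel(self, candidates: Sequence[Candidate]) -> Optional[Candidate]:
        """Return a wheel from the list compatible with this environment, if any."""

    def build_wheel(self, sdist_path: str) -> str:
        """Build a wheel from an sdist, returning the wheel's path."""

    def install_wheel(self, wheel_path: str, target_prefix: str) -> None:
        """Install a built wheel under the given prefix."""


c = Candidate("example", "1.0", "https://example.invalid/example-1.0-py3-none-any.whl", True)
print(c.name, c.is_wheel)
```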
It's not really me that can comment, you probably need Thomas. To clarify, even though I have commit rights on the pep517 project, I'm not really a project maintainer there. My interest in it is as a consumer from pip, which only uses the hook wrapper code. So this PR is in something I've not used or looked at closely. Pip does its own build isolation, ironically enough :-( Paul

I’ve personally always planned on pulling out the last bits of what we do use distlib for in pip, and not relying on it any longer. My general plan for extracting stuff from pip and/or setuptools has always been to first standardize in a PEP (if a sufficient one doesn’t already exist) anything that makes sense as a standard, and then start either reimplementing or pulling code out of pip (or setuptools if pip is using setuptools). When doing that I had always planned on spending a lot of effort ensuring that the behavior matches what pip is already doing (or has known, specific divergences).

Now some of this already exists in distlib, but I don’t plan on using it. Part of that is because I find it easier to identify things that should be standardized but aren’t if I’m not using a big bundle of already-implemented stuff (for instance, script support needs to be standardized, but it didn’t occur to us at the time because we just used what distlib had). It also has a bunch of functionality that exists only in distlib, like attempting to use JSON to find packages (at one point there was even a locator that implicitly used a non-PyPI server; no idea if there still is), which I felt made it harder to use the library in a way that didn’t basically create a new set of implementation-defined semantics. It also copied APIs and semantics that I think *shouldn’t* have been copied (for instance, it has an implicitly caching resource API like setuptools does… a generally bad idea IMO, whereas the new importlib.resources is a much saner API).

So in general, there are things that currently only exist in distlib or setuptools, but my personal long-term plan for pip is that we should get solid implementations of those things out of those libraries; generally my mind puts distlib and setuptools in largely the same boat.

I feel the plan is quite solid. This, however, leaves us (who want a Python implementation and interface to do what pip does) in an interesting place. As far as I can tell, there are a couple of principles:

1. Do not use pip internals.
2. pip won’t be using either distlib or setuptools, so they might not match what pip does, in the long run.

Does this leave us only one option: to implement a library that matches what pip does (follows the standards), but is not pip? That feels quite counter-productive to me, but if that’s how things are, I’d accept it. The next step (for me) in that case would then be to start working on that library. Since existing behaviours in setuptools and pip (including the part it uses distlib for) are likely to be standardised, I can rely on distlib for script creation, setuptools for some miscellaneous things (editable installs?), and pull (or reimplement) parts out of pip for others. Are there caveats I should look out for? TP -- Tzu-ping Chung (@uranusjr) uranusjr@gmail.com Sent from my iPhone

My general recommendation, if you want a Python implementation/interface for something pip does, is:

- Open an issue on the pip repository to document your intent and to make sure that there is nobody there who is against having that functionality split out. This might also give a chance for people with familiarity in that API to mention pain points that you can solve in a new API. We can also probably give you a good sense of whether the thing you want in a library has multiple dependencies that need to be split out first (for instance, if you said you wanted a library for installing wheels, we’d probably tell you that there is a dependency on PEP 425 tags, pip locations, and maybe others that need to be resolved first), and also whether this is something that should have a PEP first or not. Getting some rough agreement on the plan to split X thing out before you start is overall a good thing.
- Create or update a PEP if required, and get it into the provisional state.
- Make the library, either as a PR to packaging or as its own independent library. If there are questions that come up while creating that library/PR that have to do with specific pip behaviors, go back to that original issue and ask for clarification etc.

Ideally at some point you’ll open a PR on pip that uses the new library (my suggestion is to not bundle the library in the initial PR, and just import it normally so that the PR diff doesn’t include the full bundled library until there’s agreement on it). If there’s another tool (pipenv, whatever) that is looking to use that same functionality, open a WIP PR there too that switches it to using that. Use feedback and what you learn from trying to integrate in those libraries to influence back the design of the API itself. Creating a PEP and creating the library and the PRs can happen in parallel, but at least for pip, if something deserves a PEP we’re not going to merge a PR until that PEP is generally agreed on.
However it can be supremely useful to have them all going at the same time, because you run into things that you didn’t really notice until you went to actually implement it. My other big suggestion would be to be careful about how much you bite off at one time. Pip’s internal code base is not the greatest, so pulling out smaller chunks at a time, rather than starting right off by pulling out a big topic, is more likely to meet with success. Be cognizant of what the dependencies are for the feature you want to implement, because if it has dependencies, you’ll need to pull them out first OR you’ll need to design the API to invert those dependencies so they get passed in instead. I personally would be happy to, at a minimum, participate on any issue where someone is trying to split out some functionality from pip into a re-usable library, if not follow the development of that library directly to help guide it more closely. My hope for pip is that it ends up being the glue around a bunch of these libraries, and that it doesn’t implement most of the stuff itself anymore.

On Wed, 19 Sep 2018 at 18:52, Donald Stufft <donald@stufft.io> wrote:
I basically agree with everything Donald said, and I'd also be happy to support any work along these lines. If you're looking for a place to start, I'd strongly recommend some of the foundational areas - something like pip.locations (I know there are others who have expressed an interest in this API being exposed), or pep425tags, which has the advantage of already having a standard, or something at that level. Starting with something at the level of the finder or the installer is likely to be way too much to start with, even if it feels like it would be more directly useful to you. Paul

On Wed, 19 Sep 2018 at 11:41 Paul Moore <p.f.moore@gmail.com> wrote:
And if you start with pep425tags I have a bunch of notes for you. ;) (CoC stuff has sucked almost all of my volunteer time for the past two weeks so I have not had a chance to try to write up a proposed library for PEP 425 like I had planned to.) -Brett
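For orientation, two of the three PEP 425 tag components can be derived from the stdlib alone. This is only a rough sketch: a real implementation (pip's pep425tags, or a dedicated library like Brett describes) also computes the ABI tag and the full ordered list of *compatible* tags, which is where most of the complexity lives:

```python
# Rough, stdlib-only sketch of the interpreter and platform tag components
# from PEP 425. The ABI tag and compatible-tag expansion are elided.
import sys
import sysconfig

# Interpreter tag, e.g. 'cp37' for CPython 3.7.
impl = {"cpython": "cp", "pypy": "pp", "ironpython": "ip", "jython": "jy"}.get(
    sys.implementation.name, sys.implementation.name
)
python_tag = "%s%d%d" % (impl, sys.version_info[0], sys.version_info[1])

# Platform tag, e.g. 'linux_x86_64' or 'win_amd64'.
platform_tag = sysconfig.get_platform().replace(".", "_").replace("-", "_")

print(python_tag, platform_tag)
```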

Thanks for the advice, it's really helpful. Incidentally (or maybe not? I wonder if there is an underlying pattern here), the two areas I do want to work on first are a) how to find a package, and b) how to choose an artifact for a given package. I think I’m starting with the package discovery part first and working my way from there. I’ll create an issue in pypa/pip and try to outline the intention (and summarise this thread), but there’s a couple of things I wish to clarify first:

1. Should dependency link processing be included? Since it is un-deprecated now, I guess the answer is yes?
2. (Maybe not an immediate issue) What formats should I include? Wheels and .tar.gz sdists, of course; what others? Eggs? .zip sdists? Are there other formats?

TP

Make the library, either as a PR to packaging or as its own independent library.
I should clarify that we have already implemented a number of these as libraries over the last several months (and I am super familiar with pip's internals by now, and I'm sure TP is getting there as well). More on this below.

probably has multiple things that are dependent on getting split out first

This would be super helpful, although there is a decent chance we can make some initial headway on this aspect of it just with the first-pass agreement. Basically the entire InstallRequirement model is the single most imported item (in my experience) from pip's internals, and it is (almost) never used for installing, but often just for metadata access/normalization/parsing. I reimplemented the parsing logic for pipenv in 'requirementslib' (link below).

pulling out smaller chunks at a time rather than trying to start right off pulling out a big topic is more likely to meet with success. Be cognizant of what the dependencies are for the feature you want to implement, because if it has dependencies, you'll need to pull them out first before you can pull it out OR you'll need to design the API to invert those dependencies so they get passed in instead

We are super cognizant of that aspect, as I am pretty sure we are hitting this wall in a full (nearly) pip-free reimplementation of all of the pipenv internals from the ground up, including wheel building/installation; we basically had to start by calling pip directly, then slowly reimplement each aspect of the underlying logic using various elements of distlib/setuptools, or rebuilding those.
Since you mentioned following along, here's what we're working on right now: https://github.com/sarugaku/requirementslib -- abstraction layer for parsing and converting various requirements formats (pipfile/requirements.txt/command line/InstallRequirement) and moving between all of them https://github.com/sarugaku/resolvelib -- directed acyclic graph library for handling dependency resolution (not yet being used in pipenv) https://github.com/sarugaku/passa -- dependency resolver/installer/pipfile manager (bulk of the logic we have been talking about is in here right now) -- I think we will probably split this back out into multiple other smaller libraries or something based on the discussion https://github.com/sarugaku/plette -- this is a rewrite of pipfile with some additional logic / validation https://github.com/sarugaku/shellingham -- this is a shell detection library made up of some tooling we built in pipenv for environment detection https://github.com/sarugaku/pythonfinder -- this is a library for finding python (pep 514 compliant) by version and for finding any other executables (cross platform) https://github.com/sarugaku/virtenv -- python api for virtualenv creation Happy to provide access or take advice as needed on any of those. Thanks all for the receptiveness and collaboration Dan Ryan gh: @techalchemy // e: dan@danryan.co From: Donald Stufft [mailto:donald@stufft.io] Sent: Wednesday, September 19, 2018 1:52 PM To: Tzu-ping Chung Cc: Distutils Subject: [Distutils] Re: Distlib vs Packaging (Was: disable building wheel for a package) My general recommendation if you want a Python implementation/interface for something pip does, is: - Open an issue on the pip repository to document your intent and to make sure that there is nobody there who is against having that functionality split out. This might also give a chance for people with familiarity in that API to mention pain points that you can solve in a new API. 
We can also probably give you a good sense if the thing you want in a library is something that probably has multiple things that are dependent on getting split out first (for instance, if you said you wanted a library for installing wheels, we'd probably tell you that there is a dependency on PEP 425 tags, pip locations, maybe others that need to be resolved first) and also whether this is something that should have a PEP first or not. Getting some rough agreement on the plan to split X thing out before you start is overall a good thing. - Create or update a PEP if required, and get it into the provisional state. - Make the library, either as a PR to packaging or as its own independent library. If there are questions that come up while creating that library/PR that have to do with specific pip behaviors, go back to that original issue and ask for clarification etc. Ideally at some point you'll open a PR on pip that uses the new library (my suggestion is to not bundle the library in the initial PR, and just import it normally so that the PR diff doesn't include the full bundled library until there's agreement on it). If there's another tool (pipenv, whatever) that is looking to use that same functionality, open a WIP PR there too that switches it to using that. Use feedback and what you learn from trying to integrate in those libraries to influence back the design of the API itself. Creating a PEP and creating the library and the PRs can happen in parallel, but at least for pip if something deserves a PEP, we're not going to merge a PR until that PEP is generally agreed on. However it can be supremely useful to have them all going at the same time, because you run into things that you didn't really notice until you went to actually implement it. My other big suggestion would be to be careful about how much you bite off at one time. 
Pip's internal code base is not the greatest, so pulling out smaller chunks at a time rather than trying to start right off pulling out a big topic is more likely to meet with success. Be cognizant of what the dependencies are for the feature you want to implement, because if it has dependencies, you'll need to pull them out first before you can pull it out OR you'll need to design the API to invert those dependencies so they get passed in instead. I personally would be happy to at a minimum participate on any issue where someone was trying to split out some functionality from pip into a re-usable library, if not follow the development of that library directly to help guide it more closely. My hope for pip is that it ends up being the glue around a bunch of these libraries, and that it doesn't implement most of the stuff itself anymore.

On Wed, Sep 19, 2018 at 8:54 PM, Dan Ryan <dan@danryan.co> wrote:
Is the hope or game plan then for pipenv not to have to depend on pip? This is partly what I was trying to learn in my email to this list a month ago (on Aug. 20, with subject: "pipenv and pip"): https://mail.python.org/mm3/archives/list/distutils-sig@python.org/thread/2Q... Based on the replies, I wasn't getting that impression at the time (though I don't remember getting a clear answer), but maybe things have changed since then. It should certainly be a lot easier for pipenv to move fast since there is no legacy base of users to maintain compatibility with. However, I worry about the fracturing this will cause. In creating these libraries, from the pip tracker it doesn't look like any effort is going into refactoring pip to make use of them. This relates to the point I made earlier today about how there won't be an easy way to cut pip over to using a new library unless an effort is made from the beginning. Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future? --Chris Since you mentioned following along, here's what we're working on right now:

We have been attempting to collaborate but there was not a clear path because there are many moving pieces. It is clear that the user experience is important and our primary focus is on providing a consistent UX so that things that work in pip also work in pipenv. We are directly discussing the opposite of the situation you are asking about. Things are already fragmented and we are all hoping to have a unified set of libraries to work with. This is now an attempt to find a process. I want to point out that this was essentially laid out as the only way forward, since using pip internals isn’t viable and contributing random changes doesn’t help us as a result. Creating libraries based on existing tooling has helped us determine what is possible, where the limitations are, and what a generic implementation might look like. Now that we have a sense of what is possible it is a lot easier to propose changes. So given that we are - discussing a path to refactoring functionality out of pip and into other libraries for consistent behavior across tools - looking at candidates for extraction - forming concrete actions to actually get this underway - already working on some early potential code What is it exactly that you are looking for in continuing down this road of ‘why are these two packaging tools doing different things’? The short answer is that sometimes it’s for no reason, sometimes for a good reason, and sometimes for a bad reason. But is that question meaningful when the conversation is about ‘how do we stop doing different things’ ? Dan Ryan // pipenv maintainer gh: @techalchemy

The resolution side of Pipenv really needs a Python API, and also cannot really use the CLI because it needs something slightly different than pip’s high-level logic (Nick mentioned this briefly). If we can’t use pip internals, then yes, the plan is to not depend on pip. The hope is we can share those internals with pip (either following the same standards, or using the same implementation), hence my series of questions. The installation side of Pipenv will continue to use pip directly, at least for a while more even after the resolution side breaks away, since “pip install” is adequate for our purposes. There are some possible improvements if there is a lower-layer library (e.g. to avoid pip startup overhead), but that is far less important.
It should certainly be a lot easier for pipenv to move fast since there is no legacy base of users to maintain compatibility with. However, I worry about the fracturing this will cause. In creating these libraries, from the pip tracker it doesn't look like any effort is going into refactoring pip to make use of them. This relates to the point I made earlier today about how there won't be an easy way to cut pip over to using a new library unless an effort is made from the beginning. Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future?
I’m afraid the new implementation will still need to deal with compatibility issues. Users expect Pipenv to work exactly like pip, and get very angry if it does not, especially when they see it is under the PyPA organisation on GitHub. The last time Pipenv tried to explain why it does whatever arbitrary things it does, we got labelled as “toxic” (there are other issues in play, but this is IMO the ultimate cause). Whether the image is fair or not, I would most definitely want to prevent similar incidents from happening again. I think Pipenv would be okay to maintain a different (from scratch) implementation than pip’s, but it would still need to do (almost) exactly what pip is doing, unless we can have people (pip maintainers or otherwise) backing the differences. Whether pip uses the new implementation or not wouldn’t change the requirement :( TP

On Thu, 20 Sep 2018 at 08:01, Tzu-ping Chung <uranusjr@gmail.com> wrote:
IMO, the only way to address that is by defining standards for the behaviour. Having a standard document to point to that says "this is what's been agreed in public debate" gives both pipenv and pip a solid basis to explain why we do what we do. There will likely be corner cases where the details are implementation dependent, but again, the fact that the standard doesn't mandate behaviour is the best argument you're going to get for that. There will always be people that complain if you're not 100% bug-for-bug compatible with pip, but that's life. Obviously, any standard will have to look at pip's behaviour as a starting point (simply because pip's been around as the only implementation for so long). But simplifying and cutting out some of the cruft is part of any standards process, so it's perfectly OK to mark certain parts of what pip does now as "implementation defined" or "needs to change". Also, Dan said:
Since you mentioned following along, here's what we're working on right now:
The problem here is the same - without some sort of agreement (in the form of a documented standard[1]) that what those libraries do is "the right behaviour", it's not clear how pip can switch to using them. And promises that they "do the same as pip does" are not likely to work, for precisely the reasons Tzu-ping noted above (there's always someone that will pick up on any discrepancy, no matter how small). Paul PS While I don't have much time for people standing on the sidelines and telling us what we "should" do, I do think that by putting projects under the PyPA banner, we assume a responsibility for making sure we behave consistently, whether we like it or not. Interop standards documents have been how we've discharged that responsibility so far, but pipenv has such a strong overlap with pip that it opens up a lot of areas where we haven't even thought about standards yet. Managing expectations while we get things in line is not a pleasant task, but it's one we need to do. [1] I'm at least as sick of saying "standard" as you are of hearing it. Take it to mean "everyone's agreed and anyone likely to complain afterwards has had an opportunity to speak up, and there's a record" - I'm not wedded to any particular process here.

On Sep 19, 2018, at 23:22, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future?
This'll be a sad day. pip is still used as an installer by other build systems where using pipenv is simply not a possibility.

I am not quite sure I understand why you’d think so. pip has been bearing the legacy burden for years, and if this is the future (not saying it is), it would more like just another day in the office for pip users, since nothing is changing.

pip not seeing any improvements is something I think will be sad. I don't use pipenv, but use poetry which uses pip behind the scenes to do installation. I also use flit. For either of those cases I would think it sad that pipenv splits from pip, and then developers of alternate tooling around building packages (but not installing) don't get new improvements because "pip is legacy". pipenv doesn't work in various scenarios, and trying to shoehorn it into those scenarios is just wrong especially since it wasn't designed to do those things.

I think it's far-fetched to start thinking pip is legacy. Pipfile has had a goal from day 1 to be a format that pip would support. PEP 582 is a path forward here for providing a default location for a virtualenv [2] - it's just that everything moves slower in pip because it supports more use-cases than a tool like pipenv. What started out as a reference implementation has definitely taken on a life of its own of course and it's up to PyPA to manage that relationship and offer a good story around the tooling it's building. [1] https://github.com/pypa/pipfile#pip-integration-eventual [2] https://www.python.org/dev/peps/pep-0582/ On Thu, Sep 20, 2018 at 1:38 PM Bert JW Regeer <xistence@0x58.com> wrote:

On Thu, 20 Sep 2018 at 19:52, Michael Merickel <mmericke@gmail.com> wrote:
I think it's far-fetched to start thinking pip is legacy. Pipfile has had a goal from day 1 to be a format that pip would support. PEP 582 is a path forward here for providing a default location for a virtualenv [2] - it's just that everything moves slower in pip because it supports more use-cases than a tool like pipenv.
I don't think anyone's even spoken to the pip maintainers (yet?) about supporting the pipfile format. And no-one from the pip team has ever said that we're retiring pip in favour of pipenv. At one point, I think there was a lot of rhetoric around pipenv, but IMO it was just that, rhetoric. I'm not sure where the "everything moves slower in pip" comment comes from - pip's moving at a fair pace. I've no feel for how fast pipenv is moving (although for the parts where they use pip, they are "obviously" going to move faster in some sense, because they can use all the changes in pip and add their own :-))
What started out as a reference implementation has definitely taken on a life of its own of course and it's up to PyPA to manage that relationship and offer a good story around the tooling it's building.
As far as I'm concerned, pip and pipenv are different tools, supporting different use cases. I don't know enough about pipenv to say much more than that. The "official PyPA position" (if that's a thing, and if it's what someone is after) is probably at https://packaging.python.org/ and that document describes pip in the "Installing Packages" section, and pipenv under "Managing Application Dependencies". To me, that's a pretty clear distinction. Paul

That comes from me, I initially wrote the Pipfile as a proof of concept / sketch of an API for replacing the requirements.txt format, which Kenneth took and created pipenv from. At some point I plan on trying to push support for those ideas back into pip (not the virtual environment management bits though). That’s obviously my personal goal though, and doesn’t represent an agreed upon direction for pip.

On Fri, 21 Sep 2018 at 05:47, Donald Stufft <donald@stufft.io> wrote:
And it's one where I think there are a couple of different levels of support that are worth considering: Q. Should pip support installing from Pipfile.lock files as well as requirements.txt files? A. Once the lock file format stabilises, absolutely, since this is squarely in pip's "component installer" wheelhouse. Q. Should "pip install" support saving the installed components to a Pipfile, and then regenerating Pipfile.lock? A. This is far less of a clearcut decision, as managing updates to a file that's intended to be checked in to source control is where I draw the line between "component installer" and "application/environment dependency manager". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, 21 Sep 2018 at 11:41, Nick Coghlan <ncoghlan@gmail.com> wrote:
Speaking as a pip developer: Where's there a good summary of the pipfile format, the pipfile.lock format, and their relationship and contrast with requirements.txt? I don't view https://github.com/pypa/pipfile as a "good summary", because it explicitly states that pipfile is intended to *replace* requirements.txt, and I disagree strongly with that. Also, pipfile is human-readable, but pipfile.lock isn't. As far as I know, pipfile.lock is currently generated solely by pipfile - before pip consumes pipfile.lock, I'd like to see that format discussed and agreed as a formal interop standard that any tools wanting to pass data to pip (for the use case the standard describes) can use. One obvious thing I'd like to consider is changing the name to something less tool-specific - requirements.lock maybe? As far as the pipfile format is concerned, I see that more as pipenv's human readable input file that is used to *generate* the lock file, and I don't see it as something pip should consume directly, as that would mean pip overlapping in functionality with pipenv. If I'm misunderstanding the relationship between pip and pipenv, or between pipenv and pipfile, I'm happy to be corrected. But can I suggest that the best way to do so would be to amend the project pages that are giving me the impressions I have above, and pointing me at the corrected versions? That way, we can make sure that any misinformation is corrected at source... Paul PS Full disclosure - I've tried to use pipenv in a couple of local projects, because of the hype about it being the "great new thing" and found it basically of no use for my requirements/workflow. So I may have a biased view of either pipenv, or how it's being presented. I'm trying to be objective in the above, but my bias may have slipped through.

I agree with you about Pipfile. It is likely not something pip would directly install packages based on. pip could potentially add a “lock” command that is able to generate a Pipfile.lock from Pipfile, or even start to work in a fashion like npm etc., but conceptually, pip would only install things based on Pipfile.lock, and if it takes a Pipfile, it’s used to generate a Pipfile.lock (and maybe install from that). Regarding the format of Pipfile.lock, the proposal of it being less tool-specific is interesting to me. Brett also touched on a similar proposition a while ago that maybe we could standardise a common lock file format (shared by Pipenv and Poetry in the context of the time), and I think it is a nice idea, too. On the other hand, there are many other application dependency management tools out there, and as far as I know none of them actually have a lock file format with interoperability. JavaScript, for example, has maybe the most bipartisan state in that area (in npm and Yarn), and I don’t recall reading anything of this nature at all. I’m not saying this is wrong, but it’s interesting that Python, being relatively behind in this particular area, has this somewhat unique proposal here. (Again, this does not imply it’s either good or bad, just unique.) An extremely generic name like requirements.lock is probably not a good idea, since it is not uncommon for a project to require multiple package managers (e.g. for multiple languages), and it would be a disaster if everyone uses generic names. If not tool-specific (e.g. yarn.lock), the name should at least be context-specific, like… I don’t know, pyproject? But that is taken :p (This is intentionally rhetorical, to touch on the we-should-use-pyproject-for-this camp. To be clear: I am not in that camp, that’s likely a bad idea unless we rethink the whole application-library distinction Python packaging makes.) 
TP
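(For readers following along, the Pipfile.lock under discussion is machine-generated JSON, roughly shaped like the sketch below. The field names follow the pipfile spec, but every concrete value here is invented for illustration; real lock files are tool-generated, never hand-written.)

```python
import json

# Invented example of a Pipfile.lock-shaped document.
lock_text = """
{
  "_meta": {
    "hash": {"sha256": "deadbeef"},
    "pipfile-spec": 6,
    "sources": [
      {"name": "pypi", "url": "https://pypi.org/simple", "verify_ssl": true}
    ]
  },
  "default": {
    "requests": {"version": "==2.19.1", "hashes": ["sha256:deadbeef"]}
  },
  "develop": {}
}
"""

lock = json.loads(lock_text)

# Pinned runtime dependencies live under "default", dev-only ones under
# "develop"; "_meta" records the package sources and a hash of the
# Pipfile the lock was generated from.
for name, entry in sorted(lock["default"].items()):
    print(name, entry["version"], "(%d hash(es))" % len(entry["hashes"]))
```

The exact-version pins plus per-artifact hashes are what make the format attractive as a tool-neutral installation input, which is the interop question being debated in this thread.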

On Fri, 21 Sep 2018 at 14:09, Tzu-ping Chung <uranusjr@gmail.com> wrote:
On the other hand, there are many other application dependency management tools out there, and as far as I know none of them actually have a lock file format with interoperability. JavaScript, for example, has maybe the most bipartisan state in that area (in npm and Yarn), and I don’t recall reading anything of this nature at all. I’m not saying this is wrong, but it’s interesting that Python, being relatively behind in this particular area, has this somewhat unique proposal here. (Again, this does not imply it’s either good or bad, just unique.)
If it's intended as being specifically managed by pipenv, then a generic name and a standard aren't appropriate. The fact that "pip" is part of the name doesn't indicate a (current) relationship to pip. Given Donald's clarification, I'd say this isn't something we need to discuss at the moment. pipenv/pipfile/pipfile.lock are their own thing, and independent of pip. There's no support in pip for them, and there won't be unless/until there's a concrete proposal on the table. In the meantime, pip's alive and kicking, and no-one is making it legacy or proposing that anyone migrate away from it. Maybe there's some confusion or overlap over functionality, but that's a documentation/PR issue, and not one I'm going to get up tight about (other than to say let the people who made the statements do the job of clarifying any misunderstandings ;-)) Paul

AFAIK none of the pipenv maintainers have access to the Pipfile repo anyway besides Kenneth — We are maintaining a fork? Reimplementation? with additional validation Dan Ryan // pipenv maintainer gh: @techalchemy

On Fri., Sep. 21, 2018, 06:12 Tzu-ping Chung, <uranusjr@gmail.com> wrote:
So my motivation behind suggesting a standardized lockfile is my experience of having to deal with Node's two options. Either you end up alienating the subset of Node users who chose the other tool with the different lockfile or you have to put in extra effort to make sure to support both. I can give 2 concrete examples. One is I wrote a tool to generate the 3rd-party notices file for the Python extension for VS Code. Originally we were using Yarn, but then it turned out an internal tool at work only supported npm, so I had to choose to either not use the internal tool or switch us to npm (we did the latter because it turned out that package-lock.json is easier to work with for our use-case). Having a single format means tooling can be reused and this forcing function doesn't exist. The other example is serverless/PaaS cloud deployments. To support Python in that scenario you typically install dependencies upon deployment. So that means detecting and then using requirements.txt files on the cloud side. But then users ask for Pipfile.lock support. And the next request will be for pyproject.lock support for Poetry (and this exact scenario has happened to me at work when teams have asked for my opinion of what requirements file formats to support). Basically you are constantly trying to keep up with what the community is using which is tough to gauge. Having a single lock format means providers only need to support the one format and tools can come and go, innovating however they want as long as they produce the same artifact in the end (just like we have standardized on wheels while letting libraries have build tool flexibility through pyproject.toml). IOW I view libraries as having a build artifact of wheels while applications (should) have a build artifact of a lockfile. -Brett An extremely generic name like requirements.lock is probably not a good

On that note, specifically because pyproject.lock is not standardized, Poetry is changing its lock file name to poetry.lock so that it is clear where the lock file is coming from and what it is meant to be consumed by. Having a standardized lock file would be nice if only so that I can lock with poetry and install with pip for example. Or create a lock file with pipenv and install with pip. This'll make it easier to deploy applications that don't need the full source code available and don't require tools like pipenv or poetry since we have wheels available. The way we've been doing this at work is to lock our packages on our own pypi mirror (devpi), and to make sure that a poetry/pipenv/pip freeze (requirements.txt) lock file is used to populate said pypi. It's not the greatest, but it works okay-ish; having a single lock file would be great. We have both libraries and applications create wheels though, especially since our applications may also contain Cython and we don't want to compile on production.
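(The "lock with one tool, install with pip" workflow described above works conceptually because pinned lock entries translate mechanically into hash-checked requirements lines. A hedged sketch; `lock_to_requirements` is an invented helper and the sample entry is made up, but the output format is what pip's hash-checking mode accepts.)

```python
def lock_to_requirements(default_section):
    """Render pinned lock entries as requirements.txt lines with hashes,
    installable via `pip install --require-hashes -r <file>`.
    (Invented helper; roughly what a lock-to-requirements export does.)"""
    lines = []
    for name in sorted(default_section):
        entry = default_section[name]
        hashes = " ".join("--hash=" + h for h in entry.get("hashes", []))
        lines.append((name + entry["version"] + " " + hashes).strip())
    return "\n".join(lines)

# Invented sample entry, shaped like a lock file's "default" section:
section = {"requests": {"version": "==2.19.1", "hashes": ["sha256:deadbeef"]}}
print(lock_to_requirements(section))
```

Because the hashes ride along, the installing side can verify every downloaded artifact against the lock, which is the main operational win of this deployment pattern.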

So I should probably be explicit that this was basically an idea I had, that Kenneth rolled with and originally tried to implement as a PR to pip, but when pip’s code base was too gnarly to get it in directly, he made pipenv to wrap pip to implement it, and since then pipenv has grown its own features and such and is now an independent tool. Maybe it no longer makes sense for Pipfile to be part of pip, maybe the idea is still good, but we need to do it in a different way, I dunno. My thoughts aren’t really fully formulated on it beyond “this is a thing I’d like to do, or at least explore doing” because I’ve been focused on other things. The original intent *was* to replace requirements.txt, because the requirements.txt format is not super great and is kind of unwieldy to work with. It conflates the idea of a lock file with the idea of the things I actually want to install, which makes it harder to work with than it needs to be. It’s fine-ish for simple stuff, since it ultimately ends up being a simple text file, but the more complex you need to go, the worse it gets since everything is implemented as a —flag=whatever option on the line, so lines end up becoming huge and unwieldy. I don’t know that there’s a good summary available and honestly, I haven’t tracked what has happened to the format since Kenneth started working on it, so it’s possible that my opinion is that the current state isn’t acceptable for inclusion directly in pip! I wouldn’t know currently, the idea is very much just a vague “I wanted this for pip years ago, pipenv actually implemented it, it would be great to fold that back into pip”. Digging back through my gists, I’m not sure I can find the original Pipfile idea, but I did find one of the early incarnations of me sketching ideas out — https://gist.github.com/dstufft/e61c97ee30192e575140 — I know I have others in my gists somewhere but I got tired of clicking through them.
So the name came from me, because at the time it was intended to be a pip specific format (though one of my TODOs was asking if the name made sense).
It’s probably not super productive to try to hash this stuff out now TBH, unless people are interested in it. It’s something I plan to explore, at some point, but I’m not doing so yet because I have higher priority things to work on, and AFAIK nobody else is planning on doing it currently (although maybe someone is?). There are some folks who know that I want to do it, who I maybe gave the wrong impression to that it was a done deal that it was going to happen, vs being something I wanted to do, but that may ultimately end up not happening.

On Fri, 21 Sep 2018 at 23:10, Donald Stufft <donald@stufft.io> wrote:
https://github.com/pypa/pipfile/issues/108 is an issue I put together after starting to use pipenv and work with Kenneth, Dan, et al on pipenv UX design questions. TL;DR is that I think native Pipfile/Pipfile.lock support at the pip layer does still make sense, but the UX of it wouldn't look the same as it does in pipenv (since it would be an opt-in thing to request to track your changes to the current environment, rather than the default behaviour, and because pip *wouldn't* gain the ability to generate `Pipfile.lock` directly, only the ability to derive one from the current environment as part of `pip freeze`). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Sep 21, 2018 at 5:14 AM, Paul Moore <p.f.moore@gmail.com> wrote:
[This question isn't directed at Paul, even though it's in reply to his email.] Thanks for the good discussion, all. To further clarify the overlap in functionality between pip and pipenv: will pipenv be a strict superset in terms of what it can do? Tzu-ping said earlier that users expect pipenv to behave the same as pip, so I'm wondering if there are any areas of functionality that pip covers that pipenv won't be able to (or that it doesn't plan to). --Chris

Pipenv wraps pip usage inside a virtual environment, so pip is always available via “pipenv run pip”; in that sense Pipenv “supports” everything pip does. But as far as things Pipenv actually has wrapper commands for, it only tries to be pip's functional superset for “install” and “uninstall”; everything else is out of scope. There are constantly people advocating adding more, or having confusion between similarly-named commands (e.g. check), but that's another issue… TP

On Tue, Sep 25, 2018 at 1:25 AM, Tzu-ping Chung <uranusjr@gmail.com> wrote:
I was asking more in terms of useful functionality in the broader sense, rather than literal pip commands. Is any of pip's functionality in the broader sense out of scope? For example, I'm guessing that the information in commands like pip freeze, hash, list, and show might already be covered in a different way by pipenv analogs. For reference, I'm including the list of commands output by "pip help" below.

What I'm trying to gauge is: if the plan is for pipenv not to depend on pip, and pipenv has strictly greater functionality than pip, then what purpose will PyPA have in continuing to develop pip in addition to pipenv? This is what I was referring to in my earlier email when I said it looks like it could "split the user and maintainer base in two."

This question is personal to me because I currently spend some time improving and contributing to pip. But I may want to reevaluate that if another, more active PyPA project is in the process of duplicating that work or making pip superfluous. It seems like duplication of effort within an organization should be discouraged when there is limited person power to begin with.

--Chris

$ pip help
...
Commands:
  install      Install packages.
  download     Download packages.
  uninstall    Uninstall packages.
  freeze       Output installed packages in requirements format.
  list         List installed packages.
  show         Show information about installed packages.
  check        Verify installed packages have compatible dependencies.
  config       Manage local and global configuration.
  search       Search PyPI for packages.
  wheel        Build wheels from your requirements.
  hash         Compute hashes of package archives.
  completion   A helper command used for command completion.
  help         Show help for commands.

On Tue, 25 Sep 2018 at 19:48, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
That's not the plan, as all of pip's features for actually installing/uninstalling packages, and for introspecting the *as built* package environment, aren't things where pipenv's needs diverge from pip's. Where their needs diverge is at the dependency resolver level: pipenv needs to be able to generate a lock file for an arbitrary target environment that may not match the currently running Python interpreter, *without* necessarily installing those packages anywhere (although it may need to build wheels to get the dependency listing), whereas pip has the more concrete task of "get these packages and their dependencies installed into the currently active environment". If it helps, think of pipenv relating to pip in much the same way as pip-tools (pip-compile/pip-sync) relates to pip, just with Pipfile and Pipfile.lock instead of requirements.in and requirements.txt. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Sep 25, 2018 at 3:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's not what Tzu-ping said though. In an earlier email, he said, "If we can’t use pip internals, then yes, the plan is to not depend on pip." --Chris

We are using pip internals for things pip wasn’t implemented for. Specifically, Pipenv uses pip’s package-fetching functions to implement its platform-agnostic resolver. pip does not have this, so there’s no functional overlap here. Those utilities are used to build something that doesn’t exist in pip, so there’s no duplicated efforts. My recent focus on making sense of packaging implementations and splitting out parts of pip is exactly to prevent potential duplicated efforts. If we can’t use pip internals, let’s make things we want to use *not* internal! TP

On Tue, Sep 25, 2018 at 3:47 AM, Tzu-ping Chung <uranusjr@gmail.com> wrote:
I don't see this practice of "splitting out parts of pip" actually being followed so far though, and I want to tell you how it's leading to duplicated efforts right now -- even for the PR's I happen to be working on today.

Earlier in this thread, Dan said the pipenv maintainers are reimplementing certain pip functionality rather than splitting it out: "we basically had to start by calling pip directly, then slowly reimplement each aspect of the underlying logic using various elements in distlib/setuptools or rebuilding those." And then he listed seven libraries they are working on that contain code like this.

Today I happened to file a simple PR to pip to continue DRYing up pip's VCS-related code base and making it more testable, etc: https://github.com/pypa/pip/pull/5810

My PR adds a function called `make_vcs_requirement_url()` and refactors various parts of pip to use it: https://github.com/pypa/pip/pull/5810/files#diff-c9e9f4633cef78355bd4a581feb...

However, when I look at requirementslib (one of the libraries Dan mentioned), I see they have already reimplemented the function I added in my PR. In requirementslib, it's called `build_vcs_link()`: https://github.com/sarugaku/requirementslib/blob/d457054d969fcb5e712c8f74f9d...

So instead of splitting it out of pip and modifying pip to use that, they created a new implementation separate from pip. And now I am redoing the implementation a second time. Other VCS-related functions in that same requirementslib file (e.g. "add_ssh_scheme_to_git_uri()") overlap with VCS code in pip that I've been doing similar work on. So by creating implementations of functions separate from pip instead of splitting them out of pip, it's leading to duplicated work for people working on pip's code base.
A process of splitting something out of pip would, I think, look more like what I'm following: refactor functionality in pip *first* so it can be used separately, and then pull it out, so that you only have one implementation at any point in time. Otherwise, you wind up creating two versions of similar functionality that not only need to be created twice, but also later reconciled if the plan is to merge them back together. --Chris
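For a sense of scale, the kind of helper being discussed here is tiny. Below is a minimal sketch of what a `make_vcs_requirement_url()`-style function does -- an illustration written for this thread, not pip's or requirementslib's actual code, whose details differ:

```python
def make_vcs_requirement_url(repo_url, rev, project_name, subdir=None):
    """Build a pip-style VCS requirement string such as
    'git+https://host/repo.git@rev#egg=name&subdirectory=sub'.
    """
    # Egg fragments conventionally use underscores rather than dashes.
    egg_name = project_name.replace("-", "_")
    req = "{}@{}#egg={}".format(repo_url, rev, egg_name)
    if subdir:
        req += "&subdirectory={}".format(subdir)
    return req


print(make_vcs_requirement_url(
    "git+https://example.com/repo.git", "abc123", "my-project"))
```

The point isn't the dozen lines themselves; it's that two projects each maintaining their own dozen lines (plus tests) is exactly the duplication being described.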

On Tue, 25 Sep 2018 at 12:53, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
[...]
This is a very good point, and thanks for the concrete example. For the record, I'm also a little annoyed that pipenv have been doing all this work in isolation. There's been very little communication between the pip and pipenv projects, and I think that's contributed to the problem. I, for one, wasn't even aware for a long time that they were embedding their own copy of pip. When we made pip's internals private in pip 10, we got essentially no feedback from anyone saying that it would be a problem, and finding out that it had a sufficiently major impact on pipenv that they had to embed the old version of pip *after* the release was (to put it politely) suboptimal :-(

Having said all this, this thread is the start of an attempt to work better together, and I think we should be glad that it's happening and try to make it work. But I don't think we've got off to a particularly good start, and it will need work. In particular, I'd like to see an attempt from *both* projects to clarify our goals - I've been speaking solely for myself so far; it would be good to get the other pip developers' views and set out a sort of roadmap, and for pipenv to do the same. Paul

FYI pipenv have been vendoring pip since pretty early in the project's existence, since before 10 was released. At no point was I in touch to say that was a major issue though, because we use the api precisely as you describe, *at our own risk*. Every time I've been in touch, it's been to ask how to go about starting the conversation around sharing elements of a codebase, to which the answer has universally been 'don't use our internals, go write a library if you want'. So as a result we've mostly gone away and kept pip vendored as we started some libraries.

The internal changes didn't do much as far as I'm concerned, and as we knew the risks and had a vendored copy, we were able to upgrade on our own clock. We use this copy only to fetch things from the index for generating lockfiles. Also, there was significant publicity saying that people shouldn't use internals, and it seemed like it honestly should not be on you to worry about -- that's the whole point of vendoring it. I don't think any of us saw a point in trying to force that problem on pip given the internals changes. And pipenv was not a part of the PyPA at the start of the project, so we certainly _were_ working in isolation at that time.

To Chris' broader point, it is definitely duplicated effort and I am in full agreement, which is why I want to establish which code should be extracted and generalized and where it should be maintained. But as Paul mentioned, there also is no PyPA strategy at play here. We aren't trying to move into pip's space and vice versa AFAIU, but I'm also not too sure how much trying to point a finger at us here is helpful. You find it frustrating that we are working in isolation; I find it frustrating that I keep being pushed to go away and write libraries whenever I mention sharing functionality, which to me sounds like you want me to work in isolation. Obviously there is some communication breakdown. At the end of the day I just want to give users a consistent experience.
Hopefully this is a goal others share. As far as the pipenv roadmap goes, we are rewriting a good chunk of it, but it is essentially the atomic things we do, or would like to do, with regards to shared functionality at a high level:

- parse user input/requirements
- retrieve package metadata
- resolve dependencies (not in pip right now obviously) -- may involve building wheels/sdists
- download / build packages
- install packages (into a virtualenv sometimes)
- uninstall packages (from a virtualenv sometimes)
- list packages etc. (in a virtualenv sometimes)

We know how these are accomplished in pip, and we know how we are handling them both in pipenv and in our alternative implementations, which I believe still have some dependency on pip. For us the roadmap is basically to figure out:

- where do the abstractions live
- how do we start
- ???

Dan Ryan // pipenv maintainer gh: @techalchemy

On Tue, Sep 25, 2018 at 6:46 AM, Dan Ryan <dan@danryan.co> wrote:
Thanks, Dan. I think Donald answered this in an email much earlier in this thread -- the one he sent on Sept. 19 beginning:

"""
My general recommendation if you want a Python implementation/interface for something pip does, is:
- Open an issue on the pip repository to document your intent and to make sure that there is nobody there who is against having that functionality split out.
...
...
"""

In addition, to ensure you're taking the approach of splitting out parts of pip instead of creating yet another implementation separate from pip (leading to duplicated work, as I said), I would recommend also doing the following: file PR's in pip to refactor out various functionality that makes sense into stand-alone bits with their own tests (e.g. unit tests, etc.). This would live in pip but be disentangled from the rest of pip from a code dependency perspective. It could even be in a separate "future libraries" subdirectory if the pip committers are okay with it. (Optional: since you're already vendoring pip, you could perhaps use this factored-out code at your own risk, like you're already doing with some of pip's code base anyway.)

The above is something I started doing at a small scale -- not to break it out into a library, but for its own sake, because it makes it easier to maintain and work on pip as a code base, fix bugs, etc. Some of the nice things about this approach are that--

1. It can be done even before the steps that Donald outlined (like I'm doing), e.g. so that you don't need to wait for a PEP to be finished. It could even help in the later creation of such PEPs.
2. By doing so, as a side effect of the PR process, it automatically causes communication with pip maintainers and works towards the "splitting out parts of pip" goal -- all without duplicating any work, because both pip and pipenv developers are aware of and collaborating on these changes.
3. It is incremental as you go, ensuring that it is compatible with pip and that pip is using it, etc.
4. It improves pip in the process (as well as pipenv, if you're still vendoring in some form). In other words, the work is useful from the beginning, whether or not it's ever broken out into a separately maintained library.
5. It is forward progress towards the goal of a separate library.

--Chris

Thanks Chris, these are excellent suggestions. Can you provide a reference to a PR you’re working or have worked on so we can jumpstart the process on our end? Dan Ryan // pipenv maintainer gh: @techalchemy

On Tue, Sep 25, 2018 at 6:30 PM, Dan Ryan <dan@danryan.co> wrote:
Thanks Chris, these are excellent suggestions. Can you provide a reference to a PR you’re working or have worked on so we can jumpstart the process on our end?
Hi Dan,

The PR's I've been submitting and working on aren't *primarily* of the "split functionality out into stand-alone bits" variety, but there are some. Two such recent PR's of mine that fit that category are:

1) one from July (merged) to split out and add split_auth_from_netloc() to misc.py: https://github.com/pypa/pip/pull/5657
2) the one I mentioned in an email yesterday and just submitted (not yet merged), which splits out a function I called make_vcs_requirement_url() and adds it to misc.py: https://github.com/pypa/pip/pull/5810

My PR's have mostly been limited to the VCS part of the code: refactoring, fixing bugs, enhancements, etc.

My advice to you (which I've been following myself) would be to break changes into small, tightly-focused PR's, and to have only one outstanding at a time (or perhaps two if non-overlapping). The main reason is that there aren't too many people reviewing and merging changes (mainly just Pradyun in my case), so I don't want to overwhelm them. I've been trying to follow the maxim "go slow to go fast." Also, plan to expect (and be okay with) a fair amount of preliminary refactoring PR's before splitting something out even becomes possible. I.e. you will probably need a lot of "set up," especially if it's a gnarlier area of the code base.

--Chris

On Wed., 26 Sep. 2018, 2:40 pm Chris Jerdonek, <chris.jerdonek@gmail.com> wrote:
Note that the difficulty of doing this is a key part of why my own recommendation to the folks working on refactoring pipenv's internals was to handle the API design process reasonably independently of pip (at least initially): the point of providing a public Python-level API is to address the cases that pip itself doesn't address, and while refactoring pip to use the same common set of libraries is important to reducing the long-term collective maintenance burden, it's unlikely to be the quickest way of addressing the near-term ecosystem-level API need for projects like pip-tools, pipenv, and poetry.

So if anyone's irritated by that approach, best to focus your irritation on me for providing that advice, not on the folks doing the work of breaking out pipenv's internals as reusable libraries that can hopefully eventually be shared with other projects (including pip) as their APIs stabilise. When they say that's the advice they received, it's an accurate statement; it's just that by participating directly in distutils-sig rather than proxying through me, they're now also hearing from folks that would have favoured a different approach :)

Cheers, Nick.

P.S. At a larger meta-level, I find it's worth keeping in mind that there will always be ebbs and flows of convergence and divergence in any large-scale organisational system, and I really like http://theagilepirate.net/archives/1392 as a write-up of that in a corporate context. In an open source community context, I think the same phenomenon shows up as cycles of competition (different approaches to superficially common problems with no clear winner amongst them) and consolidation (improved understanding of the problem space resulting in clearer articulation of the design trade-offs involved, and the emergence of "obvious" default choices at particular points in the problem space).

On Mon, Oct 1, 2018 at 8:46 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
In general, I think it's better to be more open about long-term strategies within an organization and who is working on what and toward what goals, etc. That way people can make better decisions about what to work on and how to spend their time. For example, if I had known that the pipenv developers were working on reusable libraries that were intended to later be used by pip, maybe I would have chosen to help them instead of working on pip. This in turn could have helped address the community's near-term needs more quickly, or at least ensure that my own work is in concert with theirs (e.g. by preparing for the longer term "convergence," as the piece you linked to calls it). It's not a good feeling when people feel like they've been kept in the dark. As I said, it can lead to duplicated work and the feeling that one has been wasting their time by working on something that is later going to be swapped out or replaced. --Chris

On Tue., 2 Oct. 2018, 4:05 pm Chris Jerdonek, <chris.jerdonek@gmail.com> wrote:
Yeah, I'd agree with that, and the fact I wasn't making the time to better facilitate that kind of information sharing is one of the reasons it seemed like a good idea to step back from the way I had been doing things and instead try to encourage more direct communication about what folks are working on. Cheers, Nick.

On Tue, 25 Sep 2018 at 22:09, Dan Ryan <dan@danryan.co> wrote:
To Chris' broader point, it is definitely duplicated effort and I am in full agreement, which is why I want to establish which code should be extracted and generalized and where it should be maintained. But as Paul mentioned, there also is no PyPA strategy at play here. We aren't trying to move into pip's space and vice versa AFAIU, but I'm also not too sure how much trying to point a finger at us here is helpful. You find it frustrating that we are working in isolation; I find it frustrating that I keep being pushed to go away and write libraries whenever I mention sharing functionality, which to me sounds like you want me to work in isolation. Obviously there is some communication breakdown. At the end of the day I just want to give users a consistent experience. Hopefully this is a goal others share.
Yeah, miscommunication. It's not useful to dwell on mistakes made in the past, so thanks for simply stating your position and moving forward. In the interests of doing the same, I'll just say that I've always *intended* the message to be "if you need pip's internal functionality, pull it out into a library and write an API for it", not that people should reimplement what we'd done; like you and Chris say, that's wasted effort. If that didn't come across properly, that's my mistake. Donald said much the same, but gave more detail about the best way to go about that process, and Chris has added further suggestions in the message he just sent, all of which I agree with. Paul

On Tue, 25 Sep 2018 at 10:46, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
Speaking as a member of the pip development team, who has invested a lot of time in the development of pip (and still does), I can say with 100% certainty that there's no intention from me that pip is going to be "made superfluous". As a member of PyPA, I'll say that PyPA is basically a loose confederation of projects, and there's no real "overall PyPA plan" to consider here. There's likely a perception problem about the role of the PyPA that we should try to address, but that's somewhat a separate problem.

There's also something of a perception that pipenv and pip are somehow "competitors". There are a couple of aspects to that:

1. Is pipenv *actually* competing with pip? I don't think so, and I don't think that the pipenv developers do either, but personally, I'd like to see a bit more clarity in the pipenv docs on how it addresses a different use case than pip. There was some unfortunate (again, IMO) miscommunication earlier on that they need to address, but again, I think it's just that - miscommunication, not reality.
2. Regardless, is it even a problem for 2 competing projects to be under the PyPA banner? I don't think it is (see "loose confederation" above), but again, that's related to clarifying the role of the PyPA.

As far as duplication of effort is concerned, one of the points of this conversation is to ensure that the work the pipenv developers are doing can also be used by pip - that's very much because we *don't* want pip to get relegated to a legacy role. But neither pip nor (to my knowledge) pipenv supports use as a library - so code sharing *has* to be done by identifying useful code and moving it out into supported libraries. Maybe, in the long-term future, we'll end up with the PyPA managing a set of "packaging interop libraries" and a number of tools like pip, pipenv and maybe others that layer a UI on top of that functionality. But there will still be a pip project, and it will still be developing and growing.

tl;dr: pip needs all the help you can offer, and that's not going to change as a result of pipenv. Only you can choose what you're interested in working on, but don't make that decision based on any misguided idea that someone's making your work on pip obsolete. Paul

Pipenv also uses pip, as mentioned several times in the thread, and (reiterating here) the entire point of the conversation is about how both can work together on changes. That is the thrust of the whole discussion. We are actively using pip via its internals, and pip's developers (who _actively develop pip_) would like us to find an alternate approach. The discussion is about how to find one and then contribute it back to pip. Nobody is discontinuing work on pip, nobody is splitting from pip, and I would prefer if we could refrain from trying to spread this kind of inaccurate picture. I know we have had unproductive conversations on the issue tracker; please don't bring them to the mailing list. Dan Ryan // pipenv maintainer gh: @techalchemy

Wait, what? How did my apparent misunderstanding of "it's looking like things could be on track to split the user and maintainer base in two", and my explaining why I don't think all new innovation should go into pipenv, suddenly turn into "spread this kind of inaccurate picture"?
I know we have had unproductive conversations on the issue tracker, please don’t bring them to the mailing list.
This isn't about you, has absolutely NOTHING to do with you, don't make it about you. I am trying to contribute my thoughts back to the discussion, which is only peripherally concerned with pipenv, but is about the future of pip/package installation, and a comment that was made regarding pip becoming "legacy". You made me feel incredibly unwelcome to pipenv; I will no longer actively attempt to contribute back to that community. I have gone out of my way to stay away from any PyPA projects because of the actions and behaviours you showed on the pipenv tracker, and have actively encouraged others to do the same and look at other open source projects instead. Let us be crystal clear here: the way you and Kenneth have shown your colours on the pipenv issue tracker is a real shame and is turning off many potential contributors and good feedback to help improve pipenv. This post, right here, has reiterated that view. Don't contact me again.

Sorry that you feel such hostility toward the PyPA, they are certainly not responsible for my actions;
explaining why I don't think all new innovation should go into pipenv
Nobody is advocating this position
I am trying to contribute my thoughts back to the discussion, which is only peripherally concerned with pipenv, but is about the future of pip/package installation
The thread in question is a direct dialog about how we can work together on joint projects at this point, which does include pipenv and pip and several other tools; hopefully we will be able to break this down into a coherent list of smaller useful utilities which could be consumed. Unfortunately this is a mailing list and the discussion is likely to continue, so there is a high probability that my name will show up in your inbox.

Dan Ryan gh: @techalchemy <https://github.com/techalchemy> // e: dan@danryan.co

From: Bert JW Regeer [mailto:xistence@0x58.com] Sent: Thursday, September 20, 2018 9:34 PM To: Dan Ryan Cc: Tzu-ping Chung; Chris Jerdonek; Distutils; Donald Stufft Subject: Re: [Distutils] Distlib vs Packaging (Was: disable building wheel for a package)

On Sep 20, 2018, at 2:29 PM, Bert JW Regeer <xistence@0x58.com> wrote:

On Sep 20, 2018, at 12:11, Tzu-ping Chung <uranusjr@gmail.com> wrote:

On 21 Sep 2018, at 02:01, Bert JW Regeer <xistence@0x58.com> wrote:

On Sep 19, 2018, at 23:22, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future?

This'll be a sad day. pip is still used as an installer by other build systems where using pipenv is simply not a possibility.

I am not quite sure I understand why you'd think so. pip has been bearing the legacy burden for years, and if this is the future (not saying it is), it would be more like just another day in the office for pip users, since nothing is changing.

pip not seeing any improvements is something I think will be sad. I don't use pipenv, but use poetry, which uses pip behind the scenes to do installation. I also use flit.
For either of those cases I would think it sad that pipenv splits from pip, and then developers of alternate tooling around building packages (but not installing) don't get new improvements because "pip is legacy". pipenv doesn't work in various scenarios, and trying to shoehorn it into those scenarios is just wrong especially since it wasn't designed to do those things.

Bert, it sounds like you have a Code of Conduct complaint about a PyPA-maintained project -- may I formally pass that along to all of the pipenv maintainers per http://www.pypa.io/en/latest/code-of-conduct/ for followup, and ask for more specific details? You can let me know on-list or off-list. -- Sumana Harihareswara Changeset Consulting sh@changeset.nyc On Thu, Sep 20, 2018, at 9:34 PM, Bert JW Regeer wrote:

There's been a lot of great discussion here, and I'm going to try to find time to go through it, but I wanted to make explicit something that I think I was leaving implicit: depending on how vital a particular bit of functionality is to pip, we're likely going to want most libraries that are pulling functionality out of pip to live under the PyPA banner, and ideally they should be set up in a way that existing pip contributors can work on them as well. While conceptually these are becoming distinct entities, for end users they're going to be part of the nebulous thing that is pip, and for “core” bits, pip wouldn't want to lose the ability to work on that functionality directly.

Obviously there is some stuff that isn't “core” to what pip does, that we're generally fine with being owned in a way that we aren't part owners of. For instance we use requests, CacheControl, etc. The key difference there is that these are all things that aren't really specific to pip's core functionality (even though we may use them in implementing it), and so we don't need to care too much about their implementation one way or another.

We're also probably going to need/want to figure out some sort of shared requirements for things like “when do we drop support for a version of Python in xyzlib” and such.

On Sep 20, 2018, at 8:12 AM, Donald Stufft <donald@stufft.io> wrote:
Depending on how vital a particular bit of functionality is to pip, we're likely going to want most libraries that are pulling functionality out of pip to live under the PyPA banner, and ideally they should be set up in a way that existing pip contributors can work on them as well. While conceptually these are becoming distinct entities, for end users they're going to be part of the nebulous thing that is pip, and for “core” bits, pip wouldn't want to lose the ability to work on that functionality directly.
Quick clarification— That’s not to suggest that any particular one of these libraries need to move under the PyPA banner at this point. Just that as a general rule of thumb, stuff that is core to pip, we’re going to want the above before we would likely accept a PR that switches pip over to using it (again, depending on how “core” it is).

I have no issue moving things to the PyPA; the list was more of a ‘which of these is useful’ check. Dan Ryan gh: @techalchemy // e: dan@danryan.co From: Donald Stufft [mailto:donald@stufft.io] Sent: Thursday, September 20, 2018 8:15 AM To: Dan Ryan Cc: Tzu-ping Chung; Distutils Subject: Re: [Distutils] Distlib vs Packaging (Was: disable building wheel for a package)

On Wed, Sep 19, 2018 at 10:14 AM, Tzu-ping Chung <uranusjr@gmail.com> wrote:
From what Donald said, the first step is to standardize some chunk of functionality into a PEP (for functionality that doesn't already have a PEP). If you start the library before standardizing via a PEP, it seems like you're just going to exacerbate the problem by creating yet another library with its own implementation variations, introduce new compatibility problems for the future, etc. And a PEP will still need to be created later, anyways. For cases where a PEP already exists, I would also encourage you to add to the library only as you are having pip start to use it. Otherwise, there won't be an easy way to cut pip over to using it, without risking major breakage. This process would also help ensure the implementation matches pip's behavior. --Chris

On Fri, Sep 14, 2018, 07:27 Daniel Holth <dholth@gmail.com> wrote:
They certainly *can* all go in site-packages; it's just a directory, it can contain anything. And there are some concrete benefits to having all the package "stuff" together in one place: it makes the data easy to find when starting from the code, everyone already understands the namespace conventions, and it makes it harder for code and data to get inadvertently separated from each other. What's the benefit of having a separate data dir? Just aesthetics, or are there technical reasons too? Aesthetics is certainly worth *something* – beautiful is better than ugly – but OTOH it'd be really nice if there were one and only one obvious way to do it, and right now that's putting data into site-packages. This is even blessed by the core import machinery: https://docs.python.org/3.7/library/importlib.html#module-importlib.resource... (And if we do decide to double down on a data/ directory, then we should talk to Brett and Barry about getting support for finding that directory into importlib.resources.) -n
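Nathaniel's point about keeping data inside the importable package can be sketched concretely. The snippet below is only an illustration (the package name demo_pkg and the settings.cfg file are invented): it builds a tiny package on the fly and reads a bundled data file through importlib.resources, without ever needing to know the install prefix.

```python
import importlib.resources  # stdlib since Python 3.7
import os
import sys
import tempfile

# Build a throwaway package with a data file living right next to the code,
# inside the package directory itself.
tmp = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp, "demo_pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "settings.cfg"), "w") as f:
    f.write("colour = blue\n")

# Because the data sits inside the importable package, the import system
# can always locate it, whatever the install prefix happened to be.
sys.path.insert(0, tmp)
text = importlib.resources.read_text("demo_pkg", "settings.cfg")
print(text.strip())
```

The same read_text() call keeps working whether the package lives in site-packages, a --user directory, or a virtualenv, which is exactly the property a separate data/ directory gives up.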

On Fri, 14 Sep 2018 at 14:51, sashk <b@sashk.xyz> wrote:
My understanding (and I'm not an expert here, so hopefully someone else will confirm or correct) is that yes, the data directory is installed to "the installation root", which is $VIRTUAL_ENV for a virtualenv, and "something else" for non-virtualenvs (I think it's / on Unix and sys.prefix on Windows, no idea what happens for user installs). See https://github.com/pypa/pip/blob/master/src/pip/_internal/locations.py#L136 for what pip does, but basically it defers to whatever distutils/setuptools considers to be the `install_data` location. This also sounds like the sort of requirement that came up in the discussion around Daniel's proposal for more install locations, from https://mail.python.org/pipermail/distutils-sig/2015-April/026222.html, so maybe it's another motivating case for that proposal (although the discussion in that thread petered out, so someone will need to pick it up again if there's interest). Paul

Yes that feels very dangerous to me. From the github thread it sounds like setuptools *does* support it. I would be inclined to deprecate support for that in setuptools if that's the case (though obviously since setup.py is a regular python file you can write your own code to do whatever you want in it, so it's more in the spirit of a deterrent than a protection). On September 14, 2018 11:42:37 AM UTC, Jeroen Demeyer <J.Demeyer@UGent.be> wrote:

On Fri, 14 Sep 2018 at 12:43, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
The OP hasn't said, but I assumed that it was expecting to install something in a "standard" Unix location like /etc. As a Windows user, I think that's a bad idea (and regardless of what OS I use, it's clearly a non-portable idea) but I've no idea if wheels that install stuff to locations like /etc have a sensible meaning on Unix. (They obviously don't handle being installed in --user, or in a virtualenv, so they aren't what I'd call "proper" Python packages, but maybe there's still a reasonable use case here). Regardless, I do consider it a feature that wheel doesn't support installation to absolute paths. Paul

No one wants wheel to be able to install things outside of the virtualenv. What people have repeatedly asked for is the ability to install things somewhere besides $VIRTUAL_ENV/lib/python#.#/site-packages/, places like $VIRTUAL_ENV/etc/ for example. Should all the config files, documentation, data, man pages, licenses, images, go into $VIRTUAL_ENV/lib/python#.#/site-packages/? Or can we do a better job letting people put files in $VIRTUAL_ENV/xyz? On Fri, Sep 14, 2018 at 9:51 AM sashk <b@sashk.xyz> wrote:

It can be hard to predict where data went at runtime. Maybe we could record its path in .dist-info during the install. I think it may also not be clear how to put files into wheel's data directory from setup.py. If we added more categories it would allow the installer to override e.g. just the config directory rather than copying a Unix-like tree under data/ onto the install path. On Fri, Sep 14, 2018, 10:46 Tzu-ping Chung <uranusjr@gmail.com> wrote:

On Fri, 14 Sep 2018 at 16:03, Daniel Holth <dholth@gmail.com> wrote:
It can be hard to predict where data went at runtime.
I don't think it's "hard to predict". I *do* think it's badly documented/not standardised. See my previous note - pip installs into the install_data location that setuptools/distutils chooses. Ideally: a) Setuptools and/or distutils would document where that is clearly and in an easy to find location. b) There should be a standard defining "schemes" like this so that the choices aren't implementation-defined (by pip, setuptools and distutils in some weird combination). Of course "there should be..." means "someone who cares enough needs to invest time in..." :-(
Maybe we could record its path in .dist-info during the install. I think it may also not be clear how to put files into wheel's data directory from setup.py.
Better would be to have a supported library that exposes the logic pip uses (or as I said above, the standard-defined logic) to determine such paths. See https://github.com/pypa/pip/issues/5191
If we added more categories it would allow the installer to override e.g. just the config directory rather than copying a Unix-like tree under data/ onto the install path.
That's a reasonable but somewhat unrelated suggestion. I don't think it's needed for the OP's use case, but it may well be helpful for others. Paul

A corner case is where the package is importable because it is on $PYTHONPATH and so the $virtual_env at runtime is different than the one at install time. That is why it might be useful to store the data directory per-package. wheel.install.get_install_paths(package_name) shows where things would be installed according to wheel. It comes from a call to distutils. pip has its own implementation. On my machine data goes into $virtualenv in a virtualenv and into /usr on system python. On Fri, Sep 14, 2018 at 11:26 AM Paul Moore <p.f.moore@gmail.com> wrote:

On Fri, Sep 14, 2018, at 4:26 PM, Paul Moore wrote:
There is an official standard library API in the sysconfig module to find installation locations: https://docs.python.org/3/library/sysconfig.html#installation-paths Unfortunately, distutils has a copy of this logic rather than using the sysconfig module, from what I remember. Some Linux distros have patched distutils to put installed files in different locations, but have not necessarily patched sysconfig, presumably because they didn't think about it. Even if sysconfig were patched, distros may have a different location for files installed by the distro package manager and files installed by other means (Debian based distros use /usr and /usr/local for these). So there's no one data directory where you can find all files related to importable packages. (Of course, we advise against 'sudo pip install', but people still do it anyway). This may be somewhat outdated - it's been a while since I looked into this, but I don't think the relevant pieces are changing rapidly. My conclusion at the time was that the only reliable way to have data files findable at runtime was to put them inside the importable package.
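Thomas's point can be checked directly from the standard library. A quick sketch of the sysconfig API he mentions (the output paths naturally vary per system and per distro patching):

```python
import sysconfig

# The install scheme the current interpreter reports for itself. "purelib"
# is site-packages; "data" is the root that a wheel's data/ tree maps onto.
paths = sysconfig.get_paths()
for key in ("purelib", "platlib", "scripts", "data"):
    print(key, "->", paths[key])

# The named schemes (e.g. posix_prefix, posix_user, nt) differ per platform,
# which is part of why a distro patching distutils but not sysconfig diverges.
print(sorted(sysconfig.get_scheme_names()))
```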

On Tue, 18 Sep 2018 at 19:54, Thomas Kluyver <thomas@kluyver.me.uk> wrote:
Yes, if it weren't for the stuff mentioned below, that would be the one obvious way of doing things. However...
Unfortunately, distutils has a copy of this logic rather than using the sysconfig module, from what I remember. Some Linux distros have patched distutils to put installed files in different locations, but have not necessarily patched sysconfig, presumably because they didn't think about it.
(Technically, I think the history is that sysconfig was created by pulling the distutils logic out into a standalone module, but no-one modified distutils to use sysconfig - probably the usual "fear of changing anything in distutils" problem :-() Even more unfortunately, pip further layers its own logic on this (https://github.com/pypa/pip/blob/master/src/pip/_internal/locations.py#L136) - basically a "sledgehammer to crack a nut" approach of initialising a distutils install command object, and then introspecting it to find out where it'll install stuff - then hacking some special cases on top of that. I assume it's doing that to patch over the incomplete Linux distro patching.
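The "sledgehammer" introspection Paul describes can be sketched in a few lines. This is a simplified illustration of the technique only; pip's real locations module layers distro special cases on top, none of which are reproduced here.

```python
# Build a distutils "install" command object and introspect it to learn
# where files would be installed, without actually installing anything.
from distutils.dist import Distribution
from distutils.command.install import install

cmd = install(Distribution())
cmd.finalize_options()  # resolves install_lib, install_data, etc.
print("install_lib :", cmd.install_lib)
print("install_data:", cmd.install_data)
```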
Even if sysconfig were patched, distros may have a different location for files installed by the distro package manager and files installed by other means (Debian based distros use /usr and /usr/local for these). So there's no one data directory where you can find all files related to importable packages. (Of course, we advise against 'sudo pip install', but people still do it anyway).
Yeah. We don't really have any control or influence over what distros do, we just end up reacting to their changes. That's why I think we need a centralised supported library - unless the distros start working better with the Python stdlib APIs, someone needs to collect together all the workarounds and hacks needed to interact with them, and that's better done in one place than duplicating the work in multiple projects (and the fact that people have asked for pip's locations module to be exposed as an API confirms to me that there *are* multiple projects that need this).
This may be somewhat outdated - it's been a while since I looked into this, but I don't think the relevant pieces are changing rapidly. My conclusion at the time was that the only reliable way to have data files findable at runtime was to put them inside the importable package.
Agreed. There's more engagement with the distros these days (I think) but it hasn't resulted in much change in how they handle things yet. Paul

Part of the challenge here from my perspective is that even though I can envision what the end solution looks like, it's not clear how to actually get there. How do we designate someone to make decisions about what to include? How do we ensure that special-case logic isn't kept in individual project repos as it is now, but instead gets pushed back up to the main repo? From our perspective (pipenv/various other projects I maintain, at least), the implementation in pip is the minimum viable standard to which all other tooling _must_ conform, at least from a UX standpoint. Since package installation is step 0 of any developer's work, even minor workflow changes can be jarring and seriously disruptive, which is why we rely on the pip implementations a lot. It reminds me of the old adage about starting off with one problem... you don't want a new developer to have a clever idea (problem one), go to set up their new environment, and have the first thing that happens be some initial error before they've even managed to install a package (now they have two problems, and one of them has nothing to do with their actual task). As a consequence, even though there are other libraries that may provide some of this functionality, pip has the reference implementation and that contains some significant additional logic. I don't imagine that pip is going to simply adopt some new library without significant review... The substantial effort required to actually get people to review the code involved in standardizing the functionality people are 'borrowing' from pip is probably going to be a challenge, and that's before we even consider that it will be difficult getting people to agree on what should be standardized and extracted.
People are basically taking a path-of-least-resistance approach right now, which means everyone is building their own tooling as needed or simply using the tooling in pip, because they know that will always provide the same UX that the user is accustomed to (see my first point). I don't see anyone sinking significant effort into a shared library until we agree on what needs to be shared and can be confident that it will land in the various tools at some point. That said, there aren't too many of us that can put all the pieces together, so it shouldn't be that difficult to outline what we need -- anyone who wants to have a look at what most of our projects are using from pip can shoot me a message off-list and I can send you a link to the repository. Dan Ryan gh: @techalchemy // e: dan@danryan.co

Speaking for myself, generally if someone spins functionality out of pip into a dedicated library, and that library is well tested, and has done the diligence to ensure that answers aren’t changing if pip switches to that library (or if they do change, they changed purposefully and we can document it and deal with deprecation), then I don’t think there’s a whole lot of blocker there. Obviously the level of review and testing should be commensurate with the importance of the part that library controls. For instance, when we switched to using packaging’s implementation of version handling, I had spent hours compiling what the differences were going to be between PEP 440 versions and the new version handling across all of PyPI. However, for platform detection for the User-Agent we switched to a third-party library with just a cursory glance. Mainly I think the important thing, as far as pip is concerned, is for someone to identify beforehand what piece they want to carve out of pip into a library and make sure that none of the pip developers have a problem with that, then make sure that they do the diligence in ensuring that the new library matches the old behavior etc., and then submit a PR to pip that swaps us to using it. For a lot of things, moving the reference implementation into pypa/packaging (which is already bundled in pip anyway) is going to be the right answer.

Agreed. Furthermore, if people are of the opinion that pip's implementation is suitable, copying it out into packaging is likely not going to be at all controversial. Of course, it's not going to be any direct advantage to pip if that's done (we get the same functionality, just in a different place), so the people benefiting are those who want a supported API for that functionality, and it seems only reasonable to expect them to do the job of moving the code, rather than expecting the pip developers to do so. In the case of pip's location code, more time has likely been spent discussing the problem than it would actually take to make the change. (Of course, no one person has spent that much time in discussions, but it adds up - coding doesn't work that way sadly). Paul On Tue, 18 Sep 2018 at 22:03, Donald Stufft <donald@stufft.io> wrote:

This is where I think we disagree, and I feel the rhetoric is a bit harmful -- personally I don't benefit much at all; I actually don't think any individual maintainer inside the PyPA benefits much beyond having a new project to maintain, so the 'helps me vs helps you' framing isn't really the point. If it strictly helped me to add a project to my list of things to maintain, I would have done that already. The real issue here is that we all have different implementations and they create non-uniform/disjointed user experiences. Converging on a set of common elements to extract seems like step 1. I am fairly new to the PyPA, and I don't know how any of these processes actually work. But I do know that painting this as "us vs you", when my interest is actually in helping the users of packaging tools, is causing a disconnect for me anytime we engage on this -- and I'm not asking you to tackle any of this yourself, except possibly to review someone's PR down the road to swap out some internals. Dan Ryan gh: @techalchemy // e: dan@danryan.co

On Wed, 19 Sep 2018 at 00:52, Dan Ryan <dan@danryan.co> wrote:
Apologies. I misread your email, and so I was mostly addressing the issues we've seen posted to pip asking for us to simply expose the internal functions, not your comment about multiple projects implementing the logic. Sorry for that. Agreed if we already have multiple implementations, merging them is a useful thing, but the benefits are diffuse and long term, so it's the sort of thing that tends to remain on the back burner indefinitely. (One of the problems with open source is that unless something is *already* available as a library, we tend to reimplement rather than refactoring existing code out of a different project, because the cost of that interaction is high - which unfortunately I demonstrated above by my comment "people needing an API should do the work" :-(). Paul

Risking thread hijacking, I want to take this chance and ask about one particular multiple-implementation problem I found recently. What is the current situation regarding distlib vs packaging and various pieces in pip? Many parts of distlib seem to have duplicates in either packaging or pip/setuptools internals. I understand this is a historical artifact, but what is the plan going forward, and what strategy, if any, should a person take if they want to attempt merging, or collecting, pieces from existing code bases into a workable library? From what I can tell (very limited), distlib seems to contain a good baseline design of a library fulfilling the intended purpose, but is currently missing parts to be fully usable on its own. Would it be a good idea to extend it with picked parts from pip? Should I contribute directly to it, or make a (higher level) wrapper around it with those parts? Should I actually use parts from it, or from other projects (e.g. distlib.version vs packaging.version, distlib.locator or pip’s PackageFinder)? It would be extremely helpful if there is a somewhat general, high-level view of the whole situation. TP

On Wed, 19 Sep 2018 at 09:39, Tzu-ping Chung <uranusjr@gmail.com> wrote:
Risking thread hijacking, I want to take this chance and ask about one particular multiple implementation problem I found recently.
I changed the subject to keep things easier to follow. Hope that's OK.
What is the current situation regarding distlib vs packaging and various pieces in pip? Many parts of distlib seems to have duplicates in either packaging or pip/setuptools internals. I understand this is a historical artifact, but what is the plan going forward, and what strategy, if any, should a person take if they are to make the attempt of merging, or collecting pieces from existing code bases into a workable library?
Note: This is my personal view of the history only, Vinay and Donald would be better able to give definitive answers
From what I can tell (very limited), distlib seems to contain a good baseline design of a library fulfilling the intended purpose, but is currently missing parts to be fully usable on its own. Would it be a good idea to extend it with picked parts from pip? Should I contribute directly to it, or make a (higher level) wrapper around it with those parts? Should I actually use parts from it, or from other projects (e.g. distlib.version vs packaging.version, distlib.locator or pip’s PackageFinder)? It would be extremely helpful if there is a somewhat general, high-level view to the whole situation.
Distlib was created as a place to experiment with making a library-style interface to various pieces of packaging functionality. At the time it was created, there were not many standardised parts of the packaging ecosystem, so while it followed the standards where they existed, it also implemented a number of pieces of functionality that *weren't* backed by standards (obvious examples being the script creation stuff and the package finder). Packaging, on the other hand, was designed to focus strictly on implementations of agreed standards, providing reference APIs for projects to use. Pip uses both libraries, but as far as I'm aware, we'd use an API from packaging in preference to distlib. The only distlib API we use is the script maker API. Pretty much everything else in distlib, we already had an internal implementation for by the time distlib was written, so there was no benefit in changing (in contrast, the benefit in switching to packaging is "by design conformance to the relevant standards"). My recommendations would be:

1. Use packaging APIs always where they exist, even if a distlib equivalent exists.
2. Never use pip APIs, they are internal use only (Paul bangs on that old drum again :-))
3. Consider using distlib APIs for things like the locator API, because it's better than writing your own code, but be aware of the risks.

When I say risks here, the things I'd consider are:

* Distlib's APIs aren't used in many projects, so they are likely less well tested (that's a chicken-and-egg issue; people need to use them to make them better).
* Be aware that they may not behave identically to pip outside of the standardised areas.
* In particular, review the details of which locators you use - the default is very different from pip's process (although the results should be the same).

I don't know what the longer-term goals are for distlib. It hasn't yet really become the "toolkit for building packaging tools" that it could be.
Whether there's an intention for it to become that, I don't know. In some ways I'd like to turn the question round - why didn't tools like pipenv and pip-tools use distlib for their core functionality, rather than patching into pip's internals? The answers to that question might clarify better what needs to happen if distlib is to become the obvious place to find packaging functionality. Paul
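As one concrete instance of Paul's recommendation to prefer packaging APIs: PEP 440 version ordering is available from packaging.version, and its ordering matches what pip itself enforces. A short sketch (this assumes the third-party "packaging" distribution is installed, e.g. via pip install packaging):

```python
from packaging.version import Version  # reference PEP 440 implementation

# Post-releases sort after the base version; dev- and pre-releases sort
# before it, exactly as PEP 440 specifies.
assert Version("1.0") < Version("1.0.post1") < Version("1.1.dev1") < Version("1.1")
print(Version("2.0") > Version("2.0rc1"))  # a pre-release sorts earlier
```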

On Wed, 19 Sep 2018 at 19:19, Paul Moore <p.f.moore@gmail.com> wrote:
When it comes to things like pip-tools and pipenv, my experience is that users are really expecting to get the same results as they get from pip, and get upset when they differ (even if what pip is doing is arbitrary, and what the wrapper tool does is similarly arbitrary, but also different). So, using pip's internals makes more sense than attempting to explain the behavioural differences between pip and distlib. However, pipenv at least is finding that pip's behaviour doesn't necessarily match what pipenv needs (in particular, it needs much better support for working with Python installations other than the one hosting pipenv itself). Given that, and assuming Vinay is amenable to the idea, it would be nice to revisit the concept of the two layer architecture, with packaging as the lower level minimalist strictly standards compliant layer, and distlib as the higher level general purpose toolkit that brings together various other libraries (including packaging itself) under a more comprehensive API. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, 19 Sep 2018 at 10:54, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd certainly be OK with delegating more of the "common activities" to distlib. But it's a *long* way from being simple to do so, and pip would need to take a lot of care to ensure that doing so didn't result in behavioural differences. Also, we'd need to be careful of dumping too much on distlib without making the sustainability problem even worse - at the moment, as far as I know, Vinay is the sole distlib developer. Paul

I have the same experience with Pipenv as Nick’s. I would also guess another reason is the lack of knowledge—this was certainly my own before I got involved in Pipenv. There is barely any guide on how I should implement such a thing, and my developer’s instinct tells me to look at a known implementation, i.e. pip. This also ties back to the problem that pip barely uses distlib internally at all—had it used more of it, people might be pointed in the right direction. Migrating Pipenv’s internals from pip to distlib is actually the exact thing I was thinking about when I raised the question. There are, as mentioned, a lot of pieces missing in distlib. For example, distlib knows how to find a distribution, and how to install wheels, but not how a non-wheel distribution can be turned into a wheel. [1] It also has no uninstallation functionality. If I’m to glue together a working thing, I would likely need to copy/reimplement parts of pip, but where should they live? Do I add yet another layer above distlib to include them, or do I try to include them in distlib? Although distlib provides a nice basis, I feel it is still one layer below what most people want to do, e.g. install a thing by name (or URL). But would a three-layer design be too much, or should distlib have a high-level API as well? [1]: Also, while I’m having your attention—I’m trying to use the pep517 library as part of the solution to build an sdist into a wheel, but I’m hitting a bug. Could you help review my PR? :p https://github.com/pypa/pep517/pull/15 TP

On Wed, 19 Sep 2018 at 11:34, Tzu-ping Chung <uranusjr@gmail.com> wrote:
That's probably Vinay's call. IMO, whatever layer there is above packaging, it should do stuff like this.
I don't think a 3-layer approach is sensible; two layers is more than enough. Maybe not go all the way to "install something by name". Maybe APIs to:

* find what's available in a set of indexes/locations
* check if a compatible wheel is in that list
* make a wheel from an sdist
* install a wheel

It's hard to make something like this sound like anything other than "put pip into a library and make pip itself a thin shell round that library", though. At which point we come full circle round to the fact that the *reason* we don't support a pip API is that the cost of designing one, restructuring pip to expose (and use!) it, and creating documentation and tests to make sure that API is stable and properly supported, is huge. And we don't have the resources. It's not so much a technical/design or layering issue, it's really a resourcing and sustainability issue.
It's not really me that can comment, you probably need Thomas. To clarify, even though I have commit rights on the pep517 project, I'm not really a project maintainer there. My interest in it is as a consumer from pip, which only uses the hook wrapper code. So this PR is in something I've not used or looked at closely. Pip does its own build isolation, ironically enough :-( Paul

I’ve personally always planned on pulling out the last bits of what we do use distlib for in pip, and not relying on it any longer. My general plan for extracting stuff from pip and/or setuptools has always been to first standardize in a PEP (if a sufficient one doesn’t already exist) anything that makes sense as a standard, and then start either reimplementing or pulling code out of pip (or setuptools, if pip is using setuptools). When doing that I had always planned on spending a lot of effort ensuring that the behavior matches what pip is already doing (or has known, specific divergences). Now some of this already exists in distlib, but I don’t plan on using it. Part of that is because I find it easier to identify things that should be standardized but aren’t if I’m not using a big bundle of already-implemented stuff (for instance, script support needs to be standardized, but it didn’t occur to us at the time because we just used what distlib had). It also has a bunch of functionality that exists only in distlib, like attempting to use JSON to find packages (at one point there was even a locator that implicitly used a non-PyPI server; no idea if there still is), which I felt made it harder to use the library in a way that didn’t basically create a new set of implementation-defined semantics. It also copied APIs and semantics that I think *shouldn’t* have been (for instance, it has an implicitly caching resource API like setuptools does… a generally bad idea IMO, whereas the new importlib.resources is a much saner API). So in general, there are things that currently only exist in distlib or setuptools, and my personal long-term plan for pip is that we should get solid implementations of those things out of those libraries, but generally my mind puts distlib and setuptools in largely the same boat.

I feel the plan is quite solid. This, however, leaves us (who want a Python implementation and interface to do what pip does) in an interesting place. So far I can tell there are a couple of principles: 1. Do not use pip internals. 2. pip won’t be using either distlib or setuptools, so they might not match what pip does in the long run. Does this leave us only one option, to implement a library that matches what pip does (follows the standards) but is not pip? That feels quite counter-productive to me, but if that’s how things are, I’d accept it. The next step (for me) in that case would be to start working on that library. Since existing behaviours in setuptools and pip (including the parts pip uses distlib for) are likely to be standardised, I can rely on distlib for script creation, setuptools for some miscellaneous things (editable installs?), and pull (or reimplement) parts out of pip for others. Are there caveats I should look out for? TP -- Tzu-ping Chung (@uranusjr) uranusjr@gmail.com Sent from my iPhone

My general recommendation if you want a Python implementation/interface for something pip does, is:

- Open an issue on the pip repository to document your intent and to make sure that there is nobody there who is against having that functionality split out. This also gives a chance for people with familiarity in that API to mention pain points that you can solve in a new API. We can also probably give you a good sense of whether the thing you want in a library depends on other things that need splitting out first (for instance, if you said you wanted a library for installing wheels, we’d probably tell you that there is a dependency on PEP 425 tags, pip locations, and maybe others that need to be resolved first), and also whether this is something that should have a PEP first or not. Getting some rough agreement on the plan to split X thing out before you start is overall a good thing.
- Create or update a PEP if required, and get it into the provisional state.
- Make the library, either as a PR to packaging or as its own independent library. If questions come up while creating that library/PR that have to do with specific pip behaviors, go back to that original issue and ask for clarification etc.

Ideally at some point you’ll open a PR on pip that uses the new library (my suggestion is to not bundle the library in the initial PR, and just import it normally so that the PR diff doesn’t include the full bundled library until there’s agreement on it). If there’s another tool (pipenv, whatever) that is looking to use that same functionality, open a WIP PR there too that switches it to using that. Use feedback and what you learn from trying to integrate in those libraries to influence the design of the API itself. Creating a PEP and creating the library and the PRs can happen in parallel, but at least for pip, if something deserves a PEP, we’re not going to merge a PR until that PEP is generally agreed on.
However it can be supremely useful to have them all going at the same time, because you run into things that you didn't really notice until you went to actually implement it. My other big suggestion would be to be careful about how much you bite off at one time. Pip's internal code base is not the greatest, so pulling out smaller chunks at a time rather than trying to start right off pulling out a big topic is more likely to meet with success. Be cognizant of what the dependencies are for the feature you want to implement, because if it has dependencies, you'll need to pull them out first OR you'll need to design the API to invert those dependencies so they get passed in instead. I personally would be happy to at a minimum participate on any issue where someone was trying to split out some functionality from pip into a re-usable library, if not follow the development of that library directly to help guide it more closely. My hope for pip is that it ends up being the glue around a bunch of these libraries, and that it doesn't implement most of the stuff itself anymore.

On Wed, 19 Sep 2018 at 18:52, Donald Stufft <donald@stufft.io> wrote:
I basically agree with everything Donald said, and I'd also be happy to support any work along these lines. If you're looking for a place to start, I'd strongly recommend some of the foundational areas - something like pip.locations (I know there are others who have expressed an interest in this API being exposed), or pep425tags, which has the advantage of already having a standard, or something at that level. Starting with something at the level of the finder or the installer is likely to be way too much to start with, even if it feels like it would be more directly useful to you. Paul

On Wed, 19 Sep 2018 at 11:41 Paul Moore <p.f.moore@gmail.com> wrote:
And if you start with pep425tags I have a bunch of notes for you. ;) (CoC stuff has sucked almost all of my volunteer time for the past two weeks so I have not had a chance to try to write up a proposed library for PEP 425 like I had planned to.) -Brett
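To make the pep425tags discussion concrete: the job of that module is to compute an ordered list of (python tag, abi tag, platform tag) triples that the running interpreter can install. Below is a deliberately minimal sketch of the idea - the real implementation also handles ABI suffixes, manylinux, macOS version fallbacks, and a much longer preference order:

```python
import sys
import sysconfig

def interpreter_tag():
    # "cp38" for CPython 3.8, "pp36" for PyPy, etc. (simplified mapping)
    impl = {"cpython": "cp", "pypy": "pp"}.get(
        sys.implementation.name, sys.implementation.name[:2])
    return "%s%d%d" % (impl, sys.version_info.major, sys.version_info.minor)

def supported_tags():
    """Return (python, abi, platform) triples, most specific first."""
    # e.g. "linux-x86_64" -> "linux_x86_64", as wheel filenames require
    plat = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return [
        (interpreter_tag(), "none", plat),                 # platform wheel, no ABI
        ("py%d" % sys.version_info.major, "none", "any"),  # pure-Python wheel
    ]
```

A wheel is installable if any of its filename tags appears in this list, and the list order is what lets an installer prefer the most specific compatible wheel.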

Thanks for the advice; it's really helpful. Incidentally (or maybe not? I wonder if there is an underlying pattern here) the two areas I want to work on first are a) how to find a package, and b) how to choose an artifact for a given package. I think I'll start with the package discovery part and work my way from there. I'll create an issue in pypa/pip and try to outline the intention (and summarise this thread), but there are a couple of things I wish to clarify first:
1. Should dependency link processing be included? Since it is un-deprecated now, I guess the answer is yes?
2. (Maybe not an immediate issue) What formats should I include? Wheels and .tar.gz sdists, of course; what others? Eggs? .zip sdists? Are there other formats?
TP
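On the "choose an artifact" half of that question, the core of the problem is matching candidate filenames against the interpreter's ordered list of supported tags. A toy sketch follows - real wheel filenames can also carry build numbers and compound tags like py2.py3, which this ignores, and sdists need their own preference logic:

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into (name, version, python, abi, platform).
    Ignores optional build tags and compound tags for simplicity."""
    stem = filename[:-len(".whl")]
    parts = stem.split("-")
    name, version = parts[0], parts[1]
    return name, version, parts[-3], parts[-2], parts[-1]

def pick_best(filenames, supported):
    """Return the wheel whose tag triple ranks earliest in `supported`,
    or None if nothing is compatible."""
    best, best_rank = None, len(supported)
    for fn in filenames:
        if not fn.endswith(".whl"):
            continue  # sdists etc. would need separate handling
        _, _, py, abi, plat = parse_wheel_filename(fn)
        try:
            rank = supported.index((py, abi, plat))
        except ValueError:
            continue  # not compatible with this interpreter
        if rank < best_rank:
            best, best_rank = fn, rank
    return best
```

Given supported tags [("cp37", "cp37m", "manylinux1_x86_64"), ("py3", "none", "any")], a manylinux cp37 wheel beats a pure-Python wheel because it ranks earlier in the list.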

I should clarify that we have already implemented a number of these as libraries over the last several months (and I am super familiar with pip's internals by now and I'm sure TP is getting there as well). More on this below.

probably has multiple things that are dependent on getting split out first

This would be super helpful, although there is a decent chance we can make some initial headway on this aspect of it just with the first pass agreement library. Basically the entire InstallRequirement model is the single most imported item (in my experience) from pip's internals, and it is (almost) never used for installing, but often just for metadata access/normalization/parsing. I reimplemented the parsing logic for pipenv in 'requirementslib' (link below).

pulling out a big topic is more likely to meet with success. Be cognizant of what the dependencies are for the feature you want to implement, because if it has dependencies, you'll need to pull them out first before you can pull it out OR you'll need to design the API to invert those dependencies so they get passed in instead

We are super cognizant of that aspect, as I am pretty sure we are hitting this wall in a full (nearly) pip-free reimplementation of all of the pipenv internals from the ground up, including wheel building/installation, but we basically had to start by calling pip directly, then slowly reimplement each aspect of the underlying logic using various elements in distlib/setuptools or rebuilding those.
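For a sense of what "just metadata parsing" means here, a requirement string can be split into its parts in a few lines. This is a very rough sketch - the real grammar is PEP 508, and neither pip's InstallRequirement nor requirementslib works this way internally; the regex and field names are purely illustrative:

```python
import re

# Illustrative only: a tiny subset of the PEP 508 requirement grammar
# (name, optional extras, optional version specifiers, optional marker).
REQ_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)"
    r"(?:\[(?P<extras>[^\]]+)\])?"
    r"(?P<spec>(?:[<>=!~]=?[^;,]+)(?:,[<>=!~]=?[^;,]+)*)?"
    r"(?:;\s*(?P<marker>.+))?$"
)

def parse_requirement(line):
    m = REQ_RE.match(line.strip())
    if m is None:
        raise ValueError("cannot parse %r" % line)
    extras = tuple(m.group("extras").replace(" ", "").split(",")) if m.group("extras") else ()
    return {
        "name": m.group("name"),
        "extras": extras,
        "specifier": (m.group("spec") or "").replace(" ", ""),
        "marker": m.group("marker"),
    }
```

The point is that a lot of tooling only ever needs this parsed form (name, extras, specifiers, marker), never the install machinery attached to it.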
Since you mentioned following along, here's what we're working on right now:

https://github.com/sarugaku/requirementslib -- abstraction layer for parsing and converting various requirements formats (pipfile/requirements.txt/command line/InstallRequirement) and moving between all of them
https://github.com/sarugaku/resolvelib -- directed acyclic graph library for handling dependency resolution (not yet being used in pipenv)
https://github.com/sarugaku/passa -- dependency resolver/installer/pipfile manager (bulk of the logic we have been talking about is in here right now) -- I think we will probably split this back out into multiple other smaller libraries or something based on the discussion
https://github.com/sarugaku/plette -- this is a rewrite of pipfile with some additional logic/validation
https://github.com/sarugaku/shellingham -- this is a shell detection library made up of some tooling we built in pipenv for environment detection
https://github.com/sarugaku/pythonfinder -- this is a library for finding python (pep 514 compliant) by version and for finding any other executables (cross platform)
https://github.com/sarugaku/virtenv -- python api for virtualenv creation

Happy to provide access or take advice as needed on any of those. Thanks all for the receptiveness and collaboration

Dan Ryan
gh: @techalchemy // e: dan@danryan.co

From: Donald Stufft [mailto:donald@stufft.io]
Sent: Wednesday, September 19, 2018 1:52 PM
To: Tzu-ping Chung
Cc: Distutils
Subject: [Distutils] Re: Distlib vs Packaging (Was: disable building wheel for a package)
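The core idea behind the directed-acyclic-graph handling that resolvelib deals with - install every dependency before its dependents, and reject cycles - can be illustrated with a plain topological sort. This is a toy sketch, not resolvelib's actual API:

```python
def install_order(graph):
    """Topologically sort a {package: [dependencies]} mapping so every
    dependency appears before its dependents; raises on cycles."""
    order, done, in_progress = [], set(), set()

    def visit(pkg):
        if pkg in done:
            return
        if pkg in in_progress:
            raise ValueError("dependency cycle involving %r" % pkg)
        in_progress.add(pkg)
        for dep in graph.get(pkg, ()):
            visit(dep)
        in_progress.discard(pkg)
        done.add(pkg)
        order.append(pkg)

    for pkg in graph:
        visit(pkg)
    return order
```

Real dependency resolution is much harder than ordering, of course (version conflicts, backtracking), but the ordering step is what ultimately feeds an installer.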

On Wed, Sep 19, 2018 at 8:54 PM, Dan Ryan <dan@danryan.co> wrote:
Is the hope or game plan then for pipenv not to have to depend on pip? This is partly what I was trying to learn in my email to this list a month ago (on Aug. 20, with subject: "pipenv and pip"): https://mail.python.org/mm3/archives/list/distutils-sig@python.org/thread/2Q... Based on the replies, I wasn't getting that impression at the time (though I don't remember getting a clear answer), but maybe things have changed since then. It should certainly be a lot easier for pipenv to move fast since there is no legacy base of users to maintain compatibility with. However, I worry about the fracturing this will cause. In creating these libraries, from the pip tracker it doesn't look like any effort is going into refactoring pip to make use of them. This relates to the point I made earlier today about how there won't be an easy way to cut pip over to using a new library unless an effort is made from the beginning. Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future? --Chris

We have been attempting to collaborate, but there was not a clear path because there are many moving pieces. It is clear that the user experience is important, and our primary focus is on providing a consistent UX so that things that work in pip also work in pipenv. We are directly discussing the opposite of the situation you are asking about. Things are already fragmented, and we are all hoping to have a unified set of libraries to work with. This is now an attempt to find a process. I want to point out that this was essentially laid out as the only way forward, since using pip internals isn't viable and contributing random changes doesn't help us as a result. Creating libraries based on existing tooling has helped us determine what is possible, where the limitations are, and what a generic implementation might look like. Now that we have a sense of what is possible, it is a lot easier to propose changes. So given that we are:

- discussing a path to refactoring functionality out of pip and into other libraries for consistent behavior across tools
- looking at candidates for extraction
- forming concrete actions to actually get this underway
- already working on some early potential code

what is it exactly that you are looking for in continuing down this road of 'why are these two packaging tools doing different things'? The short answer is that sometimes it's for no reason, sometimes for a good reason, and sometimes for a bad reason. But is that question meaningful when the conversation is about 'how do we stop doing different things'?

Dan Ryan // pipenv maintainer
gh: @techalchemy

The resolution side of Pipenv really needs a Python API, and also cannot really use the CLI because it needs something slightly different from pip's high-level logic (Nick mentioned this briefly). If we can't use pip internals, then yes, the plan is to not depend on pip. The hope is we can share those internals with pip (either following the same standards, or using the same implementation), hence my series of questions. The installation side of Pipenv will continue to use pip directly, at least for a while even after the resolution side breaks away, since "pip install" is adequate for our purposes. There are some possible improvements if there is a lower-layer library (e.g. to avoid pip startup overhead), but that is far less important.
It should certainly be a lot easier for pipenv to move fast since there is no legacy base of users to maintain compatibility with. However, I worry about the fracturing this will cause. In creating these libraries, from the pip tracker it doesn't look like any effort is going into refactoring pip to make use of them. This relates to the point I made earlier today about how there won't be an easy way to cut pip over to using a new library unless an effort is made from the beginning. Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future?
I'm afraid the new implementation will still need to deal with compatibility issues. Users expect Pipenv to work exactly like pip, and get very angry if it does not, especially when they see it is under the PyPA organisation on GitHub. The last time Pipenv tried to explain that it does whatever arbitrary things it does, we got labelled as "toxic" (there are other issues in play, but this is IMO the ultimate cause). Whether the image is fair or not, I would most definitely want to avoid similar incidents happening again. I think Pipenv would be okay maintaining a different (from-scratch) implementation than pip's, but it would still need to do (almost) exactly what pip is doing, unless we can have people (pip maintainers or otherwise) backing the differences. Whether pip uses the new implementation or not wouldn't change the requirement :( TP

On Thu, 20 Sep 2018 at 08:01, Tzu-ping Chung <uranusjr@gmail.com> wrote:
IMO, the only way to address that is by defining standards for the behaviour. Having a standard document to point to that says "this is what's been agreed in public debate" gives both pipenv and pip a solid basis to explain why we do what we do. There will likely be corner cases where the details are implementation dependent, but again, the fact that the standard doesn't mandate behaviour is the best argument you're going to get for that. There will always be people that complain if you're not 100% bug-for-bug compatible with pip, but that's life. Obviously, any standard will have to look at pip's behaviour as a starting point (simply because pip's been around as the only implementation for so long). But simplifying and cutting out some of the cruft is part of any standards process, so it's perfectly OK to mark certain parts of what pip does now as "implementation defined" or "needs to change". Also, Dan said:
Since you mentioned following along, here's what we're working on right now:
The problem here is the same - without some sort of agreement (in the form of a documented standard[1]) that what those libraries do is "the right behaviour", it's not clear how pip can switch to using them. And promises that they "do the same as pip does" are not likely to work, for precisely the reasons Tzu-ping noted above (there's always someone who will pick up on any discrepancy, no matter how small).

Paul

PS While I don't have much time for people standing on the sidelines and telling us what we "should" do, I do think that by putting projects under the PyPA banner, we assume a responsibility for making sure we behave consistently, whether we like it or not. Interop standards documents have been how we've discharged that responsibility so far, but pipenv has such a strong overlap with pip that it opens up a lot of areas where we haven't even thought about standards yet. Managing expectations while we get things in line is not a pleasant task, but it's one we need to do.

[1] I'm at least as sick of saying "standard" as you are of hearing it. Take it to mean "everyone's agreed, anyone likely to complain afterwards has had an opportunity to speak up, and there's a record" - I'm not wedded to any particular process here.

On Sep 19, 2018, at 23:22, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
Thus, it's looking like things could be on track to split the user and maintainer base in two, with pip bearing the legacy burden and perhaps not seeing the improvements. Are we okay with that future?
This'll be a sad day. pip is still used as an installer by other build systems where using pipenv is simply not a possibility.

I am not quite sure I understand why you'd think so. pip has been bearing the legacy burden for years, and if this is the future (not saying it is), it would be more like just another day in the office for pip users, since nothing is changing.

pip not seeing any improvements is something I think will be sad. I don't use pipenv, but I use poetry, which uses pip behind the scenes to do installation. I also use flit. For either of those cases, I would think it sad if pipenv splits from pip and developers of alternate tooling around building packages (but not installing) then don't get new improvements because "pip is legacy". pipenv doesn't work in various scenarios, and trying to shoehorn it into those scenarios is just wrong, especially since it wasn't designed to do those things.

I think it's far-fetched to start thinking pip is legacy. Pipfile has had a goal from day 1 to be a format that pip would support. PEP 582 is a path forward here for providing a default location for a virtualenv [2] - it's just that everything moves slower in pip because it supports more use-cases than a tool like pipenv. What started out as a reference implementation has definitely taken on a life of its own of course and it's up to PyPA to manage that relationship and offer a good story around the tooling it's building. [1] https://github.com/pypa/pipfile#pip-integration-eventual [2] https://www.python.org/dev/peps/pep-0582/ On Thu, Sep 20, 2018 at 1:38 PM Bert JW Regeer <xistence@0x58.com> wrote:

On Thu, 20 Sep 2018 at 19:52, Michael Merickel <mmericke@gmail.com> wrote:
I think it's far-fetched to start thinking pip is legacy. Pipfile has had a goal from day 1 to be a format that pip would support. PEP 582 is a path forward here for providing a default location for a virtualenv [2] - it's just that everything moves slower in pip because it supports more use-cases than a tool like pipenv.
I don't think anyone's even spoken to the pip maintainers (yet?) about supporting the pipfile format. And no-one from the pip team has ever said that we're retiring pip in favour of pipenv. At one point, I think there was a lot of rhetoric around pipenv, but IMO it was just that, rhetoric. I'm not sure where the "everything moves slower in pip" comment comes from - pip's moving at a fair pace. I've no feel for how fast pipenv is moving (although for the parts where they use pip, they are "obviously" going to move faster in some sense, because they can use all the changes in pip and add their own :-))
What started out as a reference implementation has definitely taken on a life of its own of course and it's up to PyPA to manage that relationship and offer a good story around the tooling it's building.
As far as I'm concerned, pip and pipenv are different tools, supporting different use cases. I don't know enough about pipenv to say much more than that. The "official PyPA position" (if that's a thing, and if it's what someone is after) is probably at https://packaging.python.org/ and that document describes pip in the "Installing Packages" section, and pipenv under "Managing Application Dependencies". To me, that's a pretty clear distinction. Paul

That comes from me, I initially wrote the Pipfile as a proof of concept / sketch of an API for replacing the requirements.txt format, which Kenneth took and created pipenv from. At some point I plan on trying to push support for those ideas back into pip (not the virtual environment management bits though). That’s obviously my personal goal though, and doesn’t represent an agreed upon direction for pip.

On Fri, 21 Sep 2018 at 05:47, Donald Stufft <donald@stufft.io> wrote:
And it's one where I think there are a couple of different levels of support that are worth considering:

Q. Should pip support installing from Pipfile.lock files as well as requirements.txt files?
A. Once the lock file format stabilises, absolutely, since this is squarely in pip's "component installer" wheelhouse.

Q. Should "pip install" support saving the installed components to a Pipfile, and then regenerating Pipfile.lock?
A. This is far less of a clearcut decision, as managing updates to a file that's intended to be checked in to source control is where I draw the line between "component installer" and "application/environment dependency manager".

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
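For concreteness on the first question: installing from a Pipfile.lock would amount to flattening its "default" (and optionally "develop") section into the pinned, hashed requirements pip already understands. A rough sketch against a simplified lock document follows - the structure mirrors what pipenv emits today, but the format is not a standard, and the hash value below is a placeholder:

```python
import json

# A minimal Pipfile.lock-style document (simplified; the hash is a
# placeholder, not a real digest).
LOCKFILE = json.loads("""
{
  "_meta": {"pipfile-spec": 6},
  "default": {
    "requests": {"version": "==2.19.1", "hashes": ["sha256:abc..."]},
    "idna": {"version": "==2.7", "hashes": []}
  },
  "develop": {}
}
""")

def to_requirements(lock, section="default"):
    """Flatten one lock-file section into requirements.txt-style pins."""
    lines = []
    for name, entry in sorted(lock.get(section, {}).items()):
        line = name + entry["version"]
        for h in entry.get("hashes", []):
            line += " --hash=" + h
        lines.append(line)
    return lines
```

This is exactly the "component installer" boundary Nick describes: the lock file is produced elsewhere, and pip would only consume the resulting pins.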

On Fri, 21 Sep 2018 at 11:41, Nick Coghlan <ncoghlan@gmail.com> wrote:
Speaking as a pip developer: Where's there a good summary of the pipfile format, the pipfile.lock format, and their relationship and contrast with requirements.txt? I don't view https://github.com/pypa/pipfile as a "good summary", because it explicitly states that pipfile is intended to *replace* requirements.txt, and I disagree strongly with that. Also, pipfile is human-readable, but pipfile.lock isn't.

As far as I know, pipfile.lock is currently generated solely by pipenv - before pip consumes pipfile.lock, I'd like to see that format discussed and agreed as a formal interop standard that any tools wanting to pass data to pip (for the use case the standard describes) can use. One obvious thing I'd like to consider is changing the name to something less tool-specific - requirements.lock maybe?

As far as the pipfile format is concerned, I see that more as pipenv's human-readable input file that is used to *generate* the lock file, and I don't see it as something pip should consume directly, as that would mean pip overlapping in functionality with pipenv.

If I'm misunderstanding the relationship between pip and pipenv, or between pipenv and pipfile, I'm happy to be corrected. But can I suggest that the best way to do so would be to amend the project pages that are giving me the impressions I have above, and point me at the corrected versions? That way, we can make sure that any misinformation is corrected at source...

Paul

PS Full disclosure - I've tried to use pipenv in a couple of local projects, because of the hype about it being the "great new thing", and found it basically of no use for my requirements/workflow. So I may have a biased view of either pipenv, or of how it's being presented. I'm trying to be objective in the above, but my bias may have slipped through.
participants (16)
- Bert JW Regeer
- Brett Cannon
- Chris Jerdonek
- Dan Ryan
- Daniel Holth
- Donald Stufft
- Jeroen Demeyer
- Michael Merickel
- Nathaniel Smith
- Nick Coghlan
- Paul G
- Paul Moore
- sashk
- Sumana Harihareswara
- Thomas Kluyver
- Tzu-ping Chung