On Wed, 19 Sep 2018 at 09:39, Tzu-ping Chung <uranusjr@gmail.com> wrote:
Risking thread hijacking, I want to take this chance and ask about one particular multiple implementation problem I found recently.
I changed the subject to keep things easier to follow. Hope that's OK.
What is the current situation regarding distlib vs packaging and various pieces in pip? Many parts of distlib seem to have duplicates in either packaging or pip/setuptools internals. I understand this is a historical artifact, but what is the plan going forward, and what strategy, if any, should a person take if they are to make the attempt of merging, or collecting pieces from existing code bases into a workable library?
Note: This is my personal view of the history only; Vinay and Donald
would be better able to give definitive answers.
From what I can tell (very limited), distlib seems to contain a good baseline design of a library fulfilling the intended purpose, but is currently missing parts to be fully usable on its own. Would it be a good idea to extend it with picked parts from pip? Should I contribute directly to it, or make a (higher level) wrapper around it with those parts? Should I actually use parts from it, or from other projects (e.g. distlib.version vs packaging.version, distlib.locator or pip’s PackageFinder)? It would be extremely helpful if there is a somewhat general, high-level view to the whole situation.
Distlib was created as a place to experiment with making a
library-style interface to various pieces of packaging functionality.
At the time it was created, there were not many standardised parts of
the packaging ecosystem, so while it followed the standards where they
existed, it also implemented a number of pieces of functionality that
*weren't* backed by standards (obvious examples being the script
creation stuff and the package finder).
Packaging, on the other hand, was designed to focus strictly on
implementations of agreed standards, providing reference APIs for
projects to use.
Pip uses both libraries, but as far as I'm aware, we'd use an API from
packaging in preference to distlib. The only distlib API we use is the
script maker API. Pretty much everything else in distlib, we already
had an internal implementation for by the time distlib was written, so
there was no benefit in changing (in contrast, the benefit in
switching to packaging is "by design conformance to the relevant
standards").
My recommendations would be:
1. Use packaging APIs always where they exist, even if a distlib
equivalent exists.
2. Never use pip APIs, they are internal use only (Paul bangs on that
old drum again :-))
3. Consider using distlib APIs for things like the locator API,
because it's better than writing your own code, but be aware of the
risks.
When I say risks here, the things I'd consider are:
* Distlib's APIs aren't used in many projects, so they are likely less
well tested (that's a chicken and egg issue, people need to use them
to make them better).
* Be aware that they may not behave identically to pip outside of the
standardised areas.
* In particular, review the details of which locators you use - the
default is very different from pip's process (although the results
should be the same).
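
As a rough illustration of recommendation 1, here is a minimal sketch of
using packaging for name normalisation, version ordering and specifier
matching. It assumes the third-party packaging library is installed. The
project name and the candidate version list are made up for the example;
in a real tool the candidates would come from whatever finder or locator
you use.

```python
# Sketch only: assumes the third-party "packaging" library is installed.
from packaging.requirements import Requirement
from packaging.utils import canonicalize_name
from packaging.version import Version

# Hypothetical candidate versions, standing in for a finder's results.
candidates = ["1.0", "1.2.post1", "2.0rc1", "1.10", "2.0"]

# Parse a requirement string (the project name is invented).
req = Requirement("Example_Pkg>=1.2,<2.0")

# Normalise the project name so that differently spelled names compare equal.
name = canonicalize_name(req.name)

# Version objects sort by version ordering rules, not as plain strings.
versions = sorted(Version(v) for v in candidates)

# Keep only the candidates that satisfy the requirement's specifier.
matching = [str(v) for v in req.specifier.filter(versions)]

print(name)
print([str(v) for v in versions])
print(matching)
```

Note that 1.10 sorts after 1.2.post1, which plain string comparison would
get wrong, and that the pre-release 2.0rc1 is not selected by "<2.0".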
I don't know what the longer term goals are for distlib. It's not yet
really become the "toolkit for building packaging tools" that it could
be. Whether there's an intention for it to become that, I don't know.
In some ways I'd like to turn the question round - why didn't tools
like pipenv and pip-tools use distlib for their core functionality,
rather than patching into pip's internals? The answers to that
question might clarify better what needs to happen if distlib is to
become the obvious place to find packaging functionality.
Paul