[Distutils] Maintaining a curated set of Python packages

Freddy Rietdijk freddyrietdijk at fridh.nl
Thu Dec 1 04:45:57 EST 2016


I would like to propose that, as a community, we jointly maintain a curated
set of Python packages that are known to work together. These packages
would receive security updates for some time, and every couple of months a
new major release of the curated set would become available. This idea is
inspired by Haskell LTS, so maybe we should call this PyPI LTS?

So why a PyPI LTS?

PyPI makes available all versions of packages that were uploaded, and by
default installers like pip will try to use the latest available versions
of packages, unless told otherwise. With a requirements.txt file (or a
future pipfile.lock) we can pin the requirements of an environment, and
with setup.py those of a package, as tightly as we like. This makes a more
reproducible environment possible and also fixes the API for developers.
Pinning requirements is often a manual job, although one could use
pip freeze or other tools.

A common problem is that two packages in a certain environment require
different versions of the same dependency. With a curated set of packages,
developers could be encouraged to test against the latest stable and
nightly releases of the curated package set, thereby increasing
compatibility between different packages, something I think we all want.
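
To make the conflict concrete, here is a minimal sketch that finds dependencies pinned to more than one version within a single environment. It assumes exact pins only (no version ranges); the function name `find_conflicts`, the package names, and the version numbers are purely illustrative.

```python
from collections import defaultdict


def find_conflicts(requirements):
    """Return {dependency: {version: [requiring packages]}} for every
    dependency that is pinned to more than one version."""
    wanted = defaultdict(lambda: defaultdict(list))
    for package, pins in requirements.items():
        for dependency, version in pins.items():
            wanted[dependency][version].append(package)
    return {
        dependency: dict(versions)
        for dependency, versions in wanted.items()
        if len(versions) > 1
    }


# Hypothetical environment: two packages that disagree on one dependency.
requirements = {
    "package-a": {"requests": "2.11.1", "six": "1.10.0"},
    "package-b": {"requests": "2.12.3", "six": "1.10.0"},
}

conflicts = find_conflicts(requirements)
print(conflicts)
```

A curated set would aim to guarantee that this check comes back empty for every combination of packages in the set.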

Having a compatible set of packages is not only interesting for developers,
but also for downstream distributions. All distributions try to find a set
of packages that work together and release them. This is a lot of work, and
I think it would be to everyone's benefit if we tried to solve this issue
together.

A possible solution

Downstream, that is developers and distributions, will need a set of
packages that are known to work together. At minimum this would consist of,
per package, the name of the package and its version, but for
reproducibility I would propose adding the filename and hash as well.
Because there isn't any reliable method to extract the requirements of a
package, I propose also including `setup_requires`, `install_requires`, and
`tests_require` explicitly. That way, distributions can automatically build
recipes for the packages (although non-Python dependencies would still have
to be resolved by the distribution).
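
As a rough illustration, one entry of the package set could look like the record below. This is only a sketch: the class name `Entry`, the choice of SHA-256, the JSON layout, and the example package are my assumptions, not a settled format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Entry:
    """One package in the curated set (hypothetical layout)."""
    name: str
    version: str
    filename: str
    sha256: str
    setup_requires: list = field(default_factory=list)
    install_requires: list = field(default_factory=list)
    tests_require: list = field(default_factory=list)


def sha256_of(data):
    """Hex digest of the archive contents, used to verify a download."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes stand in for a real source archive.
payload = b"placeholder archive contents"

entry = Entry(
    name="example-package",
    version="1.0.0",
    filename="example-package-1.0.0.tar.gz",
    sha256=sha256_of(payload),
    install_requires=["six"],
    tests_require=["pytest"],
)

serialized = json.dumps(asdict(entry), indent=2, sort_keys=True)
print(serialized)
```

With the three requirement lists spelled out, a distribution could generate a recipe without having to execute setup.py.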

The package set would be released as lts-YYYY-MM-REVISION, and developers
can choose to track a specific revision, but would typically be asked to
track only lts-YYYY-MM which would resolve to the latest REVISION.
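
Resolving lts-YYYY-MM to the latest REVISION could work roughly as follows. This sketch assumes that REVISION is a plain non-negative integer and that the list of published releases is already at hand; the function name `resolve` and the release names are made up for the example.

```python
import re

# Full release names look like lts-YYYY-MM-REVISION (REVISION assumed numeric).
RELEASE = re.compile(r"^lts-(\d{4})-(\d{2})-(\d+)$")


def resolve(spec, releases):
    """Resolve 'lts-YYYY-MM' to its highest revision.

    A full 'lts-YYYY-MM-REVISION' name is returned unchanged if it exists.
    """
    if spec in releases:
        return spec
    candidates = []
    for release in releases:
        match = RELEASE.match(release)
        if match and release.rsplit("-", 1)[0] == spec:
            # Compare revisions numerically so that 10 sorts after 9.
            candidates.append((int(match.group(3)), release))
    if not candidates:
        raise LookupError("no release matches {!r}".format(spec))
    return max(candidates)[1]


# Hypothetical list of published releases.
releases = ["lts-2016-12-0", "lts-2016-12-1", "lts-2016-12-10", "lts-2017-02-0"]

latest = resolve("lts-2016-12", releases)
pinned = resolve("lts-2017-02-0", releases)
print(latest, pinned)
```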

Because dependencies vary per Python language version, interpreter, and
operating system, we would need one of these sets for each combination. I
therefore propose having a source which evaluates to, say, a TOML/JSON file
per version/interpreter/OS. How this source file should be written I don't
know; while I think the Nix expression language is an excellent choice for
this, it is not possible for everyone to use and is therefore likely not an
option.
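
To show what "evaluating" such a source could mean, here is a small sketch in plain Python rather than Nix. The source format (a list of packages with an optional `only` condition), the file naming, and the package names and versions are all assumptions made up for this example.

```python
import itertools
import json

# Hypothetical source: packages without "only" apply to every target.
SOURCE = [
    {"name": "six", "version": "1.10.0"},
    {"name": "enum34", "version": "1.1.6", "only": {"python": ["2.7"]}},
    {"name": "pywin32", "version": "220", "only": {"os": ["windows"]}},
]


def applies(package, target):
    """True if every condition of the package matches the target."""
    conditions = package.get("only", {})
    return all(target[key] in allowed for key, allowed in conditions.items())


def evaluate(source, pythons, interpreters, systems):
    """Return {filename: JSON text}, one file per version/interpreter/OS."""
    result = {}
    for python, interpreter, system in itertools.product(
        pythons, interpreters, systems
    ):
        target = {"python": python, "interpreter": interpreter, "os": system}
        packages = {
            p["name"]: p["version"] for p in source if applies(p, target)
        }
        filename = "{}-{}-{}.json".format(interpreter, python, system)
        result[filename] = json.dumps(packages, indent=2, sort_keys=True)
    return result


files = evaluate(SOURCE, ["2.7", "3.5"], ["cpython"], ["linux", "windows"])
for filename in sorted(files):
    print(filename)
```

The point is only that a single source can fan out into many small, static files that tools can consume without needing the evaluator themselves.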

Open questions

There are still plenty of open questions.

- Who decides when a package is updated that would break dependents? This
is an issue all distributions face, so maybe we should involve them.
- How would this be integrated with pip / virtualenv / pipfile.lock /
requirements.txt / setup.py? See e.g.

References to Haskell LTS

Here are several links to some interesting documents on how Haskell LTS
works:
- A blog post describing what Haskell LTS is:
- Rules regarding uploading and breaking packages:
- The actual LTS files https://github.com/fpco/lts-haskell

What do you think of this proposal? Would you be interested in this as a
developer or packager?
