[Distutils] New PEP : dependency specification

Robert Collins robertc at robertcollins.net
Thu Nov 5 22:32:41 EST 2015


Since we ended up with a hard dependency on this for the bootstrap
thing (regardless of 'smaller step' or not) - I've broken this out of
PEP 426, made it an encoding of the current status quo rather than an
aspirational change. Since it has a dependency on markers, I had to
choose whether to block on James' marker PEP, contribute to that, or
include it. I think on balance it makes sense to have it in one
document since the markers bit is actually quite shallow, so I've done
that (after discussing with James). This could perhaps replace PEP 496
then, or be given a new number.

Donald has graciously agreed to be a BDFL-delegate for it.

The PR for it is https://github.com/pypa/interoperability-peps/pull/56

Full text follows.

:PEP: XX
:Title: Dependency specification for Python Software Packages
:Version: $Revision$
:Last-Modified: $Date$
:Author: Robert Collins <rbtcollins at hp.com>
:BDFL-Delegate: Donald Stufft <donald at stufft.io>
:Discussions-To: distutils-sig <distutils-sig at python.org>
:Status: Draft
:Type: Standards Track
:Content-Type: text/x-rst
:Created: 11-Nov-2015
:Post-History: XX


Abstract
========

This PEP specifies the language used to describe dependencies for packages.
It draws a border at the edge of describing a single dependency - the
different sorts of dependencies and when they should be installed are a higher
level problem. The intent is to provide a building block for higher layer
specifications.

The job of a dependency is to enable tools like pip [#pip]_ to find the right
package to install. Sometimes this is very loose - just specifying a name, and
sometimes very specific - referring to a specific file to install. Sometimes
dependencies are only relevant on one platform, or only some versions are
acceptable, so the language permits describing all these cases.

The language defined is a compact line based format which is already in
widespread use in pip requirements files, though we do not specify the command
line option handling that those files permit. There is one caveat - the
URL reference form, specified in PEP-440 [#pep440]_, is not actually
implemented in pip, but since PEP-440 is accepted, we use that format rather
than pip's current native format.

Motivation
==========

Any specification in the Python packaging ecosystem that needs to consume
lists of dependencies needs to build on an approved PEP for such, but
PEP-426 [#pep426]_ is mostly aspirational - and there are already existing
implementations of the dependency specification which we can instead adopt.
The existing implementations are battle proven and user friendly, so adopting
them is arguably much better than approving an aspirational, unconsumed, format.

Specification
=============

Examples
--------

All features of the language shown with a name based lookup::

    requests \
        [security] >= 2.8.1, == 2.8.* \
        ; python_version < "2.7.10" \
        # Fix HTTPS on older Python versions.

A minimal URL based lookup::

    pip @ https://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686

Concepts
--------

A dependency specification always specifies a distribution name. It may
include extras, which expand the dependencies of the named distribution to
enable optional features. The version installed can be controlled using
version limits, or giving the URL to a specific artifact to install. Finally
the dependency can be made conditional using environment markers.

Grammar
-------

We first cover the grammar briefly and then drill into the semantics of each
section later.

A distribution specification is written in ASCII text. We use ABNF [#abnf]_ to
provide a precise grammar. Specifications may have comments starting with a
'#' and running to the end of the line::

    comment       = "#" *(WSP / VCHAR)

Specifications may be spread across multiple lines if desired using
continuations - a single backslash followed by a new line ('\\\\n')::

    CSP           = 1*(WSP / ("\\" LF))

Versions may be specified according to the PEP-440 [#pep440]_ rules. (Note:
URI is defined in std-66 [#std66]_)::

    version-cmp   = "<" / "<=" / "!=" / "==" / ">=" / ">" / "~=" / "==="
    version       = 1*( DIGIT / ALPHA / "-" / "_" / "." / "*" )
    versionspec   = version-cmp version *("," version-cmp version)
    urlspec       = "@" URI

Environment markers allow making a specification only take effect in some
environments::

    marker-op     = version-cmp / "in" / "not in"
    python-str-c  = (WSP / ALPHA / DIGIT / "(" / ")" / "." / "{" / "}" /
                    "-" / "_" / "*")
    python-str    = "'" *(python-str-c / DQUOTE) "'"
    python-str    =/ DQUOTE *(python-str-c / "'") DQUOTE
    marker-var    = python-str / "python_version" / "python_full_version" /
                    "os_name" / "sys_platform" / "platform_release" /
                    "platform_version" / "platform_machine" /
                    "platform_python_implementation" / "implementation_name" /
                    "implementation_version" / "platform_dist_name" /
                    "platform_dist_version" / "platform_dist_id"
    marker-expr   = "(" marker ")" / (marker-var [marker-op marker-var])
    marker        = marker-expr *( ("and" / "or") marker-expr)
    name-marker   = ";" *CSP marker
    url-marker    = ";" 1*CSP marker

Optional components of a distribution may be specified using the extras
field::

    identifier    = 1*( DIGIT / ALPHA / "-" / "_" )
    name          = identifier
    extras        = "[" identifier *("," identifier) "]"

Giving us a rule for name based requirements::

    name_req      = name [CSP extras] [CSP versionspec] [CSP name-marker]

And a rule for direct reference specifications::

    url_req       = name [CSP extras] urlspec [CSP url-marker]

Leading to the unified rule that can specify a dependency::

    specification = (name_req / url_req) [CSP comment]
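As an illustration of how the name based rule fits together, here is a
minimal sketch that splits one specification into its parts. It assumes
comments and continuations have already been removed, it does not handle the
direct reference form, and the names ``parse_name_req``, ``_NAME_REQ`` and
``_CLAUSE`` are hypothetical helpers rather than part of any existing tool.
The regular expressions only approximate the ABNF above.

```python
import re

# Approximates: name [CSP extras] [CSP versionspec] [CSP name-marker]
_NAME_REQ = re.compile(
    r"""
    ^\s*
    (?P<name>[A-Za-z0-9_-]+)
    \s*
    (?:\[(?P<extras>[^\]]*)\])?
    \s*
    (?P<versionspec>[^;@]*?)
    \s*
    (?:;\s*(?P<marker>.*))?
    $
    """,
    re.VERBOSE,
)

# Longer operators are listed first so "===" is not read as "==".
_CLAUSE = re.compile(r"^(===|==|~=|!=|<=|>=|<|>)\s*([A-Za-z0-9._*-]+)$")


def parse_name_req(line):
    """Split a name based specification into name, extras, versions, marker."""
    match = _NAME_REQ.match(line)
    if match is None:
        raise ValueError("not a name based specification: %r" % line)
    extras = [
        extra.strip()
        for extra in (match.group("extras") or "").split(",")
        if extra.strip()
    ]
    versions = []
    spec = match.group("versionspec").strip()
    if spec:
        for part in spec.split(","):
            clause = _CLAUSE.match(part.strip())
            if clause is None:
                raise ValueError("bad version clause: %r" % part)
            versions.append((clause.group(1), clause.group(2)))
    marker = match.group("marker")
    return {
        "name": match.group("name"),
        "extras": extras,
        "versions": versions,
        "marker": marker.strip() if marker is not None else None,
    }
```

Applied to the name based example from the Examples section, this returns the
name ``requests``, the single extra ``security``, two version clauses and the
marker text, which is left unevaluated.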

Whitespace
----------

Non line-breaking whitespace is optional and has no semantic meaning.

A line break indicates the end of a specification. Specifications can be
continued across multiple lines using a continuation.

Comments
--------

A specification can have a comment added to it by starting the comment with a
"#". After a "#" the rest of the line can contain any text whatsoever.
Continuations within a comment are ignored.
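The whitespace, continuation and comment rules can be sketched as a small
pre-processing step. This is one possible reading, not a reference
implementation: ``logical_lines`` is a hypothetical helper, and it assumes a
"#" only starts a comment at the start of a line or after whitespace (following
``[CSP comment]`` in the grammar), so that a URL fragment such as ``#sha1=...``
is left intact.

```python
import re

# A comment starts at "#" when it begins the line or follows whitespace.
_COMMENT = re.compile(r"(^|\s)#.*$")


def logical_lines(text):
    """Yield one string per specification, without comments or continuations."""
    pending = []
    for raw in text.splitlines():
        line, comments_removed = _COMMENT.subn("", raw)
        # A trailing backslash continues the specification, unless it was
        # part of a comment (continuations within a comment are ignored).
        if comments_removed == 0 and line.endswith("\\"):
            pending.append(line[:-1])
            continue
        pending.append(line)
        joined = " ".join(part.strip() for part in pending).strip()
        pending = []
        if joined:
            yield joined
    if pending:
        # Input ended on a continuation; emit what was collected.
        joined = " ".join(part.strip() for part in pending).strip()
        if joined:
            yield joined
```

Joining the parts with a single space is a simplification that relies on non
line-breaking whitespace having no semantic meaning.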

Names
-----

Python distribution names are currently defined in PEP-345 [#pep345]_. Names
act as the primary identifier for distributions. They are present in all
dependency specifications, and are sufficient to be a specification on their
own.

Extras
------

An extra is an optional part of a distribution. Distributions can specify as
many extras as they wish, and each extra results in the declaration of
additional dependencies of the distribution **when** the extra is used in a
dependency specification. For instance::

    requests[security]

Extras union in the dependencies they define with the dependencies of the
distribution they are attached to. The example above would result in requests
being installed, and requests' own dependencies, and also any dependencies that
are listed in the "security" extra of requests.

If multiple extras are listed, all the dependencies are unioned together.
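The union behaviour can be shown with a short sketch. The metadata layout (a
dict with ``requires`` and ``extras`` keys), the function name
``dependencies_for`` and the dependency names are all made up for
illustration; they are not the real metadata of any distribution. Ignoring an
extra that the distribution does not define is also an assumption, since this
PEP does not say what should happen in that case.

```python
def dependencies_for(metadata, extras=()):
    """Return the base dependencies unioned with those of each named extra."""
    dependencies = set(metadata["requires"])
    for extra in extras:
        # Assumption: an extra the distribution does not define adds nothing.
        dependencies |= set(metadata["extras"].get(extra, ()))
    return dependencies


# Hypothetical metadata for an imaginary distribution.
example = {
    "requires": ["dep-a", "dep-b"],
    "extras": {
        "security": ["dep-c", "dep-d"],
        "socks": ["dep-d", "dep-e"],
    },
}
```

With this data, asking for both ``security`` and ``socks`` yields five distinct
dependencies, because ``dep-d`` appears in both extras and is only counted
once.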

Versions
--------

See PEP-440 [#pep440]_ for more detail on both version numbers and version
comparisons. Version specifications limit the versions of a distribution that
can be used. They only apply to distributions looked up by name, rather than
via a URL. Version comparisons are also used in the markers feature.

Environment Markers
-------------------

Environment markers allow a dependency specification to provide a rule that
describes when the dependency should be used. For instance, consider a package
that needs argparse. In Python 2.7 argparse is always present. On older Python
versions it has to be installed as a dependency. This can be expressed as so::

    argparse;python_version<"2.7"

A marker expression evaluates to either True or False. When it evaluates to
False, the dependency specification should be ignored.

The marker language is a subset of Python itself, chosen for the ability to
safely evaluate it without running arbitrary code that could become a security
vulnerability. Markers were first standardised in PEP-345 [#pep345]_. This PEP
fixes some issues that were observed in the design described in PEP-426 [#pep426]_.
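Because the marker language is a subset of Python, one way to evaluate it
without running arbitrary code is to parse it with the standard ``ast`` module
and accept only a few node types. The sketch below is an illustration under
several assumptions: ``evaluate_marker`` is a hypothetical name, every
comparison is done as a plain Python string comparison (no PEP-440 version
rules), a bare variable with no operator is rejected, and the "~=" and "==="
operators cannot be handled this way because they are not Python syntax. It
relies on ``ast.Constant``, so it needs Python 3.8 or later.

```python
import ast
import operator

_OPS = {
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.GtE: operator.ge,
    ast.Gt: operator.gt,
    ast.In: lambda left, right: left in right,
    ast.NotIn: lambda left, right: left not in right,
}


def _operand(node, environment):
    """Resolve a marker variable or a quoted string to its string value."""
    if isinstance(node, ast.Name):
        return environment[node.id]
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    raise ValueError("unsupported operand in marker")


def _evaluate(node, environment):
    if isinstance(node, ast.BoolOp):
        results = [_evaluate(value, environment) for value in node.values]
        return all(results) if isinstance(node.op, ast.And) else any(results)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        compare = _OPS.get(type(node.ops[0]))
        if compare is None:
            raise ValueError("unsupported operator in marker")
        left = _operand(node.left, environment)
        right = _operand(node.comparators[0], environment)
        return compare(left, right)
    # Calls, attribute access, arithmetic and so on are all rejected.
    raise ValueError("unsupported syntax in marker")


def evaluate_marker(marker, environment):
    """Evaluate a marker string against a dict of marker variable values."""
    tree = ast.parse(marker.strip(), mode="eval")
    return _evaluate(tree.body, environment)
```

Nothing from the marker is ever executed: the expression is only parsed, and
any node outside the small whitelist raises ``ValueError``.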

Comparisons in marker expressions are typed by the comparison operator.  The
<marker-op> operators that are not in <version-cmp> perform the same as they
do for strings in Python. The <version-cmp> operators use the PEP-440
[#pep440]_ version comparison rules if both sides are valid versions. If
either side is not a valid version, then the comparison falls back to the same
behaviour as for strings in Python if the operator exists in Python. For
those operators which are not defined in Python, the result should be False.
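The fallback rule for the <version-cmp> operators can be sketched as follows.
This is a simplification, not the PEP-440 algorithm: the hypothetical
``_as_version`` helper treats only dotted runs of digits (such as "2.7.10") as
valid versions, whereas PEP-440 accepts many more forms, and ``compare`` is a
made-up name.

```python
import operator
import re

# The <version-cmp> operators that also exist for strings in Python.
_PYTHON_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "!=": operator.ne,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}

# Stand-in for "is a valid PEP-440 version": dotted integers only.
_SIMPLE_VERSION = re.compile(r"^\d+(\.\d+)*$")


def _as_version(text):
    if _SIMPLE_VERSION.match(text):
        return tuple(int(part) for part in text.split("."))
    return None


def compare(left, op, right):
    """Apply a <version-cmp> operator to two strings."""
    left_version, right_version = _as_version(left), _as_version(right)
    if left_version is None or right_version is None:
        # Fall back to string behaviour; "~=" and "===" have none in Python.
        python_op = _PYTHON_OPS.get(op)
        return python_op(left, right) if python_op else False
    # Pad with zeros so that "2.7" and "2.7.0" compare equal.
    width = max(len(left_version), len(right_version))
    padded_left = left_version + (0,) * (width - len(left_version))
    padded_right = right_version + (0,) * (width - len(right_version))
    if op in _PYTHON_OPS:
        return _PYTHON_OPS[op](padded_left, padded_right)
    if op == "===":
        return left == right
    if op == "~=":
        # Compatible release: at least the given version, same leading parts.
        return (
            len(right_version) >= 2
            and padded_left >= padded_right
            and padded_left[: len(right_version) - 1] == right_version[:-1]
        )
    raise ValueError("unknown operator: %r" % op)
```

The difference matters in practice: as plain strings ``"2.7.10" < "2.7.9"`` is
True in Python, while ``compare("2.7.10", "<", "2.7.9")`` is False because the
final components are compared as the numbers 10 and 9.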

The variables in the marker grammar such as "os_name" resolve to values looked
up in the Python runtime. If a particular value is not available (such as
``sys.implementation.name`` in versions of Python prior to 3.3, or
``platform.dist()`` on non-Linux systems), the default value will be used.

.. list-table::
   :header-rows: 1

   * - Marker
     - Python equivalent
     - Sample values
     - Default if unavailable
   * - ``os_name``
     - ``os.name``
     - ``posix``, ``java``
     - ""
   * - ``sys_platform``
     - ``sys.platform``
     - ``linux``, ``darwin``, ``java1.8.0_51``
     - ""
   * - ``platform_release``
     - ``platform.release()``
     - ``3.14.1-x86_64-linode39``, ``14.5.0``, ``1.8.0_51``
     - ""
   * - ``platform_machine``
     - ``platform.machine()``
     - ``x86_64``
     - ""
   * - ``platform_python_implementation``
     - ``platform.python_implementation()``
     - ``CPython``, ``Jython``
     - ""
   * - ``implementation_name``
     - ``sys.implementation.name``
     - ``cpython``
     - ""
   * - ``platform_version``
     - ``platform.version()``
     - ``#1 SMP Fri Apr 25 13:07:35 EDT 2014``

       ``Java HotSpot(TM) 64-Bit Server VM, 25.51-b03, Oracle Corporation``

       ``Darwin Kernel Version 14.5.0: Wed Jul 29 02:18:53 PDT 2015; root:xnu-2782.40.9~2/RELEASE_X86_64``
     - ""
   * - ``platform_dist_name``
     - ``platform.dist()[0]``
     - ``Ubuntu``
     - ""
   * - ``platform_dist_version``
     - ``platform.dist()[1]``
     - ``14.04``
     - ""
   * - ``platform_dist_id``
     - ``platform.dist()[2]``
     - ``trusty``
     - ""
   * - ``python_version``
     - ``platform.python_version()[:3]``
     - ``3.4``, ``2.7``
     - "0"
   * - ``python_full_version``
     - see definition below
     - ``3.4.0``, ``3.5.0b1``
     - "0"
   * - ``implementation_version``
     - see definition below
     - ``3.4.0``, ``3.5.0b1``
     - "0"

The ``python_full_version`` and ``implementation_version`` marker variables
are derived from ``sys.version_info`` and ``sys.implementation.version``
respectively, in accordance with the following algorithm::

    def format_full_version(info):
        version = '{0.major}.{0.minor}.{0.micro}'.format(info)
        kind = info.releaselevel
        if kind != 'final':
            version += kind[0] + str(info.serial)
        return version

    python_full_version = format_full_version(sys.version_info)
    implementation_version = format_full_version(sys.implementation.version)

``python_full_version`` will typically correspond to ``sys.version.split()[0]``.

If a particular version number value is not available (such as
``sys.implementation.version`` in versions of Python prior to 3.3), the
corresponding marker variable returned by setuptools will be set to ``0``.
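Putting the table and the defaults together, a tool could build the full set
of marker variables roughly as below. This is a sketch: ``default_environment``
is a hypothetical name, and ``getattr`` is used to detect values that a given
interpreter does not provide (``platform.dist()`` is absent from Python 3.8
and later, for example), in which case the defaults from the table are used.

```python
import os
import platform
import sys


def format_full_version(info):
    version = '{0.major}.{0.minor}.{0.micro}'.format(info)
    kind = info.releaselevel
    if kind != 'final':
        version += kind[0] + str(info.serial)
    return version


def default_environment():
    """Return a dict mapping each marker variable to its string value."""
    implementation = getattr(sys, "implementation", None)
    if implementation is not None:
        implementation_name = implementation.name
        implementation_version = format_full_version(implementation.version)
    else:
        # Defaults from the table: "" for names, "0" for versions.
        implementation_name = ""
        implementation_version = "0"

    dist = getattr(platform, "dist", None)
    dist_name, dist_version, dist_id = dist() if dist else ("", "", "")

    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_release": platform.release(),
        "platform_version": platform.version(),
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "implementation_name": implementation_name,
        "implementation_version": implementation_version,
        "platform_dist_name": dist_name,
        "platform_dist_version": dist_version,
        "platform_dist_id": dist_id,
        # As in the table; note that [:3] yields "3.1" on Python 3.10.
        "python_version": platform.python_version()[:3],
        "python_full_version": format_full_version(sys.version_info),
    }
```

The resulting dict can be handed to a marker evaluator as the source of
variable values.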

Backwards Compatibility
=======================

Most of this PEP is already widely deployed and thus offers no compatibility
concerns.

There are however two key points where the PEP differs from the deployed base.

Firstly, PEP-440 direct references haven't actually been deployed in the wild,
but they were designed to be compatibly added, and there are no known
obstacles to adding them to pip or other tools that consume the existing
dependency metadata in distributions.

Secondly, PEP-426 markers, which have had some reasonable deployment,
particularly in wheels and pip, will handle version comparisons with
``python_version`` "2.7.10" differently. Specifically, in 426 "2.7.10" is less
than "2.7.9". This backward incompatibility is deliberate. We are also
defining new operators - "~=" and "===", and new variables - the
``platform_dist_*`` variables which are not present in older marker
implementations. The variables will fall back to "" on those implementations,
permitting reasonably graceful upgrade. The new version comparisons will cause
errors, so adoption may require waiting some time for deployment to be
widespread.

Rationale
=========

In order to move forward with any new PEPs that depend on environment markers,
we needed a specification that included them.

The requirement specifier EBNF is lifted from setuptools pkg_resources
documentation, since we can't sensibly depend on a de facto standard.


References
==========

.. [#pip] pip, the recommended installer for Python packages
   (http://pip.readthedocs.org/en/stable/)

.. [#pep345] PEP-345, Python distribution metadata version 1.2.
   (https://www.python.org/dev/peps/pep-0345/)

.. [#pep426] PEP-426, Python distribution metadata.
   (https://www.python.org/dev/peps/pep-0426/)

.. [#pep440] PEP-440, Version Identification and Dependency Specification.
   (https://www.python.org/dev/peps/pep-0440/)

.. [#abnf] ABNF specification.
   (https://tools.ietf.org/html/rfc5234)

.. [#std66] The URI specification.
   (https://tools.ietf.org/html/rfc3986)

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

