New subject: Keyword meanings [was: Accept just PEP-0426]

Nov. 19, 2012

I think this PEP is a significant improvement over its predecessor. It
represents features like extras (Provides-Extra) and build requirements
(Setup-Requires-Dist) that are in use in the Python community but cannot be
represented in older versions of the format. It also finally specifies a
UTF-8 encoding, removes RFC 822, provides an extension mechanism, and allows
the description to be placed in the document payload.

PEP 426 doesn't have anything to do with the Wheel PEPs 425 and 427, other
than that its features are necessary to usefully represent a large number
of existing Python packages. How about moving this one along so that we can
focus on the other two?

I'm not sure what the Post-History should be. We have been talking about it
for a while.

Thanks,

Daniel Holth

PEP: 426
Title: Metadata for Python Software Packages 1.3
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Holth <dholth@fastmail.fm>
Discussions-To: Distutils SIG
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30 Aug 2012

Abstract
========

This PEP describes a mechanism for adding metadata to Python distributions.
It includes specifics of the field names, and their semantics and
usage.

This document specifies version 1.3 of the metadata format.
Version 1.0 is specified in PEP 241.
Version 1.1 is specified in PEP 314.
Version 1.2 is specified in PEP 345.

Version 1.3 of the metadata format adds fields designed to make
third-party packaging of Python Software easier and defines a
formal extension mechanism.  The fields are "Setup-Requires-Dist",
"Provides-Extra", and "Extension".  This version also adds the `extra`
variable to the `environment markers` specification and allows the
description to be placed into a payload section.

Metadata Files
==============

The syntax defined in this PEP is for use with Python distribution
metadata files. The file format is a simple UTF-8 encoded Key: value
format with case-insensitive keys and no maximum line length, followed by
a blank line and an arbitrary payload.  It is parseable by the ``email``
module with an appropriate ``email.policy.Policy()``.

When ``metadata`` is a Unicode string,
``email.parser.Parser().parsestr(metadata)`` is a serviceable parser.
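
A minimal sketch of that approach follows. The ``sample`` document is
made up for illustration; only ``Parser().parsestr()``, header lookup,
``get_all()`` and ``get_payload()`` from the standard ``email`` package
are used.

```python
from email.parser import Parser

# Hypothetical metadata document: headers, a blank line, then the payload.
sample = (
    "Metadata-Version: 1.3\n"
    "Name: BeagleVote\n"
    "Version: 1.0a2\n"
    "Summary: A module for collecting votes from beagles.\n"
    "Requires-Dist: pkginfo\n"
    "Requires-Dist: zope.interface (>3.5.0)\n"
    "\n"
    "BeagleVote\n"
    "==========\n"
)

message = Parser().parsestr(sample)

name = message["name"]                           # keys are case-insensitive
requirements = message.get_all("Requires-Dist")  # a multiple-use field
description = message.get_payload()              # text after the blank line
```

Fields that may occur more than once are best read with ``get_all()``,
since plain indexing returns only one of the values.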

There are two standard locations for these metadata files:

* the ``PKG-INFO`` file included in the base directory of Python
  source distribution archives (as created by the distutils ``sdist``
  command)
* the ``.dist-info/METADATA`` files in a Python installation database, as
  described in PEP 376.

Other tools involved in Python distribution may also use this format.

Encoding
========

Metadata 1.3 files are UTF-8 with the restriction that keys must be
ASCII. Parser implementations should be aware that older versions of
the Metadata specification do not specify an encoding.

Fields
======

This section specifies the names and semantics of each of the
supported metadata fields.

In a single Metadata 1.3 file, fields marked with "(optional)" may occur
0 or 1 times.  Fields marked with "(multiple use)" may be specified
0, 1 or more times.  Only "Metadata-Version", "Name", "Version", and
"Summary" must appear exactly once.  The fields may appear in any order
within the file.
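
The occurrence rules above could be checked along these lines. This is
only a sketch: the function name ``check_cardinality`` and the message
wording are assumptions, and ``OPTIONAL_ONCE`` simply lists the fields
this document marks "(optional)".

```python
from email.parser import Parser

# Fields that must appear exactly once.
REQUIRED_ONCE = ("Metadata-Version", "Name", "Version", "Summary")

# Fields marked "(optional)" in this document: 0 or 1 occurrences.
OPTIONAL_ONCE = (
    "Description", "Keywords", "Home-page", "Download-URL", "Author",
    "Author-email", "Maintainer", "Maintainer-email", "License",
    "Requires-Python",
)


def check_cardinality(message):
    """Return a list of human-readable problems (empty if none)."""
    problems = []
    for field in REQUIRED_ONCE:
        count = len(message.get_all(field, []))
        if count != 1:
            problems.append(
                "%s must appear exactly once (found %d)" % (field, count))
    for field in OPTIONAL_ONCE:
        count = len(message.get_all(field, []))
        if count > 1:
            problems.append(
                "%s may appear at most once (found %d)" % (field, count))
    return problems


# Hypothetical input: "Name" is repeated and "Summary" is missing.
broken = Parser().parsestr(
    "Metadata-Version: 1.3\nName: a\nName: b\nVersion: 1.0\n\n")
problems = check_cardinality(broken)
```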

Metadata-Version
::::::::::::::::

Version of the file format; "1.3" is the only legal value.

Example::

    Metadata-Version: 1.3

Name
::::

The name of the distribution.

Example::

    Name: BeagleVote

Version
:::::::

A string containing the distribution's version number.  This
field must be in the format specified in PEP 386.

Example::

    Version: 1.0a2

Summary
:::::::

A one-line summary of what the distribution does.

Example::

    Summary: A module for collecting votes from beagles.

Platform (multiple use)
:::::::::::::::::::::::

A Platform specification describing an operating system supported by
the distribution which is not listed in the "Operating System" Trove
classifiers.
See "Classifier" below.

Examples::

    Platform: ObscureUnix
    Platform: RareDOS

Supported-Platform (multiple use)
:::::::::::::::::::::::::::::::::

Binary distributions containing a metadata file will use the
Supported-Platform field in their metadata to specify the OS and
CPU for which the binary distribution was compiled.  The semantics of
the Supported-Platform field are not specified in this PEP.

Example::

    Supported-Platform: RedHat 7.2
    Supported-Platform: i386-win32-2791

Description (optional, deprecated)
::::::::::::::::::::::::::::::::::

A longer description of the distribution that can run to several
paragraphs.  Software that deals with metadata should not assume
any maximum size for this field.

The contents of this field can be written using reStructuredText
markup [1]_.  For programs that work with the metadata, supporting
markup is optional; programs can also display the contents of the
field as-is.  This means that authors should be conservative in
the markup they use.

Since a line separator immediately followed by another line separator
indicates the end of the headers section, any line separators in the
description must be suffixed by whitespace to indicate continuation.

Since Metadata 1.3 the recommended place for the description is in the
payload section of the document, after the last header.  The description
does not need to be reformatted when it is included in the payload.
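
A reader that accepts both locations might look like the sketch below.
The helper name ``get_description`` is hypothetical, and the handling of
the deprecated header (keep the first line, dedent the continuation
lines) mirrors the demo in the Appendix rather than any rule stated in
this PEP.

```python
import textwrap
from email.parser import Parser


def get_description(message):
    """Prefer the payload; fall back to the deprecated header."""
    payload = message.get_payload()
    if payload and payload.strip():
        return payload
    header = message.get("Description")
    if header is None:
        return None
    lines = header.splitlines()
    # Continuation lines carry indentation that is not part of the text.
    rest = textwrap.dedent("\n".join(lines[1:]))
    return "\n".join([lines[0]] + rest.splitlines())


# Hypothetical documents showing both styles.
in_header = Parser().parsestr(
    "Name: x\nDescription: Title\n    =====\n    \n    Body text.\n\n")
in_payload = Parser().parsestr("Name: x\n\nTitle\n=====\n")
```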

Keywords (optional)
:::::::::::::::::::

A list of additional keywords to be used to assist searching
for the distribution in a larger catalog.

Example::

    Keywords: dog puppy voting election

Home-page (optional)
::::::::::::::::::::

A string containing the URL for the distribution's home page.

Example::

    Home-page: http://www.example.com/~cschultz/bvote/

Download-URL (optional)
:::::::::::::::::::::::

A string containing the URL from which this version of the distribution
can be downloaded.  (This means that the URL can't be something like
".../BeagleVote-latest.tgz", but instead must be ".../BeagleVote-0.45.tgz".)

Author (optional)
:::::::::::::::::

A string containing the author's name at a minimum; additional
contact information may be provided.

Example::

    Author: C. Schultz, Universal Features Syndicate,
            Los Angeles, CA <cschultz@peanuts.example.com>

Author-email (optional)
:::::::::::::::::::::::

A string containing the author's e-mail address.  It contains a name
and e-mail address in the RFC 5322 recommended ``Address Specification``
format.

Example::

    Author-email: "C. Schultz" <cschultz@example.com>

Maintainer (optional)
:::::::::::::::::::::

A string containing the maintainer's name at a minimum; additional
contact information may be provided.

Note that this field is intended for use when a project is being
maintained by someone other than the original author:  it should be
omitted if it is identical to ``Author``.

Example::

    Maintainer: C. Schultz, Universal Features Syndicate,
            Los Angeles, CA <cschultz@peanuts.example.com>

Maintainer-email (optional)
:::::::::::::::::::::::::::

A string containing the maintainer's e-mail address.  It has the same
format as ``Author-email``.

Note that this field is intended for use when a project is being
maintained by someone other than the original author:  it should be
omitted if it is identical to ``Author-email``.

Example::

    Maintainer-email: "C. Schultz" <cschultz@example.com>

License (optional)
::::::::::::::::::

Text indicating the license covering the distribution where the license
is not a selection from the "License" Trove classifiers. See
"Classifier" below.  This field may also be used to specify a
particular version of a license which is named via the ``Classifier``
field, or to indicate a variation or exception to such a license.

Examples::

    License: This software may only be obtained by sending the
            author a postcard, and then the user promises not
            to redistribute it.

    License: GPL version 3, excluding DRM provisions

The full text of the license would normally be included in a separate
file.

Classifier (multiple use)
:::::::::::::::::::::::::

Each entry is a string giving a single classification value
for the distribution.  Classifiers are described in PEP 301 [2]_.

Examples::

    Classifier: Development Status :: 4 - Beta
    Classifier: Environment :: Console (Text Based)

Requires-Dist (multiple use)
::::::::::::::::::::::::::::

Each entry contains a string naming some other distutils
project required by this distribution.

The format of a requirement string is identical to that of a
distutils project name (e.g., as found in the ``Name:`` field),
optionally followed by a version declaration within parentheses.

The distutils project names should correspond to names as found
on the `Python Package Index`_.

Version declarations must follow the rules described in
`Version Specifiers`_.

Examples::

    Requires-Dist: pkginfo
    Requires-Dist: PasteDeploy
    Requires-Dist: zope.interface (>3.5.0)
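
Splitting such a requirement string into its parts could be sketched as
follows. The function name ``parse_requirement`` is hypothetical; the
optional part after a semicolon is the environment marker described
under "Environment markers", returned here unparsed.

```python
def parse_requirement(value):
    """Split 'name (version); marker' into (name, version, marker).

    Missing parts are returned as None.
    """
    requirement, _, marker = value.partition(";")
    requirement = requirement.strip()
    version = None
    if requirement.endswith(")"):
        # Everything inside the parentheses is the version declaration.
        name, _, rest = requirement.partition("(")
        version = rest[:-1].strip()
    else:
        name = requirement
    return name.strip(), version, marker.strip() or None


parsed = parse_requirement("zope.interface (>3.5.0)")
```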

Setup-Requires-Dist (multiple use)
::::::::::::::::::::::::::::::::::

Like Requires-Dist, but names dependencies needed while the
distribution's distutils / packaging `setup.py` / `setup.cfg` is run.
Commonly used to generate a manifest from version control.

Example::

    Setup-Requires-Dist: custom_setup_command

Dependencies mentioned in `Setup-Requires-Dist` may be installed exclusively
for setup and are not guaranteed to be available at run time.

Provides-Dist (multiple use)
::::::::::::::::::::::::::::

Each entry contains a string naming a Distutils project which
is contained within this distribution.  This field *must* include
the project identified in the ``Name`` field, followed by the
version in parentheses: ``Name (Version)``.

A distribution may provide additional names, e.g. to indicate that
multiple projects have been bundled together.  For instance, source
distributions of the ``ZODB`` project have historically included
the ``transaction`` project, which is now available as a separate
distribution.  Installing such a source distribution satisfies
requirements for both ``ZODB`` and ``transaction``.

A distribution may also provide a "virtual" project name, which does
not correspond to any separately-distributed project:  such a name
might be used to indicate an abstract capability which could be supplied
by one of multiple projects.  E.g., multiple projects might supply
RDBMS bindings for use by a given ORM:  each project might declare
that it provides ``ORM-bindings``, allowing other projects to depend
only on having at least one of them installed.

A version declaration may be supplied and must follow the rules described
in `Version Specifiers`_. The distribution's version number will be implied
if none is specified.

Examples::

    Provides-Dist: OtherProject
    Provides-Dist: AnotherProject (3.4)
    Provides-Dist: virtual_package

Obsoletes-Dist (multiple use)
:::::::::::::::::::::::::::::

Each entry contains a string describing a distutils project's distribution
which this distribution renders obsolete, meaning that the two projects
should not be installed at the same time.

Version declarations can be supplied.  Version numbers must be in the
format specified in `Version Specifiers`_.

The most common use of this field will be in case a project name
changes, e.g. Gorgon 2.3 gets subsumed into Torqued Python 1.0.
When you install Torqued Python, the Gorgon distribution should be
removed.

Examples::

    Obsoletes-Dist: Gorgon
    Obsoletes-Dist: OtherProject (<3.0)

Requires-Python (optional)
::::::::::::::::::::::::::

This field specifies the Python version(s) that the distribution is
guaranteed to be compatible with.

Version numbers must be in the format specified in `Version Specifiers`_.

Examples::

    Requires-Python: 2.5
    Requires-Python: >2.1
    Requires-Python: >=2.3.4
    Requires-Python: >=2.5,<2.7

Requires-External (multiple use)
::::::::::::::::::::::::::::::::

Each entry contains a string describing some dependency in the
system on which the distribution is to be used.  This field is intended to
serve as a hint to downstream project maintainers, and has no
semantics which are meaningful to the ``distutils`` distribution.

The format of a requirement string is a name of an external
dependency, optionally followed by a version declaration within
parentheses.

Because they refer to non-Python software releases, version numbers
for this field are **not** required to conform to the format
specified in PEP 386:  they should correspond to the
version scheme used by the external dependency.

Notice that there is no particular rule on the strings to be used.

Examples::

    Requires-External: C
    Requires-External: libpng (>=1.5)

Project-URL (multiple use)
::::::::::::::::::::::::::

A string containing a label and a browsable URL for the project, separated
by the last occurrence of comma and space ", ".

Example::

    Project-URL: Bug, Issue Tracker, http://bitbucket.org/tarek/distribute/issues/

The label is free text.
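
Because the label itself may contain ", ", the split has to be made at
the last occurrence. A minimal sketch, with ``split_project_url`` as a
hypothetical helper name:

```python
def split_project_url(value):
    """Split a Project-URL value into (label, url) at the last ', '."""
    label, separator, url = value.rpartition(", ")
    if not separator:
        raise ValueError("expected 'label, url': %r" % (value,))
    return label, url


label, url = split_project_url(
    "Bug, Issue Tracker, http://bitbucket.org/tarek/distribute/issues/")
```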

Provides-Extra (multiple use)
:::::::::::::::::::::::::::::

A string containing the name of an optional feature. Must be printable
ASCII, not containing whitespace, comma (,), or square brackets [].
May be used to make a dependency conditional on whether the optional
feature has been requested.

Example::

    Name: beaglevote
    Provides-Extra: pdf
    Requires-Dist: reportlab; extra == 'pdf'
    Requires-Dist: nose; extra == 'test'
    Requires-Dist: sphinx; extra == 'doc'

A second distribution requests an optional feature by placing the feature
name inside square brackets after the distribution name, and can request
multiple features by separating them with a comma (,).

The full set of requirements is the union of the `Requires-Dist` sets
evaluated with `extra` set to `None` and then to the name of each
requested feature.

Example::

    Requires-Dist: beaglevote[pdf]
        -> requires beaglevote, reportlab

    Requires-Dist: beaglevote[test, doc]
        -> requires beaglevote, sphinx, nose

Two feature names `test` and `doc` are reserved to mark dependencies that
are needed for running automated tests and generating documentation,
respectively.

It is legal to specify `Provides-Extra` without referencing it in any
`Requires-Dist`. It is an error to request a feature name that has
not been declared with `Provides-Extra`.
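
The union rule above can be sketched as follows. This is a deliberately
narrow illustration: it only understands markers of the exact form
``extra == 'name'`` and treats any other marker as unsatisfied, which a
real implementation would not do. The function name
``requirements_for`` is hypothetical.

```python
import re

# Matches only the simple form: extra == 'name'
_EXTRA_MARKER = re.compile(r"""^extra\s*==\s*['"]([^'"]+)['"]$""")


def requirements_for(requires_dist, requested_extras=()):
    """Return the requirements active for the requested extras.

    Entries without a marker are always active (extra is None);
    entries guarded by extra == 'x' are active when 'x' was requested.
    """
    active = []
    for entry in requires_dist:
        requirement, _, marker = entry.partition(";")
        requirement, marker = requirement.strip(), marker.strip()
        if not marker:
            wanted = True
        else:
            match = _EXTRA_MARKER.match(marker)
            wanted = bool(match) and match.group(1) in requested_extras
        if wanted and requirement not in active:
            active.append(requirement)
    return active


# Requires-Dist values of a hypothetical "beaglevote" distribution.
beaglevote_requires = [
    "pkginfo",
    "reportlab; extra == 'pdf'",
    "nose; extra == 'test'",
    "sphinx; extra == 'doc'",
]
with_pdf = requirements_for(beaglevote_requires, ["pdf"])
```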

Extension (multiple use)
::::::::::::::::::::::::

An ASCII string, not containing whitespace or the / character, that
indicates the presence of extended metadata. Additional tags defined by
an `Extension: Chili` must be of the form `Chili/Name`::

    Extension: Chili
    Chili/Type: Poblano
    Chili/Heat: Mild

An implementation might iterate over all the declared `Extension:`
fields to invoke the processors for those extensions.  As the order of
the fields is not used, the `Extension: Chili` field may appear before
or after its declared tags `Chili/Type:` etc.
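
One way an implementation might collect the tags for each declared
extension is sketched below. The function name ``extension_fields`` is
hypothetical, and the prefix comparison here is case-sensitive for
simplicity.

```python
from email.parser import Parser


def extension_fields(message):
    """Map each declared extension to a dict of its 'Ext/Tag' fields."""
    extensions = {name: {} for name in message.get_all("Extension", [])}
    for key, value in message.items():
        prefix, separator, tag = key.partition("/")
        if separator and prefix in extensions:
            extensions[prefix][tag] = value
    return extensions


# The declaration deliberately appears after one of its tags.
chili = Parser().parsestr(
    "Chili/Type: Poblano\nExtension: Chili\nChili/Heat: Mild\n\n")
grouped = extension_fields(chili)
```

Collecting the ``Extension`` declarations first is what makes the result
independent of field order.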

Version Specifiers
==================

Version specifiers are a series of conditional operators and
version numbers, separated by commas.  Conditional operators
must be one of "<", ">", "<=", ">=", "==" and "!=".

Any number of conditional operators can be specified, e.g.
the string ">1.0, !=1.3.4, <2.0" is a legal version declaration.
The comma (",") is equivalent to the **and** operator.

Each version number must be in the format specified in PEP 386.

When a version is provided, it always includes all versions that
start with the same value. For example, the "2.5" version of Python
will include versions like "2.5.2" or "2.5.3". Pre- and post-releases
are excluded in that case, so versions like "2.5a1" are not included
when "2.5" is used. If only the first version of the range is
required, it has to be given explicitly. In our example, it will be
"2.5.0".

Notice that some projects might omit the ".0" suffix for the first release
of the "2.5.x" series:

- 2.5
- 2.5.1
- 2.5.2
- etc.

In that case, "2.5.0" will have to be explicitly used to avoid any confusion
with the "2.5" notation, which represents the full range. It is a
recommended practice to use schemes of the same length for a series to
completely avoid this problem.
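
As a rough illustration of the matching rules described in this section,
the sketch below handles only final releases written as dotted integers.
Pre- and post-release handling per PEP 386 is left out, and padding
shorter versions with zeros before ordered comparisons is an assumption
of this sketch. The names ``matches``, ``_parse`` and ``_pad`` are
hypothetical.

```python
import operator

_ORDERED = {"<": operator.lt, "<=": operator.le,
            ">": operator.gt, ">=": operator.ge}


def _parse(version):
    """'2.5.1' -> (2, 5, 1); only dotted integers are supported."""
    return tuple(int(part) for part in version.strip().split("."))


def _pad(a, b):
    """Pad the shorter tuple with zeros so both have equal length."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def matches(version, specifier):
    """True if a final release satisfies every comma-separated clause."""
    candidate = _parse(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        for symbol in ("<=", ">=", "==", "!=", "<", ">"):
            if clause.startswith(symbol):
                target = _parse(clause[len(symbol):])
                break
        else:
            # A bare version behaves like "==".
            symbol, target = "==", _parse(clause)
        starts_with = candidate[:len(target)] == target
        if symbol == "==":
            ok = starts_with
        elif symbol == "!=":
            ok = not starts_with
        else:
            ok = _ORDERED[symbol](*_pad(candidate, target))
        if not ok:
            return False
    return True


in_series = matches("2.5.2", "2.5")
```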

Some examples:

- ``Requires-Dist: zope.interface (3.1)``: any version that starts with 3.1,
  excluding post or pre-releases.
- ``Requires-Dist: zope.interface (==3.1)``: equivalent to
  ``Requires-Dist: zope.interface (3.1)``.
- ``Requires-Dist: zope.interface (3.1.0)``: any version that starts with
  3.1.0, excluding post or pre-releases. Since that particular project
  doesn't use more than 3 digits, it also means "only the 3.1.0 release".
- ``Requires-Python: 3``: any Python 3 version, no matter which one,
  excluding post or pre-releases.
- ``Requires-Python: >=2.6,<3``: any version of Python 2.6 or 2.7, including
  post releases of 2.6 and pre and post releases of 2.7. It excludes
  pre-releases of Python 3.
- ``Requires-Python: 2.6.2``: equivalent to ">=2.6.2,<2.6.3", so this
  includes only Python 2.6.2. Of course, if Python were numbered with 4
  digits, it would include all versions of the 2.6.2 series.
- ``Requires-Python: 2.5.0``: equivalent to ">=2.5.0,<2.5.1".
- ``Requires-Dist: zope.interface (3.1,!=3.1.3)``: any version that starts
  with 3.1, excluding post or pre-releases of 3.1 *and* excluding any
  version that starts with "3.1.3". For this particular project, this
  means "any version of the 3.1 series but not 3.1.3". This is equivalent
  to ">=3.1,!=3.1.3,<3.2".

Environment markers
===================

An **environment marker** is a marker that can be added at the end of a
field after a semi-colon (";"), to add a condition about the execution
environment.

Here are some examples of fields using such markers::

   Requires-Dist: pywin32 (>1.0); sys.platform == 'win32'
   Obsoletes-Dist: pywin31; sys.platform == 'win32'
   Requires-Dist: foo (1,!=1.3); platform.machine == 'i386'
   Requires-Dist: bar; python_version == '2.4' or python_version == '2.5'
   Requires-External: libxslt; 'linux' in sys.platform

The micro-language behind this is a simple subset of Python: it compares
only strings, with the ``==`` and ``in`` operators (and their opposites),
and with the ability to combine expressions. Parentheses are supported
for grouping.

The pseudo-grammar is ::

    EXPR [in|==|!=|not in] EXPR [or|and] ...

where ``EXPR`` is any of the following:

- python_version = '%s.%s' % (sys.version_info[0], sys.version_info[1])
- python_full_version = sys.version.split()[0]
- os.name = os.name
- sys.platform = sys.platform
- platform.version = platform.version()
- platform.machine = platform.machine()
- platform.python_implementation = platform.python_implementation()
- a free string, like ``'2.4'``, or ``'win32'``
- extra = (name of requested feature) or None

Notice that ``in`` is restricted to strings, meaning that it is not possible
to use other sequences like tuples or lists on the right side.

The fields that benefit from this marker are:

- Requires-Python
- Requires-External
- Requires-Dist
- Setup-Requires-Dist
- Provides-Dist
- Obsoletes-Dist
- Classifier

(The `extra` variable is only meaningful for Requires-Dist.)
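
An evaluator for this micro-language might be sketched with the ``ast``
module, as below. The function names ``default_environment`` and
``evaluate_marker`` are hypothetical, the sketch assumes Python 3.8 or
later (for ``ast.Constant``), and treating ``in`` against a ``None``
value of ``extra`` as false is an assumption not stated in this PEP.
Nothing is ever passed to ``eval()``; unsupported syntax raises
``ValueError``.

```python
import ast
import os
import platform
import sys


def default_environment(extra=None):
    """Build the variables listed above for the running interpreter."""
    return {
        "python_version": "%s.%s" % (sys.version_info[0],
                                     sys.version_info[1]),
        "python_full_version": sys.version.split()[0],
        "os.name": os.name,
        "sys.platform": sys.platform,
        "platform.version": platform.version(),
        "platform.machine": platform.machine(),
        "platform.python_implementation": platform.python_implementation(),
        "extra": extra,
    }


def _value(node, environment):
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value                      # a free string
    if isinstance(node, ast.Name):
        return environment[node.id]            # e.g. python_version
    if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
        return environment["%s.%s" % (node.value.id, node.attr)]
    raise ValueError("unsupported expression in marker")


def _evaluate(node, environment):
    if isinstance(node, ast.BoolOp):
        results = [_evaluate(child, environment) for child in node.values]
        return all(results) if isinstance(node.op, ast.And) else any(results)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        left = _value(node.left, environment)
        right = _value(node.comparators[0], environment)
        operator_node = node.ops[0]
        if isinstance(operator_node, ast.Eq):
            return left == right
        if isinstance(operator_node, ast.NotEq):
            return left != right
        if isinstance(operator_node, (ast.In, ast.NotIn)):
            contained = (left is not None and right is not None
                         and left in right)
            if isinstance(operator_node, ast.In):
                return contained
            return not contained
    raise ValueError("unsupported syntax in marker")


def evaluate_marker(marker, environment=None):
    """Evaluate a marker string against an environment mapping."""
    if environment is None:
        environment = default_environment()
    tree = ast.parse(marker.strip(), mode="eval")
    return _evaluate(tree.body, environment)


# A fixed, made-up environment keeps the example reproducible.
fake_windows = dict(default_environment(), **{
    "sys.platform": "win32", "os.name": "nt", "python_version": "2.5"})
needs_pywin32 = evaluate_marker("sys.platform == 'win32'", fake_windows)
```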

Summary of Differences From PEP 345
===================================

* Metadata-Version is now 1.3.

* Values are now expected to be UTF-8.

* A payload (containing the description) may appear after the headers.

* Added `extra` to environment markers.

* Most fields are now optional.

* Changed fields:

  - Description
  - Project-URL
  - Requires-Dist

* Added fields:

  - Extension
  - Provides-Extra
  - Setup-Requires-Dist

References
==========


.. [1] reStructuredText markup:
   http://docutils.sourceforge.net/

.. _`Python Package Index`: http://pypi.python.org/pypi/

.. [2] PEP 301:
   http://www.python.org/dev/peps/pep-0301/

Appendix
========

Parsing and generating the Metadata 1.3 serialization format using
Python 3.3::

    # Metadata 1.3 demo
    from email.generator import Generator
    from email.parser import Parser
    from email.policy import Compat32
    from email.utils import _has_surrogates

    class MetadataPolicy(Compat32):
        max_line_length = 0
        continuation_whitespace = '\t'

        def _sanitize_header(self, name, value):
            if not isinstance(value, str):
                return value
            if _has_surrogates(value):
                raise NotImplementedError()
            else:
                return value

        def _fold(self, name, value, sanitize):
            body = ((self.linesep+self.continuation_whitespace)
                    .join(value.splitlines()))
            return ''.join((name, ': ', body, self.linesep))

    if __name__ == "__main__":
        import sys
        import textwrap

        pkg_info = """\
    Metadata-Version: 1.3
    Name: package
    Version: 0.1.0
    Summary: A package.
    Description: Description
        ===========

        A description of the package.

    """

        m = Parser(policy=MetadataPolicy()).parsestr(pkg_info)

        m['License'] = 'GPL'
        description = m['Description']
        description_lines = description.splitlines()
        m.set_payload(description_lines[0]
                + '\n'
                + textwrap.dedent('\n'.join(description_lines[1:]))
                + '\n')
        del m['Description']

        # Correct if sys.stdout.encoding == 'UTF-8':
        Generator(sys.stdout, maxheaderlen=0).flatten(m)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
