[Python-checkins] peps: PEP 426 updates

nick.coghlan python-checkins at python.org
Sun Feb 17 09:15:08 CET 2013


http://hg.python.org/peps/rev/53537d0808d1
changeset:   4746:53537d0808d1
user:        Nick Coghlan <ncoghlan at gmail.com>
date:        Sun Feb 17 18:14:42 2013 +1000
summary:
  PEP 426 updates

files:
  pep-0426.txt        |  267 +++++++++++++++++++++++++------
  pep-0426/pepsort.py |  233 +++++++++++++++++++++++++++
  2 files changed, 442 insertions(+), 58 deletions(-)


diff --git a/pep-0426.txt b/pep-0426.txt
--- a/pep-0426.txt
+++ b/pep-0426.txt
@@ -46,13 +46,17 @@
 ``email.policy.Policy()``.  When ``metadata`` is a Unicode string,
 ``email.parser.Parser().parsestr(metadata)`` is a serviceable parser.
 
-There are two standard locations for these metadata files:
+There are three standard locations for these metadata files:
 
 * the ``PKG-INFO`` file included in the base directory of Python
   source distribution archives (as created by the distutils ``sdist``
   command)
-* the ``.dist-info/METADATA`` files in a Python installation database, as
-  described in PEP 376.
+* the ``{distribution}-{version}.dist-info/METADATA`` file in a ``wheel``
+  binary distribution archive (as described in PEP 427, or a later version
+  of that specification)
+* the ``{distribution}-{version}.dist-info/METADATA`` files in a local
+  Python installation database (as described in PEP 376, or a later version
+  of that specification)
 
 Other tools involved in Python distribution may also use this format.
 
@@ -102,8 +106,9 @@
 Version
 -------
 
-A string containing the distribution's version identifier. See `Version scheme`_
-below.
+The distribution's public version identifier. Public versions are designed
+for consumption by automated tools and are strictly ordered according
+to a defined scheme. See `Version scheme`_ below.
 
 Example::
 
@@ -120,6 +125,21 @@
     Summary: A module for collecting votes from beagles.
 
 
+Private-Version (optional)
+--------------------------
+
+An arbitrary private version label. Private version labels are intended
+for internal use by a project, and cannot be used in version specifiers.
+See `Compatibility with other version schemes`_ below.
+
+Examples::
+
+    Private-Version: 1.0.0-alpha.1
+    Private-Version: 1.3.7+build.11.e0f985a
+    Private-Version: v1.8.1.301.ga0df26f
+    Private-Version: 2013.02.17.dev123
+
+
 Description (optional, deprecated)
 ----------------------------------
 
@@ -263,6 +283,8 @@
 Each entry is a string giving a single classification value
 for the distribution.  Classifiers are described in PEP 301 [2].
 
+`Environment markers`_ may be used with this field.
+
 Examples::
 
     Classifier: Development Status :: 4 - Beta
@@ -299,6 +321,8 @@
 in `Version scheme`_. The distribution's version identifier will be implied
 if none is specified.
 
+`Environment markers`_ may be used with this field.
+
 Examples::
 
     Provides-Dist: ThisProject
@@ -360,6 +384,8 @@
 Package Index`_; often the same as, but distinct from, the module names
 as accessed with ``import x``.
 
+`Environment markers`_ may be used with this field.
+
 Version declarations must follow the rules described in
 `Version specifiers`_
 
@@ -404,6 +430,8 @@
 This field specifies the Python version(s) that the distribution is
 guaranteed to be compatible with.
 
+`Environment markers`_ may be used with this field.
+
 Version declarations must be in the format specified in
 `Version specifiers`_.
 
@@ -439,6 +467,8 @@
 dependency, optionally followed by a version declaration within
 parentheses.
 
+`Environment markers`_ may be used with this field.
+
 Because they refer to non-Python software releases, version identifiers
 for this field are **not** required to conform to the format
 described in `Version scheme`_:  they should correspond to the
@@ -542,12 +572,14 @@
 Version scheme
 ==============
 
-Version identifiers must comply with the following scheme::
+Public version identifiers must comply with the following scheme::
 
     N[.N]+[{a|b|c|rc}N][.postN][.devN]
 
 Version identifiers which do not comply with this scheme are an error.
 
+Version identifiers must not include leading or trailing whitespace.
+
 Any given version will be a "release", "pre-release", "post-release" or
 "developmental release" as defined in the following sections.
 
@@ -576,6 +608,12 @@
 in turn, with "component does not exist" sorted ahead of all numeric
 values.
 
+Date based release numbers are explicitly excluded from compatibility with
+this scheme, as they hinder automatic translation to other versioning
+schemes, as well as preventing the adoption of semantic versioning without
+changing the name of the project. Accordingly, a leading release component
+greater than or equal to ``1980`` is an error.
+
 While any number of additional components after the first are permitted
 under this scheme, the most common variants are to use two components
 ("major.minor") or three components ("major.minor.micro").
@@ -612,37 +650,6 @@
    above shows both styles, always including the ``.0`` at the second
    level and consistently omitting it at the third level.
 
-.. note::
-
-   While date based release numbers, using the forms ``year.month`` or
-   ``year.month.day``, are technically compliant with this scheme, their use
-   is strongly discouraged as they can hinder automatic translation to
-   other versioning schemes. In particular, they are completely
-   incompatible with semantic versioning.
-
-
-Semantic versioning
--------------------
-
-`Semantic versioning`_ is a popular version identification scheme that is
-more prescriptive than this PEP regarding the significance of different
-elements of a release number. Even if a project chooses not to abide by
-the details of semantic versioning, the scheme is worth understanding as
-it covers many of the issues that can arise when depending on other
-distributions, and when publishing a distribution that others rely on.
-
-The "Major.Minor.Patch" (described in this PEP as "major.minor.micro")
-aspects of semantic versioning (clauses 1-9 in the 2.0.0-rc-1 specification)
-are fully compatible with the version scheme defined in this PEP, and abiding
-by these aspects is encouraged.
-
-Semantic versions containing a hyphen (pre-releases - clause 10) or a
-plus sign (builds - clause 11) are *not* compatible with this PEP
-and are not permitted in compliant metadata. Use this PEP's deliberately
-more restricted pre-release and developmental release notation instead.
-
-.. _Semantic versioning: http://semver.org/
-
 
 Pre-releases
 ------------
@@ -898,6 +905,70 @@
 should be used in preference to the one defined in PEP 386.
 
 
+Compatibility with other version schemes
+----------------------------------------
+
+Some projects may choose to use a version scheme which requires
+translation in order to comply with the public version scheme defined in
+this PEP. In such cases, the `Private-Version`__ field can be used to
+record the project specific version as an arbitrary label, while the
+translated public version is given in the `Version`_ field.
+
+__ `Private-Version (optional)`_
+
+This allows automated distribution tools to provide consistently correct
+ordering of published releases, while still allowing developers to use
+the internal versioning scheme they prefer for their projects.
+
+
+Semantic versioning
+~~~~~~~~~~~~~~~~~~~
+
+`Semantic versioning`_ is a popular version identification scheme that is
+more prescriptive than this PEP regarding the significance of different
+elements of a release number. Even if a project chooses not to abide by
+the details of semantic versioning, the scheme is worth understanding as
+it covers many of the issues that can arise when depending on other
+distributions, and when publishing a distribution that others rely on.
+
+The "Major.Minor.Patch" (described in this PEP as "major.minor.micro")
+aspects of semantic versioning (clauses 1-9 in the 2.0.0-rc-1 specification)
+are fully compatible with the version scheme defined in this PEP, and abiding
+by these aspects is encouraged.
+
+Semantic versions containing a hyphen (pre-releases - clause 10) or a
+plus sign (builds - clause 11) are *not* compatible with this PEP
+and are not permitted in the public `Version`_ field.
+
+One possible mechanism to translate such private semantic versions to
+compatible public versions is to use the ``.devN`` suffix to specify the
+appropriate version order.
+
+.. _Semantic versioning: http://semver.org/
+
+
+DVCS based version labels
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many build tools integrate with distributed version control systems like
+Git and Mercurial in order to add an identifying hash to the version
+identifier. As hashes cannot be ordered reliably such versions are not
+permitted in the public `Version`_ field.
+
+As with semantic versioning, the public ``.devN`` suffix may be used to
+uniquely identify such releases for publication, while the private
+version field is used to record the original version label.
+
+
+Date based versions
+~~~~~~~~~~~~~~~~~~~
+
+As with other incompatible version schemes, date based versions can be
+stored in the ``Private-Version`` field. Translating them to a compliant
+version is straightforward: the simplest approach is to subtract the year
+of the first release from the major component in the release number.
+
+
 Version specifiers
 ==================
 
@@ -1043,31 +1114,33 @@
 
 The pseudo-grammar is ::
 
-    EXPR [in|==|!=|not in] EXPR [or|and] ...
+    MARKER: EXPR [(and|or) EXPR]*
+    EXPR: ("(" MARKER ")") | (SUBEXPR [(in|==|!=|not in) SUBEXPR])
 
-where ``EXPR`` belongs to any of these:
+where ``SUBEXPR`` belongs to any of the following (the details after the
+colon in each entry define the value represented by that subexpression):
 
-- python_version = '%s.%s' % (sys.version_info[0], sys.version_info[1])
-- python_full_version = sys.version.split()[0]
-- os.name = os.name
-- sys.platform = sys.platform
-- platform.version = platform.version()
-- platform.machine = platform.machine()
-- platform.python_implementation = platform.python_implementation()
-- a free string, like ``'2.4'``, or ``'win32'``
-- extra = (name of requested feature) or None
+* ``python_version``: '%s.%s' % (sys.version_info[0], sys.version_info[1])
+* ``python_full_version``: sys.version.split()[0]
+* ``os.name``: os.name
+* ``sys.platform``: sys.platform
+* ``platform.version``: platform.version()
+* ``platform.machine``: platform.machine()
+* ``platform.python_implementation``: platform.python_implementation()
+* ``extra``: (name of requested feature) or None
+* ``'text'``: a free string, like ``'2.4'``, or ``'win32'``
 
-Notice that ``in`` is restricted to strings, meaning that it is not possible
-to use other sequences like tuples or lists on the right side.
+Notice that ``in`` and ``not in`` are restricted to strings, meaning that it
+is not possible to use other sequences like tuples or lists on the right
+side.
 
 The fields that benefit from this marker are:
 
-- ``Requires-Python``
-- ``Requires-External``
-- ``Requires-Dist``
-- ``Setup-Requires-Dist``
-- ``Provides-Dist``
-- ``Classifier``
+* ``Requires-Python``
+* ``Requires-External``
+* ``Requires-Dist``
+* ``Provides-Dist``
+* ``Classifier``
 
 
 Optional features
@@ -1133,11 +1206,24 @@
 
 * Values are now expected to be UTF-8
 
-* Changed the version scheme (eliminating the dependency on PEP 386)
+* Changed the version scheme
+
+  * added the new ``Private-Version`` field
+  * changed the top level sort position of the ``.devN`` suffix
+  * allowed single value version numbers
+  * explicit exclusion of leading or trailing whitespace
+  * explicit criterion for the exclusion of date based versions
+  * incorporated the version scheme directly into the PEP
 
 * Changed interpretation of version specifiers
 
-* Explicit handling of ordering and dependencies across metadata versions
+  * implicitly exclude pre-releases unless explicitly requested
+  * treat post releases the same way as unqualified releases
+
+* Discuss ordering and dependencies across metadata versions
+
+* Clarify use of parentheses for grouping in environment marker
+  pseudo-grammar
 
 * Support for packaging, build and installation dependencies
 
@@ -1188,6 +1274,13 @@
 Changing the version scheme
 ---------------------------
 
+The new ``Private-Version`` field is intended to make it clearer that the
+constraints on public version identifiers are there primarily to aid in
+the creation of reliable automated dependency analysis tools. Projects
+are free to use whatever versioning scheme they like internally, so long
+as they are able to translate it to something the dependency analysis tools
+will understand.
+
 The key change in the version scheme in this PEP relative to that in
 PEP 386 is to sort top level developmental releases like ``X.Y.devN`` ahead
 of alpha releases like ``X.Ya1``. This is a far more logical sort order, as
@@ -1214,12 +1307,68 @@
 version specifiers and release numbers, rather than splitting the
 two definitions.
 
+The exclusion of leading and trailing whitespace was made explicit after
+a couple of projects with version identifiers differing only in a
+trailing ``\n`` character were found on PyPI.
+
+The exclusion of major release numbers that look like dates was implied
+by the overall text of PEP 386, but not clear in the definition of the
+version scheme. This exclusion has been made clear in the definition of
+the release component.
+
 Finally, as the version scheme in use is dependent on the metadata
 version, it was deemed simpler to merge the scheme definition directly into
 this PEP rather than continuing to maintain it as a separate PEP. This will
 also allow all of the distutils-specific elements of PEP 386 to finally be
 formally rejected.
 
+The following statistics provide an analysis of the compatibility of existing
+projects on PyPI with the specified versioning scheme (as of 16th February,
+2013).
+
+* Total number of distributions analysed: 28088
+* Distributions with no releases: 248 / 28088 (0.88 %)
+
+* Fully compatible distributions: 24142 / 28088 (85.95 %)
+* Compatible distributions after translation: 2830 / 28088 (10.08 %)
+* Compatible distributions after filtering: 511 / 28088 (1.82 %)
+* Distributions sorted differently after translation: 38 / 28088 (0.14 %)
+* Distributions sorted differently without translation: 2 / 28088 (0.01 %)
+* Distributions with no compatible releases: 317 / 28088 (1.13 %)
+
+The two remaining sort order discrepancies picked up by the analysis are due
+to a pair of projects which have published releases ending with a carriage
+return, alongside releases with the same version number, only *without* the
+trailing carriage return.
+
+The sorting discrepancies after translation relate mainly to differences
+in the handling of pre-releases where the standard mechanism is considered
+to be an improvement. For example, the existing pkg_resources scheme will
+sort "1.1beta1" *after* "1.1b2", whereas the suggested standard translation
+for "1.1beta1" is "1.1b1", which sorts *before* "1.1b2". Similarly, the
+pkg_resources scheme will sort "-dev-N" pre-releases differently from
+"devN" releases when they occur within the same release, while the
+standard scheme will normalize both representations to ".devN" and sort
+them by the numeric component.
+
+For comparison, here are the corresponding analysis results for PEP 386:
+
+* Fully compatible distributions: 23874 / 28088 (85.00 %)
+* Compatible distributions after translation: 2786 / 28088 (9.92 %)
+* Compatible distributions after filtering: 527 / 28088 (1.88 %)
+* Distributions sorted differently after translation: 96 / 28088 (0.34 %)
+* Distributions sorted differently without translation: 14 / 28088 (0.05 %)
+* Distributions with no compatible releases: 543 / 28088 (1.93 %)
+
+These figures make it clear that only a relatively small number of current
+projects are affected by these changes. However, some of the affected
+projects are in widespread use (such as Pinax and selenium). The
+changes also serve to bring the standard scheme more into line with
+developers' expectations, which is an important element in encouraging
+adoption of the new metadata version.
+
+The script used for the above analysis is available at [3]_.
+
 
 A more opinionated description of the versioning scheme
 -------------------------------------------------------
@@ -1357,6 +1506,8 @@
 .. [2] PEP 301:
    http://www.python.org/dev/peps/pep-0301/
 
+.. [3] Version compatibility analysis script
+   http://hg.python.org/peps/file/default/pep-0426/pepsort.py
 
 Appendix
 ========
diff --git a/pep-0426/pepsort.py b/pep-0426/pepsort.py
new file mode 100755
--- /dev/null
+++ b/pep-0426/pepsort.py
@@ -0,0 +1,233 @@
+#!/usr/bin/env python3
+
+# Distribution sorting comparisons
+#   between pkg_resources, PEP 386 and PEP 426
+#
+# Requires distlib, original script written by Vinay Sajip
+
+import logging
+import re
+import sys
+import json
+import errno
+import time
+
+from distlib.compat import xmlrpclib
+from distlib.version import suggest_normalized_version, legacy_key, normalized_key
+
+logger = logging.getLogger(__name__)
+
+PEP426_VERSION_RE = re.compile(r'^(\d+(\.\d+)*)((a|b|c|rc)(\d+))?'
+                               r'(\.(post)(\d+))?(\.(dev)(\d+))?$')
+
+def pep426_key(s):
+    s = s.strip()
+    m = PEP426_VERSION_RE.match(s)
+    if not m:
+        raise ValueError('Not a valid version: %s' % s)
+    groups = m.groups()
+    nums = tuple(int(v) for v in groups[0].split('.'))
+    while len(nums) > 1 and nums[-1] == 0:
+        nums = nums[:-1]
+
+    pre = groups[3:5]
+    post = groups[6:8]
+    dev = groups[9:11]
+    if pre == (None, None):
+        pre = ()
+    else:
+        pre = pre[0], int(pre[1])
+    if post == (None, None):
+        post = ()
+    else:
+        post = post[0], int(post[1])
+    if dev == (None, None):
+        dev = ()
+    else:
+        dev = dev[0], int(dev[1])
+    if not pre:
+        # either before pre-release, or final release and after
+        if not post and dev:
+            # before pre-release
+            pre = ('a', -1) # to sort before a0
+        else:
+            pre = ('z',)    # to sort after all pre-releases
+    # now look at the state of post and dev.
+    if not post:
+        post = ('a',)
+    if not dev:
+        dev = ('final',)
+
+    return nums, pre, post, dev
+
+def cache_projects(cache_name):
+    logger.info("Retrieving package data from PyPI")
+    client = xmlrpclib.ServerProxy('http://python.org/pypi')
+    projects = dict.fromkeys(client.list_packages())
+    failed = []
+    for pname in projects:
+        time.sleep(0.1)
+        logger.debug("Retrieving versions for %s", pname)
+        try:
+            projects[pname] = list(client.package_releases(pname, True))
+        except Exception:
+            failed.append(pname)
+    logger.warning("Error retrieving versions for %s", failed)
+    with open(cache_name, 'w') as f:
+        json.dump(projects, f, sort_keys=True,
+                  indent=2, separators=(',', ': '))
+    return projects
+
+def get_projects(cache_name):
+    try:
+        f = open(cache_name)
+    except IOError as exc:
+        if exc.errno != errno.ENOENT:
+            raise
+        projects = cache_projects(cache_name)
+    else:
+        with f:
+            projects = json.load(f)
+    return projects
+
+
+VERSION_CACHE = "pepsort_cache.json"
+
+class Category(set):
+
+    def __init__(self, title, num_projects):
+        super().__init__()
+        self.title = title
+        self.num_projects = num_projects
+
+    def __str__(self):
+        num_projects = self.num_projects
+        num_in_category = len(self)
+        pct = (100.0 * num_in_category) / num_projects
+        return "{}: {:d} / {:d} ({:.2f} %)".format(
+                    self.title, num_in_category, num_projects, pct)
+
+SORT_KEYS = {
+    "386": normalized_key,
+    "426": pep426_key,
+}
+
+def main(pepno = '426'):
+    sort_key = SORT_KEYS[pepno]
+    print('Comparing PEP %s version sort to setuptools.' % pepno)
+
+    projects = get_projects(VERSION_CACHE)
+    num_projects = len(projects)
+
+    null_projects = Category("No releases", num_projects)
+    compatible_projects = Category("Compatible", num_projects)
+    translated_projects = Category("Compatible with translation", num_projects)
+    filtered_projects = Category("Compatible with filtering", num_projects)
+    sort_error_translated_projects = Category("Translations sort differently", num_projects)
+    sort_error_compatible_projects = Category("Incompatible due to sorting errors", num_projects)
+    incompatible_projects = Category("Incompatible", num_projects)
+
+    categories = [
+        null_projects,
+        compatible_projects,
+        translated_projects,
+        filtered_projects,
+        sort_error_translated_projects,
+        sort_error_compatible_projects,
+        incompatible_projects,
+    ]
+
+    sort_failures = 0
+    for i, (pname, versions) in enumerate(projects.items()):
+        if i % 100 == 0:
+            sys.stderr.write('%s / %s\r' % (i, num_projects))
+            sys.stderr.flush()
+        if not versions:
+            logger.debug('%-15.15s has no releases', pname)
+            null_projects.add(pname)
+            continue
+        # list_legacy and list_pep will contain 2-tuples
+        # comprising a sortable representation according to either
+        # the setuptools (legacy) algorithm or the PEP algorithm.
+        # followed by the original version string
+        list_legacy = [(legacy_key(v), v) for v in versions]
+        # Go through the PEP 386/426 stuff one by one, since
+        # we might get failures
+        list_pep = []
+        excluded_versions = set()
+        translated_versions = set()
+        for v in versions:
+            try:
+                k = sort_key(v)
+            except Exception:
+                s = suggest_normalized_version(v)
+                if not s:
+                    good = False
+                    logger.debug('%-15.15s failed for %r, no suggestions', pname, v)
+                    excluded_versions.add(v)
+                    continue
+                else:
+                    try:
+                        k = sort_key(s)
+                    except ValueError:
+                        logger.error('%-15.15s failed for %r, with suggestion %r',
+                                     pname, v, s)
+                        excluded_versions.add(v)
+                        continue
+                logger.debug('%-15.15s translated %r to %r', pname, v, s)
+                translated_versions.add(v)
+            list_pep.append((k, v))
+        if not list_pep:
+            logger.debug('%-15.15s has no compatible releases', pname)
+            incompatible_projects.add(pname)
+            continue
+        # Now check the versions sort as expected
+        if excluded_versions:
+            list_legacy = [(k, v) for k, v in list_legacy
+                                              if v not in excluded_versions]
+        assert len(list_legacy) == len(list_pep)
+        sorted_legacy = sorted(list_legacy)
+        sorted_pep = sorted(list_pep)
+        sv_legacy = [t[1] for t in sorted_legacy]
+        sv_pep = [t[1] for t in sorted_pep]
+        if sv_legacy != sv_pep:
+            if translated_versions:
+                logger.debug('%-15.15s translation creates sort differences', pname)
+                sort_error_translated_projects.add(pname)
+            else:
+                logger.debug('%-15.15s incompatible due to sort errors', pname)
+                sort_error_compatible_projects.add(pname)
+            logger.debug('%-15.15s unequal: legacy: %s', pname, sv_legacy)
+            logger.debug('%-15.15s unequal: pep%s: %s', pname, pepno, sv_pep)
+            continue
+        # The project is compatible to some degree,
+        if excluded_versions:
+            logger.debug('%-15.15s has some compatible releases', pname)
+            filtered_projects.add(pname)
+            continue
+        if translated_versions:
+            logger.debug('%-15.15s is compatible after translation', pname)
+            translated_projects.add(pname)
+            continue
+        logger.debug('%-15.15s is fully compatible', pname)
+        compatible_projects.add(pname)
+
+    for category in categories:
+        print(category)
+
+    # Uncomment the line below to explore differences in details
+    # import pdb; pdb.set_trace()
+    # Grepping the log files is also informative
+    # e.g. "grep unequal pep426sort.log" for the PEP 426 sort differences
+
+if __name__ == '__main__':
+    if len(sys.argv) > 1 and sys.argv[1] == '386':
+        pepno = '386'
+    else:
+        pepno = '426'
+    logname = 'pep{}sort.log'.format(pepno)
+    logging.basicConfig(level=logging.DEBUG, filename=logname,
+                        filemode='w', format='%(message)s')
+    logger.setLevel(logging.DEBUG)
+    main(pepno)
+
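
The public version rules added to the PEP text in this changeset (the
scheme pattern, no leading or trailing whitespace, and a leading release
component below 1980) can be sketched as a small standalone check. This is
only an illustration and is not part of the commit: ``is_public_version``
is a hypothetical helper name, and the pattern is copied from
``PEP426_VERSION_RE`` in ``pepsort.py`` with the ``^``/``$`` anchors
replaced by ``re.fullmatch``, so that a trailing newline is rejected.

```python
import re

# Pattern taken from PEP426_VERSION_RE in pepsort.py, minus the anchors.
# Like the script, it accepts single-component releases such as "1".
_PUBLIC_VERSION = re.compile(
    r'(\d+(\.\d+)*)((a|b|c|rc)(\d+))?(\.(post)(\d+))?(\.(dev)(\d+))?'
)

# Assumption: the PEP text treats a leading component >= 1980 as a date.
_FIRST_DATE_LIKE_YEAR = 1980


def is_public_version(text):
    """Hypothetical helper: check text against the public version scheme."""
    # fullmatch() requires the whole string to match, so surrounding
    # whitespace (including a trailing "\n") makes the check fail.
    match = _PUBLIC_VERSION.fullmatch(text)
    if match is None:
        return False
    leading = int(match.group(1).split('.')[0])
    return leading < _FIRST_DATE_LIKE_YEAR


if __name__ == '__main__':
    for candidate in ['1.0', '1.0a1', '1.0.post2.dev3',
                      '1.0-alpha.1', '2013.02.17.dev123', '1.0\n']:
        print(repr(candidate), is_public_version(candidate))
```

Under these assumptions the first three candidates are accepted, while the
hyphenated, date based and newline-terminated ones are rejected; labels of
that kind are what the new ``Private-Version`` field is meant to carry.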

-- 
Repository URL: http://hg.python.org/peps

