[Distutils] Version Spec compatible with PEP 0440 as well as SemVer

Sat Nov 14 22:46:22 EST 2015

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

# Version Spec compatible with PEP 0440 as well as SemVer

## Short Version/TL;DR

I want to know if the following versions would abide by SemVer AND PEP
0440 as I would like both to be valid for my projects (order matters).

The idea is to create a formal specification based on the Semantic
Versioning Specification that is fully compatible with both Semantic
Versioning and PEP 0440.

As such, I'd like to know what you think.

### Examples:
    0.0.0       # Before the first public release, local iterations and
                  commits to the repository when a public API is still
                  not present.
    0.1.0       # First public release, since major = 0, public API is
                  expected to be unstable
    0.1.1       # First patch applied to the first public release
    0.2.0
    1.0.0-a0    # Alpha pre-release of version 1 (first stable release)
    1.0.0-a1
    1.0.0-b0
    1.0.0-rc0   # First release candidate
    1.0.0-rc1
    1.0.0
    1.0.1
    1.0.2

## Long version

### Intention

I got very interested in versioning and format of public version
identifiers a couple of days ago, and was studying different formats
used by projects. I really liked the two "big" propositions I found,
which are Semantic Versioning from semver.org (or "SemVer" from now on)
and PEP 0440 (or "the PEP" from now on).

I'm aware of the existence of forks to SemVer that "fix" it to be
compatible with the PEP, e.g. OpenStack Foundation PBR's "Linux/Python
Compatible Semantic Versioning 3.0.0" [1]. I'm also aware of discussions
on mailing lists about "fixing" the PEP, e.g. "module version
number support for semver.org" in python-ideas [2].

I have no intention of fixing either. Rather, I want a version
specification for the projects I lead and for my personal projects that
is compatible with both. By "compatible with both" I mean a simple spec
that still provides the determinism in downloads/upgrades introduced by
the PEP, even if that means forgoing the "versatility" of some suffixes
and identifiers for the sake of simplicity. It should also provide a
meaning and rationale for each element of a version number/release (the
"sem" in SemVer).

Keywords are "determinism", "simplicity"; and hopefully not
"bikeshedding", nor "trolling", nor "flame-baiting".

### Compatible bits

I quote the section of the PEP that talks about compatibility with
SemVer [3]:

> The "Major.Minor.Patch" (described in this PEP as "major.minor.micro")
> aspects of semantic versioning (clauses 1-9 in the 2.0.0-rc-1
> specification) are fully compatible with the version scheme defined in
> this PEP, and abiding by these aspects is encouraged.
> 
> Semantic versions containing a hyphen (pre-releases - clause 10) or a
> plus sign (builds - clause 11) are not compatible with this PEP and
> are not permitted in the public version field.
> 
> One possible mechanism to translate such semantic versioning based
> source labels to compatible public versions is to use the .devN suffix
> to specify the appropriate version order.

So I know the first four versions in my example above are compatible
with both, as are the final releases at the end (1.0.0, 1.0.1, and
1.0.2). The question arises when reading the above quote after having
studied what the PEP prescribes in the sections "Pre-releases" [4] and
"Pre-release separators" [5] (under "Normalization").

### Pre-releases

The PEP allows a pre-release segment to be separated by a dash from the
release component, which Semantic Versioning describes as
"major.minor.patch" and the PEP as "major.minor.micro". So that part of
clause 10 of SemVer 2.0.0-rc-1 [6] (quoted below, as this was the
version used when writing the PEP) isn't the problem. The problem is
that SemVer allows the pre-release identifier to consist of multiple
dot-separated "sections" and to use many more combinations of
characters.

> 10. A pre-release version MAY be denoted by appending a dash and a
>     series of dot separated identifiers immediately following the
>     patch version. Identifiers MUST be comprised of only ASCII
>     alphanumerics and dash [0-9A-Za-z-]. Pre-release versions satisfy
>     but have a lower precedence than the associated normal version.
>     Examples: 1.0.0-alpha, 1.0.0-alpha.1, 1.0.0-0.3.7, 1.0.0-x.7.z.92.

Question #1:
:   If the pre-release identifier is restricted to one of "a", "b", or
    "rc" followed by a single digit (0-9), and to only one section, as
    shown in the examples I presented, would it still be incompatible
    with the PEP after being normalized?

Question #2:
:   The section of the PEP quoted above says that pre-release versions
    aren't permitted in the "public version field". What does this
    mean?

    I see that there are versions of software distributed using PyPI
    that have pre-release identifiers, e.g. Sphinx==1.3b3 [7]

    Semantically, a version "1.3b3" is the same as "1.3.0-b3" (an
    identifier compliant with SemVer, and with the PEP after being
    normalized to "1.3.0b3").

    But when it says "public version field", does the PEP prescribe
    that I "SHOULD NOT" use such an identifier when publishing? Or was
    the intention that it not be part of the "final release" identifier
    (since it is, after all, a pre-release identifier)?

When I wrote the above examples, the spirit was to abide by SemVer (all
clauses) while still being compatible with PEP 0440.
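
As a rough illustration of Question #1, here is a minimal Python sketch
of the restricted scheme. The names `RESTRICTED`, `is_restricted`, and
`to_pep440` are hypothetical helpers of mine, not part of either
specification, and the normalization step assumes that dropping the
hyphen is all the PEP's "Pre-release separators" rule requires for
these particular versions.

```python
import re

# Hypothetical pattern for the restricted scheme: MAJOR.MINOR.PATCH with
# an optional "-a", "-b", or "-rc" pre-release followed by one digit.
RESTRICTED = re.compile(
    r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(?:-(a|b|rc)([0-9]))?"
)


def is_restricted(version):
    """Return True if the whole string fits the restricted scheme."""
    return RESTRICTED.fullmatch(version) is not None


def to_pep440(version):
    """Drop the hyphen before the pre-release segment (assumed to be
    the only normalization these versions need under the PEP)."""
    match = RESTRICTED.fullmatch(version)
    if match is None:
        raise ValueError("not in the restricted scheme: %r" % version)
    major, minor, patch, stage, number = match.groups()
    release = "%s.%s.%s" % (major, minor, patch)
    if stage is None:
        return release
    return "%s%s%s" % (release, stage, number)
```

Under these assumptions `to_pep440("1.0.0-rc1")` gives `"1.0.0rc1"`,
while `"1.0.0-alpha.1"` and `"1.0.0-a10"` are rejected because they use
more than one section or more than one digit.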

The numeric component is limited to [0-9] to avoid sorting problems
caused by a subtle difference in how these identifiers are ordered in
SemVer vs. the PEP. Since my examples mix letters and digits in this
identifier, SemVer prescribes lexical ordering in ASCII sort order,
while the PEP compares versions in the same pre-release stage in
numerical order. I.e. in SemVer "a1" < "a11" < "a2", whereas in the PEP
"a1" < "a2" < "a11"; SemVer expects components to be separated by dots
if they have different meanings.
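
The ordering difference can be sketched in a few lines of Python. The
two key functions are my own simplified stand-ins for the comparison
rules of each document (they only handle identifiers of the form used
in my examples), not implementations taken from either specification.

```python
import re


def semver_key(identifier):
    # SemVer: an identifier containing letters is compared as ASCII text.
    return identifier


def pep440_key(identifier):
    # The PEP: compare the stage first, then the number as an integer.
    stage, number = re.fullmatch(r"(a|b|rc)([0-9]+)", identifier).groups()
    return (("a", "b", "rc").index(stage), int(number))


multi_digit = ["a1", "a2", "a11"]
semver_order = sorted(multi_digit, key=semver_key)   # a11 lands before a2
pep440_order = sorted(multi_digit, key=pep440_key)   # a11 lands after a2

# With a single digit per stage, both rules give the same order.
single_digit = ["rc1", "a0", "b0", "a1", "rc0"]
same_order = (sorted(single_digit, key=semver_key)
              == sorted(single_digit, key=pep440_key))
```

With more than one digit the two orders disagree, which is the reason
for capping each stage at ten pre-releases.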

### Aside.

This clause of SemVer was updated between version 2.0.0-rc-1, as used
in the PEP, and version 2.0.0 (current). The update specified that the
sections separated by dots must not be empty and that numeric sections
must not include leading zeroes, and added a small blurb on the meaning
of this identifier. The clauses were also renumbered because an earlier
one was deleted. These changes are irrelevant for this discussion. See
[8].

It's also important to note that I consider changes to documentation
between minor releases a patch (SemVer) or micro (the PEP) release, and
this number is updated when the code is "published" (using the
definition in PEP 0426 [9]). This mechanism could fit into PEP 0440's
concept of a "post-release", but I'm choosing to forgo this concept to
be able to
stick to SemVer and not alter sorting order by using '-post#', '-r#', or
'-rev#', which would force me to use '-c#' instead of '-rc#'. But it is
possible and I might consider it in the future.

Code pushed to the repository or merged from a pull request doesn't fit
into this definition of publication (the one used in PEP 0426). I do,
however, consider the following to be part of this definition: tagging
a release in the version control repository, uploading to index
servers, and in doing so providing the software to what PEP 0426
defines as "software integrators".

As such, if after releasing a sample version "3.1.2" I were to push any
number of commits and/or merge pull requests, that code (or any other
file change) is not considered published under the definition above,
nor is it part of 3.1.2. Code is considered published, and part of a
version, when the version number in the repository is updated as part
of the development cycle. This update entails updating the version
number in the code, and updating the changelog in the repository with
the new version number and all changes added since the last version was
published.

The change in version number and update to changelog should happen in
the same commit, and this commit should not contain any other changes.
This commit is then tagged with the version id prefixed with the
character 'v', and the new version is then uploaded to package indexes
and sent to integrators.

As such, and with pre-release versions limited to only 10 per stage
(0-9), it is advised that the documentation be complete before the
alpha pre-release, and that only bugfixes be submitted (these may also
be in the documentation) during a pre-release stage. The code and
documentation are to be considered "frozen" for those pre-releases;
changes should not be taken lightly and version bumps should not be
done trivially.

End aside.

### Local versions / Build metadata

In relation to what I call "+identifier"s, both the PEP and SemVer
allow a section after the "public version identifier" (the PEP) or
"normal version" (SemVer), which they call the "local version
identifier" and "build metadata" respectively. In both documents this
section follows a separator, specifically the character "+" (plus
sign).

There are two differences, however.

The first difference is syntactical. SemVer allows letters, digits, and
'-' (hyphen) characters in subsections separated by '.' (dot), per
clause 10 of version 2.0.0 of the specification [8] (please don't
confuse this clause with the earlier #10, which was from rc1 of the
specification), while the PEP doesn't allow '-' (hyphen) in the
segments, as it calls them.

The second difference is a semantic one, with an added "kicker". In
SemVer, this section has no restrictions and is considered part of the
public identifier. In contrast, it is prescribed in the PEP that this
section of the identifier SHOULD NOT be used when publishing, and MAY be
used to denote local versions built from source, and SHOULD be used by
downstream projects when releasing a version of the upstream project
(see the section of the PEP on local version identifiers [10]). Also,
please note that the PEP has specific meanings for "MAY", "SHOULD", and
"SHOULD NOT". The "kicker" I mention is that in SemVer "build metadata"
identifiers are not considered when determining precedence, while in
the PEP local versions are.

In more human terms, according to the PEP, my project should not use
the "+identifier" for its own releases; downstream projects can when
they release a new version of my project that is compatible as defined
in both SemVer and the PEP. As long as those downstream projects don't
use '-' (hyphen), the versions they release would still abide by both
SemVer and the PEP.
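
To make the "+identifier" rules concrete, here is a small Python sketch
under my own reading of both documents: a section made only of
dot-separated, non-empty runs of ASCII letters and digits should be
acceptable to both. The names `DUAL_LOCAL`, `split_local`,
`is_dual_compatible`, and `semver_same_precedence` are hypothetical.

```python
import re

# Assumed common subset: dot-separated, non-empty alphanumeric segments
# (no hyphen, since the PEP does not allow it inside a segment).
DUAL_LOCAL = re.compile(r"[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*")


def split_local(version):
    """Split "X+Y" into ("X", "Y"); the second item is None without "+"."""
    public, separator, local = version.partition("+")
    return public, (local if separator else None)


def is_dual_compatible(version):
    """Check only the "+identifier" part, not the public version."""
    public, local = split_local(version)
    return local is None or DUAL_LOCAL.fullmatch(local) is not None


def semver_same_precedence(first, second):
    """SemVer ignores build metadata when determining precedence, so two
    versions that differ only after the "+" rank the same."""
    return split_local(first)[0] == split_local(second)[0]
```

For instance, `"1.0.0+downstream.1"` passes the check while
`"1.0.0+build-7"` does not. Under SemVer `"1.0.0+a"` and `"1.0.0+b"`
have the same precedence, whereas the PEP would order them.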

[1]:  http://docs.openstack.org/developer/pbr/semver.html
[2]:  http://grokbase.com/t/python/python-ideas/144e5x67tq/module-version-number-support-for-semver-org
[3]:  https://www.python.org/dev/peps/pep-0440/#semantic-versioning
[4]:  https://www.python.org/dev/peps/pep-0440/#pre-releases
[5]:  https://www.python.org/dev/peps/pep-0440/#pre-release-separators
[6]:  http://semver.org/spec/v2.0.0-rc.1.html
[7]:  https://pypi.python.org/pypi/Sphinx/1.3b3
[8]:  http://semver.org/spec/v2.0.0.html
[9]:  https://www.python.org/dev/peps/pep-0426/
[10]: https://www.python.org/dev/peps/pep-0440/#local-version-identifiers

- -- 
Jamiel Almeida <jamiel.almeida at gmail.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.29 (Darwin)

iQIcBAEBCgAGBQJWSAAOAAoJEHY8IXipLI+0uUwP/RLnCk62QAjBpmMsC/O5lk3c
FqL5fD5ZaDjX5paxy3aoKU83UY5D5KTWiMeBbpZ2ae6nf00UP5hKEjoi3eioQn6Q
I8ELE1vHXP2X9s+iJRzzbO6zMVtYQ/OHWtJq/YaGeOQdtN8mfl4e6esyMK6Tldl6
TstKLEPMQXL5zxgXPzcudvYW3wj7MXa1Pn9xcY06LgO5ERtLM4JCk2PJMzbZ8R8r
gS8+B5UbVLuu9aE5n9ZjDbP1A07jKSSmwFdoxatvG176Vp7ha4xl80Kc0liYl9ke
L1Zn3whpCBn7C5amO2RzmklniOgs56MCURo4wyqe+4V1egl9yio298YvrAvDRT4d
TEftliEsErsCQjDpPutkYYgynARktU/xQHRZUw7cXNeqNfTuzA7ZaDuFgO3X6fDf
A7Bd3BbMq9h3FsMTGq7Tb2F6cF8W4wVASTZTb3endRWN9X+khVr6Bkvhl9qr5cWh
XG3xXBhdAku9vUaqxEXVyUlKoFytt3UwFQpJgvgB/eKffVCcXa+e3RSsNaN5p5dO
yxCuqblHHjF9JHfEEybG3IWZzL3eyBgMPEGLC9LtHCrGwKXh0m+oQU8UOYrSDld7
RfcdoYzbrZ0k+rdiKPXiOx59vB3FWKKfZAeGempE6efhYlYAKsVpQ31S1sxQZH9w
6VBZmyFa+fGGYVfgToQs
=VeL1
-----END PGP SIGNATURE-----