[Distutils] PEP 386 status - last round here ?

Toshio Kuratomi a.badger at gmail.com
Fri Nov 27 20:55:08 CET 2009

Many many things in this thread.... Unfortunately, there are so many ways to
do versioning that it's almost a bikeshed topic, and so there are a lot of
different ideas of what could be right.  Let me start by saying what I think
is "right" and then the rest of my message will be devoted to what things
seem like good compromises and what don't :-)

I'm in agreement with Ben Finney's idea::

    I'd like to register, once again, the point that this would not *be* a
    problem if PEP 386 described a version comparison scheme that simply
    works without special keywords. Have each segment compared
    alphanumerically, and it will not *need* translation to work with other
    packaging systems.

    Special keywords are not, I maintain, special enough to break the normal
    version-comparison semantics.
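
To make the "no special keywords" idea concrete, here is a minimal Python sketch. It is my own illustration, not code from the PEP or from any packaging tool: the name `alnum_key` is hypothetical, it splits on runs of digits and letters rather than strictly on dots, and the choice to rank letter runs below digit runs is an assumption.

```python
import re

def alnum_key(version):
    """Sort key with no special keywords (hypothetical helper).

    Digit runs compare as integers, letter runs compare as plain strings.
    Ranking letter runs below digit runs is an assumption of this sketch.
    """
    key = []
    for run in re.findall(r"\d+|[A-Za-z]+", version):
        if run.isdigit():
            key.append((1, int(run), ""))
        else:
            key.append((0, 0, run))
    return tuple(key)

versions = ["1.0", "1.0rc1", "1.0a1", "1.10", "1.9", "1.0.dev456"]
ordered = sorted(versions, key=alnum_key)
print(ordered)
# ['1.0', '1.0a1', '1.0.dev456', '1.0rc1', '1.9', '1.10']
```

Note that 1.0 lands before 1.0a1 and 1.0rc1 here: nothing can sort ahead of the bare release unless some word is treated specially.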

However, at PyCon, this was only popular among the people who have to deal
with packaging of Python projects.  Everyone else wants the ability to sort
some things as coming before a release -- for instance, alphas, betas, release
candidates.  (Note that snapshots don't need to be pre-releases, although
this PEP makes them that way.)

On Fri, Nov 27, 2009 at 01:24:14PM +0100, Tarek Ziadé wrote:
> On Fri, Nov 27, 2009 at 11:39 AM, Piotr Ozarowski <ozarow at gmail.com> wrote:
> > [Tarek Ziadé, 2009-11-26]
> >> On Thu, Nov 26, 2009 at 8:55 PM, Floris Bruynooghe
> >> <floris.bruynooghe at gmail.com> wrote:
> >> [..]
> >> >> since the .dev versions are really only snapshots leading up to
> >> >> some release, i.e. 1.0.dev456 is a snapshot leading up to the
> >> >> first pre-release of the 1.0 :-)
> >> >
> >> > But in this case if I want to make a pre-release of 1.0 but after the
> >> > last rc then I can't, I can only make a post-release of the last rc.
> >> > That's almost more un-intuitive than forcing your first pre-release to
> >> > be '1.0a0.dev456' instead of just '1.0.dev456'.
> >>
> >> It seems to me that the number of development versions of rc releases
> >> is very low compared to the number of development snapshots done for
> >> 1.0, before the pre-release cycle starts.
> >>
> >> (I don't think I have ever needed a dev snapshot of a rc version)
> >>
> >> I am +1 for keeping the intuitive writing for the pre-release cycle.
> >>
> >> e.g.
> >>
> >>  1.0.dev456
> >>  < 1.0a1.dev456
> >>  < 1.0a1
> >>  < 1.0rc1.dev456
> >>  < 1.0rc1
> >>  < 1.0rc1.post123
> >>  < 1.0

Note, what I "intuitively" see from the given version numbers is more
like this::
  < 0.9.dev456
  < 1.0rc1
  < 1.0rc1.dev456
  < 1.0rc1.post123
  < 1.0
  < 1.0.dev456
  < 1.0a1
  < 1.0a1.dev456

I've been trained by many projects to see "rc" as "release candidate" but the
definition of "a" and "b" differs from project to project.  Many projects
use them as "patchlevels" for minor post-releases instead of abbreviations
for alpha and beta.  Also, if you have a directory of revisions being served
up by Apache, it's going to put them in strcmp() order, so the rc's, the
.dev's, and the a1's are all going to sort after 1.0.  Much confusion to
overcome here.
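
A quick way to see the plain string ordering for yourself is Python's `sorted()`, which compares `str` values code point by code point and so matches strcmp() for ASCII names like these. The file names are just the examples from this thread.

```python
# Plain string sort, the way a naive directory listing would order the names.
names = ["1.0rc1", "1.0", "1.0a1", "1.0.dev456", "1.0rc1.post123"]
listing = sorted(names)

for name in listing:
    print(name)
# 1.0
# 1.0.dev456
# 1.0a1
# 1.0rc1
# 1.0rc1.post123
```

The bare 1.0 comes first because it is a prefix of every other name, regardless of what the suffixes are supposed to mean.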

Is this a bikeshed?  In some ways it is because some people will
"intuitively" see prereleases and others will see postreleases. However,
versioning impacts end users so we should strive to make it as easy as
possible for people to see the real meaning of our versions.  At PyCon I
suggested that everything should be treated as a postrelease except for
certain specific words like "alpha", "beta", and "rc", and that we should
strive to keep that list of specific words as small as possible to avoid
confusion.

This might be a good place to say that I agree with part of what
Marc-Andre Lemburg says:

    The pre-release marker would then be interpreted in alphabetical
    order, ie. 'alpha' < 'beta' < 'rc'.

    This minor change would broaden the scope of the scheme somewhat
    and make it more compatible to what's being used outside the
    python-dev sphere (esp. with respect to 'c' standing for release
    candidate... unless you happen to read it as gamma ;-).

Like Marc-Andre I think alpha, beta, rc contain the information to remove
confusion while "a", "b", and "c" do not.  However, I do not like having
aliases (i.e. "a" == "alpha").  Aliases have several problems:

1) How do you sort 1.0a1 and 1.0alpha1?  Using an alias, these are the same
release, but since the developer of the package did this intentionally, they
obviously meant them to mean something different...  we just don't know
what that something is.

2) Eventually, someone will think that "a", "b", and "c" are for
patchlevels.  Then we'll have someone releasing code that they expect to
work like this::

The version checking code will okay it since each one is correct by itself.
It's the meaning that the author attributes to them that will be wrong.

Floris Bruynooghe points these issues out although he uses "dev" and "post"
which have distinct meanings in the PEP:

    I also object to the alternatives for the 'dev' and 'post' markers as
    they make it more confusing for me.  While someone might prefer one
    word over the other their meaning does not change to decide their
    ordering, that just seems like unneeded complexity (there should be
    only one obvious way to do a thing?).

    When I now see the versioning number of a project I need to go and
    look up the pep to know if it's compliant but just using one of the
    alternatives that I'm not used to.  If there's only one choice it's a
    lot easier.

    (This same argument goes for 'a' == 'alpha' 'b' == 'beta' and 'c' ==
    'rc' but those at least are mnemonic so easier to remember)

By pointing these out, I don't want to stop progress, since other people seem
very attached to using "a", "b", "c" instead of fully spelled out alpha, beta
and rc, but I definitely see that as a wart.  (Emphasis on "instead of", not
in addition to.)

> > why not simply use "-" and "+" where "-" is before zero-length string
> > and "+" is after any other string... and then sort the rest
> > alphabetically? f.e.
> >
> >  1.0-a1-dev456
> > < 1.0-a1
> > < 1.0-a1+dev456
> > < 1.0-dev456
> > < 1.0-rc1-dev456
> > < 1.0-rc1
> > < 1.0-rc1+post123
> > < 1.0
> > < 1.0+post123
> >
> > don't worry about Debian, we'll simply replace "-" with "~" (we use "~"
> > and "+" right now[0]). I'm not sure about rpm, but I bet it has
> > something similar and it will be much easier for us to simply handle two
> > characters instead of recognizing that dev1 < a1 < b1 < c1 == rc1 ...
> It's different from RPMs, since they use a strcmp(), segment  by segment,
> so I think they have to extract the dev/post suffixes and to put them in front
> as an epoch marker maybe ? (ccing Toshio)

There are a few things to think about here.  First, there's the underlying tool.
In Debian, that's dpkg and in Fedora, it's rpm.  (Note: www.rpm.org, not
www.rpm5.org; the latter is a fork.)  The tool enforces things like "-" as a
separator between fields, "." as a separator inside fields, and ordering of
packages based on what's contained in the fields.  In rpm we have this to
work with::

:Name: Name of a package.  This makes "python" different from
   "python-docs" different from "gcc"
:Epoch: This is the trump card of the version sorting.  We seldom use this
    as it doesn't show up in our filenames or other end-user visible
    interfaces.  It is used for correcting mistakes in versioning or times
    when we decide that we absolutely have to revert a package rather than
    continuing to package the current release.
:Version: This is generally the upstream release.  However, upstream will
    frequently make releases that won't order correctly (for instance,
    1.0alpha1 => 1.0;  strcmp() will order these as 1.0 => 1.0alpha1).
    All distributions have to transform the version in some way to make
    ordering work correctly with the upstream's versions but they have
    different methods.  Additionally, we want the upstream version to be
    apparent from the version and release fields since end users will see
    these two fields in the filenames of the packages and we want them to
    know they're getting, for instance,  the upstream 2.6.1 release of
:Release: The release of the package within the distribution.  If upstream
    version ordering exactly fit the rpm algorithm, this would just be an
    integer that incremented every time we built the same upstream version of
    the package.  It's the least significant of the fields for ordering.

Each distribution has its own rules for dealing with the upstream version so
that rpm will order things correctly.  In Fedora, they involve taking the
initial version string that consists of [0-9.] and leaving that in version.
Anything beyond that initial portion is put into the release after our
portion of the release.  For instance, if upstream makes the following
releases:

Upstream            Version  Release
--------            -------  -------
libjpeg-2.0alpha1   2.0      0.1.alpha1
libjpeg-2.0         2.0      1
libjpeg-2.0a        2.0      2.a
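
The split described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not a tool Fedora uses: `split_upstream` and `fedora_style_release` are hypothetical names, the packager is assumed to supply the build counter and to decide whether the suffix is a pre-release, and turning "-" into "." is an assumption based on "-" being rpm's field separator.

```python
import re

def split_upstream(upstream_version):
    """Return (version, extra): the leading [0-9.] part and whatever follows."""
    match = re.match(r"\d+(?:\.\d+)*", upstream_version)
    if match is None:
        raise ValueError(f"no leading numeric version in {upstream_version!r}")
    version = match.group(0)
    # Drop the separator that glued the suffix on; "-" cannot stay inside a
    # field, so it becomes "." (an assumption of this sketch).
    extra = upstream_version[match.end():].lstrip(".-+").replace("-", ".")
    return version, extra

def fedora_style_release(extra, build, prerelease):
    """Build the release field; 'build' and 'prerelease' come from the packager."""
    if not extra:
        return str(build)
    prefix = f"0.{build}" if prerelease else str(build)
    return f"{prefix}.{extra}"

# The three libjpeg rows from the table above.
rows = [("2.0alpha1", 1, True), ("2.0", 1, False), ("2.0a", 2, False)]
mapped = []
for upstream, build, prerelease in rows:
    version, extra = split_upstream(upstream)
    mapped.append((version, fedora_style_release(extra, build, prerelease)))
print(mapped)
# [('2.0', '0.1.alpha1'), ('2.0', '1'), ('2.0', '2.a')]
```

The leading "0." is what makes every pre-release sort below release 1 of the same version, which is why a human still has to say whether a given suffix means "before" or "after".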

Gory details:
(Version and Release section)

Here's how we would map each of the examples in Fedora.  Note that at the
moment this is a manual process as we can't depend on upstreams to only
follow this.  I'll separate our version and release with a "-" as that's how
rpm will show it to our users:

  1.0.dev456            1.0-0.1.dev456
  < 1.0a1.dev456        1.0-0.2.a1.dev456
  < 1.0a1               1.0-0.3.a1
  < 1.0rc1.dev456       1.0-0.4.rc1.dev456
  < 1.0rc1              1.0-0.5.rc1
  < 1.0rc1.post123      1.0-0.6.rc1.post123
  < 1.0                 1.0-1


  1.0-a1-dev456         1.0-0.1.a1.dev456
  < 1.0-a1              1.0-0.2.a1
  < 1.0-a1+dev456       1.0-0.3.a1+dev456
  < 1.0-dev456          1.0-0.4.dev456
  < 1.0-rc1-dev456      1.0-0.5.rc1.dev456
  < 1.0-rc1             1.0-0.6.rc1
  < 1.0-rc1+post123     1.0-0.7.rc1+post123
  < 1.0                 1.0-1
  < 1.0+post123         1.0-2.post123

We can work with whatever is given to us... it just changes the number of
hoops and special cases that we have to remember when we do so.
That said, I don't see the reason for this change.

1) current post is similar to (+post and +dev)
   current dev is similar to (-post and -dev)
   Why duplicate functionality that's already in the PEP?

2) If we remove the "post" version string, then you are substituting a nice,
spelled out word ("post") with a cryptic symbol ('+').

3) The position of the -dev456 in your list satisfies neither set of people
who want to define the meaning of dev.  It's showing up between a1 and
rc1 whereas one camp wants it to sort before a1 and the other camp wants it
to sort just before 1.0.

?) Where does 1.0-post123 fall in your list?
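
For anyone who wants to experiment with the quoted "-" / "+" proposal, here is a minimal Python sketch of that rule as I read it: "-" ranks below the end of the string, "+" ranks above every other character, and everything else compares by character. The name `dash_plus_key` is hypothetical, and the sketch makes no attempt to compare numbers numerically (1.10 vs 1.9), which a real scheme would need.

```python
def dash_plus_key(version):
    """Character-wise sort key for the quoted "-" / "+" proposal (a sketch)."""
    ranks = []
    for ch in version:
        if ch == "-":
            ranks.append(-1)          # sorts before the end-of-string marker
        elif ch == "+":
            ranks.append(0x110000)    # above every Unicode code point
        else:
            ranks.append(ord(ch))
    ranks.append(0)                   # end-of-string marker
    return ranks

proposed = [
    "1.0-a1-dev456", "1.0-a1", "1.0-a1+dev456", "1.0-dev456",
    "1.0-rc1-dev456", "1.0-rc1", "1.0-rc1+post123", "1.0", "1.0+post123",
]
resorted = sorted(reversed(proposed), key=dash_plus_key)
print(resorted == proposed)
# True
```

With this key the list from the quoted message sorts back into the order it was given in, so the rule itself is at least self-consistent.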
