Herein are my recollections of the Pycon distutils versioning discussions... but only up to the part where the reasoning for the ".dev" and ".post" parts is added to the scheme.
Hopefully the following will be helpful for reference in the current version discussions.
# Goals
- as self-explanatory and clear a versioning scheme as reasonable. A whole dog's breakfast of versions is just a pain.
- deterministic translation between version string and version tuple. Bonus points if two reasonable humans would sort them the same way.
- capable, reasonably readable dependency specs
# Preliminaries
When I say things like "this isn't common on PyPI" below, this is from analysing a dump of all the version strings currently in use on PyPI. This list was produced by Martin von Loewis during Pycon.
# Super simple
A very simple versioning scheme for released packages would be:
major.minor.patch
where those are all numeric fields. Update the "patch" field when making a small change without compatibility issues. Update the "minor" field when adding a feature. Update the "major" field when making backward-incompatible changes. Easy.
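A minimal sketch of this scheme in Python, for illustration only. The names `parse_simple` and `bump` are hypothetical, and resetting the lower fields to zero on a bump is an assumption that the discussion did not spell out.

```python
import re

# Exactly three numeric fields separated by dots.
_SIMPLE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_simple(version):
    """Translate "major.minor.patch" into a tuple of three ints."""
    match = _SIMPLE.fullmatch(version)
    if match is None:
        raise ValueError("expected major.minor.patch, got %r" % (version,))
    return tuple(int(field) for field in match.groups())


def bump(version, field):
    """Return the next version string after a change of the given kind."""
    major, minor, patch = parse_simple(version)
    if field == "major":    # backward-incompatible change
        return "%d.0.0" % (major + 1)      # assumption: lower fields reset
    if field == "minor":    # new feature
        return "%d.%d.0" % (major, minor + 1)
    if field == "patch":    # small change without compatibility issues
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("unknown field %r" % (field,))
```

Comparing the tuples rather than the strings is what makes "1.10.0" sort after "1.9.9".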
Incidentally this is what Ruby suggests for Gem authors ("Gems" are Ruby packages), though they call the last field "build":
[http://rubygems.org/read/chapter/7]
Any "public" release of a gem should have a different version. Normally that means incrementing the build number. This means a developer can generate builds all day long for himself, but as soon as he/she makes a public release, the version must be updated.
# "But I want to do an alpha release!"
I don't think I'm overstating in saying that most of us (those that care to help in defining Python packaging tools) would want to allow alpha/beta releases. Certainly this was true at the Pycon discussions. This gives us:
N.N.NaN # e.g. "1.0.0a2"
N.N.NbN # e.g. "2.6.0b1"
Alternatives like the following were discarded:
- "1.0.0.a2" The '.' before the 'a' separator, while nice for parsing, is not common practice at all.
- "1.0.0alpha2" While this *does* appear on PyPI, it is less common than just using the single character. As well, Python itself uses 'a'.
- "1.0.0a" No alpha version number. The consensus was to not support this. Reasonable people disagreed on whether this would imply "a0" or "a1". It was felt that explicit was better than implicit here. As well, Python itself always has a version number on the alpha/beta.
# To "c", or not to "rc"?
Doing a release candidate is reasonable too, it was felt. That gives us:
N.N.NaN # e.g. "1.0.0a2"
N.N.NbN # e.g. "2.6.0b1"
N.N.NcN # e.g. "2.6.0c1"
Why "c" instead of "rc"? All of "c", "rc", and "candidate" are in use on PyPI (I don't have percentages right now). "c" won because:
- that's what Python itself uses
- the one-character symmetry with 'a' and 'b' is nice
- it was felt that 'c' clearly enough indicated "release candidate"
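As a rough illustration of the a/b/c forms so far, here is a small Python sketch that accepts the three-N spellings above, rejects the discarded alternatives, and produces a sort key. The name `parse_pre` and the numeric stage ranks are hypothetical choices, and the sketch assumes that a final release sorts after all of its alphas, betas and candidates.

```python
import re

# N.N.N optionally followed by one of a/b/c and a mandatory number.
_PRE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:([abc])(\d+))?")

# Assumed ordering: alpha < beta < candidate < final release.
_STAGE_RANK = {"a": 0, "b": 1, "c": 2, None: 3}


def parse_pre(version):
    """Return a sortable tuple for "N.N.N", "N.N.NaN", "N.N.NbN", "N.N.NcN"."""
    match = _PRE.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version string %r" % (version,))
    major, minor, patch, stage, number = match.groups()
    return (int(major), int(minor), int(patch),
            _STAGE_RANK[stage], int(number) if number else 0)


versions = ["2.6.0", "2.6.0c1", "2.6.0a2", "2.6.0b1"]
print(sorted(versions, key=parse_pre))
```

The discarded spellings "1.0.0.a2", "1.0.0alpha2" and "1.0.0a" all raise `ValueError` here.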
So far I don't expect anything to be too controversial (but I'm probably wrong :).
# Just three N's?
The current PyPI versions include quite a few versions with just two "N's" -- e.g. "0.1", "3.5a2" -- as well as some, though fewer, with four N's -- e.g. "1.5.4.3". This gives us (this is just a pseudo-pattern):
N.N[.N]*[(abc)N]
It was felt that just a single N -- e.g. "1" -- should be disallowed. However, the upper limit was left unbounded, i.e. this is allowed:
1.2.3.4.5.6.7.8.9a3
An example of where more N's are useful is a Python module that wraps a third-party library. Say that library ("libfoo") version is 2.5.2, a reasonable version for "python-libfoo" might be "2.5.2.1.0" where the first three bits track the "libfoo" version.
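One possible reading of the pseudo-pattern N.N[.N]*[(abc)N] as a Python regular expression is sketched below. The helper name `is_valid` is hypothetical, and this is only an illustration of the pattern as described at this point, not the proposal's reference implementation.

```python
import re

# N.N[.N]*[(abc)N]: at least two numeric fields, any number of further
# fields, then an optional a/b/c tag that must carry a number.
_VERSION = re.compile(r"\d+\.\d+(?:\.\d+)*(?:[abc]\d+)?")


def is_valid(version):
    """Return True if the whole string fits the pseudo-pattern."""
    return _VERSION.fullmatch(version) is not None


for candidate in ["0.1", "3.5a2", "1.5.4.3", "1.2.3.4.5.6.7.8.9a3", "1"]:
    print(candidate, is_valid(candidate))
```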
# Multiple N's after the "abc".
"1.2.0a3.4" or in the pseudo-pattern I've been using:
N.N[.N]*[(abc)N[.N]*]
This was discussed and added. I don't recall who supported this.
Personally I've not had a need for this. 29 out of 4975 PyPI versions (in MvL's list generated during Pycon) use this -- and in 7 of those that last ".N" is a date stamp (e.g. "1.0a2.20070215") where I think the datestamp is meant as a sort of ".dev" or "pre-release" tag or build number.
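To make the "deterministic translation between version string and version tuple" goal concrete for the extended pseudo-pattern, here is a hedged Python sketch. The function name `version_key`, the numeric stage ranks, and the decision not to pad release tuples (so "1.0" and "1.0.0" compare as different) are all assumptions made for illustration.

```python
import re

# N.N[.N]*[(abc)N[.N]*]
_VERSION = re.compile(r"(\d+\.\d+(?:\.\d+)*)(?:([abc])(\d+(?:\.\d+)*))?")

# Assumed ordering: alpha < beta < candidate < final release.
_STAGE_RANK = {"a": 0, "b": 1, "c": 2, None: 3}


def version_key(version):
    """Translate a version string into a tuple that sorts sensibly."""
    match = _VERSION.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version string %r" % (version,))
    release, stage, pre = match.groups()
    release_tuple = tuple(int(n) for n in release.split("."))
    # The numbers after the a/b/c tag; empty for a final release.
    pre_tuple = tuple(int(n) for n in pre.split(".")) if pre else ()
    return (release_tuple, _STAGE_RANK[stage], pre_tuple)


print(version_key("1.0a2.20070215"))
print(sorted(["1.2.0", "1.2.0b1", "1.2.0a3.4", "1.2.0a3"], key=version_key))
```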
# ... the rest
I'll try to post tomorrow night with my recollections and current understanding of the rationale for the ".dev" and ".post" parts of the proposed version scheme.
Cheers, Trent
Great post, thanks a lot Trent!
Maybe you could push the summary to http://wiki.python.org/moin/Distutils/VersionComparison to decouple it from the mailing list.
Regards Tarek
On Tue, Jun 9, 2009 at 9:58 AM, Trent Mick trentm@gmail.com wrote:
[...]
2009/6/9 Trent Mick trentm@gmail.com:
Herein are my recollections of the Pycon distutils versioning discussions... but only up to the part where the reasoning for the ".dev" and ".post" parts is added to the scheme.
Hopefully the following will be helpful for reference in the current version discussions.
# Goals
+1 to all this
# Preliminaries
When I say things like "this isn't common on PyPI" below, this is from analysing a dump of all the version strings currently in use on PyPI. This list was produced by Martin von Loewis during Pycon.
OK
# Super simple
A very simple versioning scheme for released packages would be:
major.minor.patch
[...]
# "But I want to do an alpha release!"
I don't think I'm overstating in saying that most of us (those that care to help in defining Python packaging tools) would want to allow alpha/beta releases. Certainly this was true at the Pycon discussions. This gives us:
N.N.NaN # e.g. "1.0.0a2"
N.N.NbN # e.g. "2.6.0b1"
[...]
# To "c", or not to "rc"?
Doing a release candidate is reasonable too, it was felt. That gives us:
N.N.NaN # e.g. "1.0.0a2"
N.N.NbN # e.g. "2.6.0b1"
N.N.NcN # e.g. "2.6.0c1"
[...]
So far I don't expect anything to be too controversial (but I'm probably wrong :).
All +1 from me.
# Just three N's?
The current PyPI versions include quite a few versions with just two "N's" -- e.g. "0.1", "3.5a2" -- as well as some, though fewer, with four N's -- e.g. "1.5.4.3". This gives us (this is just a pseudo-pattern):
N.N[.N]*[(abc)N]
It was felt that just a single N -- e.g. "1" -- should be disallowed. However, the upper limit was left unbounded, i.e. this is allowed:
1.2.3.4.5.6.7.8.9a3
I see no issue with this - I'd expect it to be a relatively rarely used generalisation, but conceptually it costs nothing, so I'm fine with it. Actually, I do know of (non-Python) cases where a single N is used - less is probably the best known - but I certainly don't care enough to insist one way or the other.
An example of where more N's are useful is a Python module that wraps a third-party library. Say that library ("libfoo") version is 2.5.2, a reasonable version for "python-libfoo" might be "2.5.2.1.0" where the first three bits track the "libfoo" version.
Purely theoretical example, I assume? I doubt I'd do this, but again, who cares? No conceptual cost.
# Multiple N's after the "abc".
"1.2.0a3.4" or in the pseudo-pattern I've been using:
N.N[.N]*[(abc)N[.N]*]
This was discussed and added. I don't recall who supported this.
Personally I've not had a need for this. 29 out of 4975 PyPI versions (in MvL's list generated during Pycon) use this -- and in 7 of those that last ".N" is a date stamp (e.g. "1.0a2.20070215") where I think the datestamp is meant as a sort of ".dev" or "pre-release" tag or build number.
-1
29/4975 is hardly anything, and I don't believe this should be defined on a "someone uses it so we should allow it" basis.
If someone supports this, they should be presenting a good use case, with an explanation of why it is of value *to the end user* (ie, not just to the project developers).
For something to consider - my view is that if you've done 2.2a1 and you want a new release, it should be 2.2a2 - not 2.2a1.1 or any such thing. If it's not good enough to be an a2, then why are you releasing it? Note: there's an assumption implicit in this that a "version" is something attached to a release - I have little sympathy with the idea that every single Subversion revision (or Mercurial changeset, or whatever) should have a unique "version" number. Unreleased versions should be identified differently (and nobody should be specifying dependencies on unreleased versions, before anybody suggests that!)
# ... the rest
I'll try to post tomorrow night with my recollections and current understanding of the rationale for the ".dev" and ".post" parts of the proposed version scheme.
Thanks for posting this. So far, it's relatively uncontroversial, but it still makes a great summary of the arguments and conclusions.
Paul.
On Tue, Jun 09, 2009 at 11:10:48AM +0100, Paul Moore wrote:
2009/6/9 Trent Mick trentm@gmail.com:
# Preliminaries
When I say things like "this isn't common on PyPI" below, this is from analysing a dump of all the version strings currently in use on PyPI. This list was produced by Martin von Loewis during Pycon.
OK
FYI, the mail is here:
http://mail.python.org/pipermail/distutils-sig/2009-March/011194.html
And the list here:
http://www.dcl.hpi.uni-potsdam.de/home/loewis/versions
Regards Floris
An example of where more N's are useful is a Python module that wraps a third-party library. Say that library ("libfoo") version is 2.5.2, a reasonable version for "python-libfoo" might be "2.5.2.1.0" where the first three bits track the "libfoo" version.
Purely theoretical example, I assume? I doubt I'd do this, but again, who cares? No conceptual cost.
Something similar I do for my "python-markdown2" module: http://code.google.com/p/python-markdown2/source/browse/trunk/lib/markdown2....
N.N[.N]*[(abc)N[.N]*]
...
-1
29/4975 is hardly anything, and I don't believe this should be defined on a "someone uses it so we should allow it" basis.
If someone supports this, they should be presenting a good use case, with an explanation of why it is of value *to the end user* (ie, not just to the project developers).
Yes, I'd like to send a separate email to this list (perhaps to python-list) that asks if anyone can chime in with a good use case/justification for this.
Anyway, in this email I'm just trying to put down my current understanding of the proposal and the debate.
Note: there's an assumption implicit in this that a "version" is something attached to a release - I have little sympathy with the idea that every single Subversion revision (or Mercurial changeset, or whatever) should have a unique "version" number. Unreleased versions should be identified differently (and nobody should be specifying dependencies on unreleased versions, before anybody suggests that!)
Ah... I'll get into that with the ".dev" stuff. :) Hopefully tonight.
Trent
Trent Mick:
Thank you very much for the summary emails.
On Jun 9, 2009, at 1:58 AM, Trent Mick wrote:
# Super simple
...
major.minor.patch
...
# "But I want to do an alpha release!"
I don't think I'm overstating in saying that most of us (those that care to help in defining Python packaging tools) would want to allow alpha/beta releases. Certainly this was true at the Pycon discussions. This gives us:
N.N.NaN # e.g. "1.0.0a2"
N.N.NbN # e.g. "2.6.0b1"
This is where we branch between the two ways that people do it. Some people count up to a future release, other people count away from past releases. (Of course probably some people do both.)
In Tahoe (and I think in Twisted, and Nevow, and Foolscap), we typically don't count up to a release so that $REL-$SOMETHING is a predecessor to the $REL release. Instead we count away from $REL, so that $REL-$SOMETHING is a successor to the $REL release.
Here's an example:
http://allmydata.org/source/tahoe/tarballs/
There are files in that directory named allmydata-tahoe-1.4.1-r3904.tar.gz, allmydata-tahoe-1.4.1-r3905.tar.gz, allmydata-tahoe-1.4.1-r3908.tar.gz, etc. Each of these is newer than the previous one, and all of them are newer than the v1.4.1 release.
The -r$NUMBER is a count of patches in our revision control repository (so it is pretty much like the SVN revision numbers that Twisted and Nevow use in their version numbers, and it is like the SVN revision numbers that setuptools can append to a version string for you).
When the time comes for the Tahoe v1.5 release (Real Soon Now!), we will eventually have a release numbered 1.4.1-r3948 (assuming that it takes us 40 more patches from now to get ready for the next stable release), and then the next tarball after that will be named allmydata-tahoe-1.5.0.tar.gz. (Technically, we could also name it allmydata-tahoe-1.5.0-r3949.tar.gz and everything would work, but we leave the -r$COUNT component off when this is the first release of a new version number.)
So, we don't use the "a/b/c" indicators, but we do use what you are calling "post-release" indicators. Currently that is spelled "-r", which is how Tahoe, Twisted, and setuptools do it. At PyCon I agreed that it wouldn't hurt to change the spelling to "-post" for clarity and for parallelism with "-pre". (I don't have the authority to agree to anything on behalf of the Twisted or setuptools projects -- I was just agreeing to stop arguing about it. :-))
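The "counting away" ordering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tahoe's or setuptools' actual code: the name `post_key` is hypothetical, and treating a plain release as revision 0 (so that it sorts before any of its "-r" successors) is an assumption.

```python
import re

# A dotted release number, optionally followed by "-r" and a revision count.
_POST = re.compile(r"(\d+(?:\.\d+)+)(?:-r(\d+))?")


def post_key(version):
    """Sort key where "1.4.1-rN" comes after "1.4.1" and before "1.5.0"."""
    match = _POST.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version string %r" % (version,))
    release, revision = match.groups()
    release_tuple = tuple(int(n) for n in release.split("."))
    # Assumption: a release without "-r" counts as revision 0.
    return (release_tuple, int(revision) if revision else 0)


names = ["1.5.0", "1.4.1-r3908", "1.4.1", "1.4.1-r3904", "1.4.1-r3905"]
print(sorted(names, key=post_key))
```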
Regards,
Zooko
2009/6/11 Zooko Wilcox-O'Hearn zooko@zooko.com:
This is where we branch between the two ways that people do it. Some people count up to a future release, other people count away from past releases. (Of course probably some people do both.)
In Tahoe (and I think in Twisted, and Nevow, and Foolscap), we typically don't count up to a release so that $REL-$SOMETHING is a predecessor to the $REL release. Instead we count away from $REL, so that $REL-$SOMETHING is a successor to the $REL release.
Here's an example:
http://allmydata.org/source/tahoe/tarballs/
There are files in that directory named allmydata-tahoe-1.4.1-r3904.tar.gz, allmydata-tahoe-1.4.1-r3905.tar.gz, allmydata-tahoe-1.4.1-r3908.tar.gz, etc. Each of these is newer than the previous one, and all of them are newer than the v1.4.1 release.
So just go with 1.4.1.3908
So, we don't use the "a/b/c" indicators, but we do use what you are calling "post-release" indicators. Currently that is spelled "-r", which is how Tahoe, Twisted, and setuptools do it. At PyCon I agreed that it wouldn't hurt to change the spelling to "-post" for clarity and for parallelism with "-pre". (I don't have the authority to agree to anything on behalf of the Twisted or setuptools projects -- I was just agreeing to stop arguing about it. :-))
If it's just arguing about which spelling to use, there should be nothing wrong with "." rather than "-r", and it has the huge advantages of simplicity and consistency.
Actually, using Ben's (slightly) extended definition, you could even use ".r" if you're wedded to the idea of having something other than a plain number.
Paul.
Zooko Wilcox-O'Hearn wrote:
Trent Mick:
Thank you very much for the summary emails.
Indeed!
This gives us:
N.N.NaN # e.g. "1.0.0a2"
N.N.NbN # e.g. "2.6.0b1"
This is where we branch between the two ways that people do it. Some people count up to a future release, other people count away from past releases. (Of course probably some people do both.)
I think this is an excellent point, and I think that both ways of doing it need to be supported. The "counting up" method is especially useful if you're moving toward 2 different versions simultaneously. Think about Python itself moving to 3.0. What should development versions of Python 3.0 have been called, when both 2.6 and 3.0 were being developed? If only "counting away" were supported, would all 3.0 builds have been called "alpha"? Certainly calling 2.6 and 3.0 "2.5 plus something" would have been wrong, and a nightmare.
[From another thread]
P.J. Eby wrote:
PyPI uploads aren't a suitable basis for analyzing "dev" use cases, since the whole point of having a "dev" tag is for *non-released* versions. (E.g., in-progress development via SVN.) Dev tags are so that while you're doing development, your locally-installed versions can be distinguished from one another.
True. When I'm "sneaking up" on an alpha release, I need to have version comparisons work in my test environment, and I'd prefer to use "-dev". These releases would never be published to PyPI or anywhere else, other than my internal servers.
Eric.
2009/6/11 Eric Smith eric@trueblade.com:
I think this is an excellent point, and I think that both ways of doing it need to be supported. The "counting up" method is especially useful if you're moving toward 2 different versions simultaneously. Think about Python itself moving to 3.0. What should development versions of Python 3.0 have been called, when both 2.6 and 3.0 were being developed? If only "counting away" were supported, would all 3.0 builds have been called "alpha"? Certainly calling 2.6 and 3.0 "2.5 plus something" would have been wrong, and a nightmare.
They weren't called anything (OK, internally all such releases have version number 2.6a0 and 3.0a0 respectively, but individual Subversion commits had no distinct version). And that's precisely right, as they are internal commits, and were not released.
Why is there a need for every internal commit to have a distinct "version number"? I'm not saying there's no need to be able to identify the code base, but why must everything be lumped into the version? In my experience (and in the practice used by Python core) version numbers are for released versions. Snapshots, nightly builds, personal developer builds, etc, don't have version numbers.
True. When I'm "sneaking up" on an alpha release, I need to have version comparisons work in my test environment, and I'd prefer to use "-dev". These releases would never be published to PyPI or anywhere else, other than my internal servers.
I'm sorry, but why? How does your test environment work so that you are testing 2 versions at once, and need to compare "version numbers" programmatically? And why isn't something *additional* to the version number, like the core's sys.subversion, suitable?
Please note, in case I seem excessively aggressive - my assumption is that the standard defined by the PEP doesn't have to be used by everyone, in all cases. You presumably have code that works at the moment, and you are perfectly entitled to continue using it. If you switch to the APIs proposed in the PEP, then presumably you see some benefits in following the standard. There's clearly a cost in doing so - one part of the cost is changing your code, and I don't think it's unreasonable for another part of that cost to be changing your version numbering scheme (at least for your released versions). Whether that cost is too high is a decision you have to make yourself. Designing the PEP is a compromise between all the various aspects involved.
Paul.