Hello,
I have gathered all the documentation for the versioning lib in a single document,
http://bitbucket.org/tarek/distutilsversion/src/tip/README.txt
It explains how the existing tools work and how "verlib", that we built during Pycon, works.
It still needs to be completed by some people that were present, but a new feedback round is more than welcome.
The goal here is to provide a version comparison standard to be included in Distutils.
We do need such a standard to be able to include the "install_requires" metadata as planned (that will be a change in PEP 345), which depends on a standard for version comparisons.
Cheers Tarek
Tarek Ziadé ziade.tarek@gmail.com writes:
http://bitbucket.org/tarek/distutilsversion/src/tip/README.txt
The goal here is to provide a version comparison standard to be included in Distutils.
I laud this goal, and thank you for soliciting feedback on this draft.
One thing I haven't seen discussed: Why have such baroque version comparison semantics been accepted? Surely one of the overarching goals for such a standard is that it be simple to comprehend, and to behave unsurprisingly to most users.
Yet the discussion around these non-obvious semantics, trying to have components interpreted as “pre-release” and “post-release” and “development release” and so on, seems to underline the fact that they're *not* something that there's any consensus on. So why are they being foisted into a standard for version strings?
Rather than trying to force non-alphanumeric comparison semantics for alphabetic sequences, why not simply say that alphanumeric comparison semantics apply for components? That would, at a stroke, end all this turmoil and IMO futile seeking of some other consensus, when the best consensus is already what most would expect: alphanumeric comparison.
At the least, I would expect there would need to be a demonstrated, significantly unified consensus for some specific *other* semantic before discarding straightforward alphanumeric comparison as the standard.
2009/6/5 Ben Finney ben+python@benfinney.id.au:
Tarek Ziadé ziade.tarek@gmail.com writes:
http://bitbucket.org/tarek/distutilsversion/src/tip/README.txt
The goal here is to provide a version comparison standard to be included in Distutils.
I laud this goal, and thank you for soliciting feedback on this draft.
One thing I haven't seen discussed: Why have such baroque version comparison semantics been accepted? Surely one of the overarching goals for such a standard is that it be simple to comprehend, and to behave unsurprisingly to most users.
Yet the discussion around these non-obvious semantics, trying to have components interpreted as “pre-release” and “post-release” and “development release” and so on, seems to underline the fact that they're *not* something that there's any consensus on. So why are they being foisted into a standard for version strings?
Rather than trying to force non-alphanumeric comparison semantics for alphabetic sequences, why not simply say that alphanumeric comparison semantics apply for components? That would, at a stroke, end all this turmoil and IMO futile seeking of some other consensus, when the best consensus is already what most would expect: alphanumeric comparison.
At the least, I would expect there would need to be a demonstrated, significantly unified consensus for some specific *other* semantic before discarding straightforward alphanumeric comparison as the standard.
+1. The historical baggage of setuptools (which *had* to cater for every bizarre convention currently in existence, as its goal was to "embrace and extend" - something it achieved spectacularly) doesn't seem to apply when creating a standard Python module.
I'd rather the PEP said "This is how version numbers work (simple, non-controversial spec here), and this is the standard API for manipulating them", and then projects that want to conform to the standard migrate if needed.
But I acknowledge that I have no personal requirement for any of this, so the only interest I have is an aesthetic one of *not* seeing overcomplicated, difficult to understand, specifications become part of the Python stdlib.
Paul.
On Fri, Jun 5, 2009 at 2:43 PM, Paul Moore p.f.moore@gmail.com wrote:
I'd rather the PEP said "This is how version numbers work (simple, non-controversial spec here), and this is the standard API for manipulating them", and then projects that want to conform to the standard migrate if needed.
That's exactly the goal.
Maybe there's a missing introduction on how version numbers work. Not a standard at first, but the principles of releasing a dev release, an alpha, and so on.
But I acknowledge that I have no personal requirement for any of this, so the only interest I have is an aesthetic one of *not* seeing overcomplicated, difficult to understand, specifications become part of the Python stdlib.
Well if the specification is difficult to understand or overcomplicated it'll fail for sure.
But so far, besides that very specific case for the post-release dev tag, I don't find it complicated at all. Another win I can see is that it will help developers use better version numbers for their projects imho.
Regards Tarek
2009/6/5 Tarek Ziadé ziade.tarek@gmail.com:
But I acknowledge that I have no personal requirement for any of this, so the only interest I have is an aesthetic one of *not* seeing overcomplicated, difficult to understand, specifications become part of the Python stdlib.
Well if the specification is difficult to understand or overcomplicated it'll fail for sure.
But so far, besides that very specific case for the post-release dev tag, I don't find it complicated at all. Another win I can see is that it will help developers use better version numbers for their projects imho.
And yet, in spite of repeated requests for specific examples of projects using this post-dev stuff, which haven't been forthcoming, why is nobody saying "it's an unnecessary complication, we'll drop it"?
OK, I'll say it: Having post and dev specifications is unnecessary. Please remove it from the spec.
As an alternative, allow a trailing '-' followed by arbitrary alphanumeric (a-zA-Z0-9) data. If this is all numeric, sort as numbers, otherwise sort textually. Numeric vs non-numeric is sorted as text. That covers
-vendorid (as used by such things as cygwin, RPMs, etc - 1.2.1-3)
-revision (e.g. subversion revision numbers 1.2-32471)
-changeset (e.g. DVCS changeset id 1.3-0c30df5c527b)
... and anything else people want to invent (dates, developer's age, ...)
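The suffix rule proposed above can be sketched roughly as follows. This is only an illustration of the rule as worded in this message, not code from verlib or any existing tool; the names `split_suffix` and `compare_suffix` are hypothetical, and how the part before the '-' is compared is left out because the message does not specify it.

```python
import re

# Hypothetical helpers illustrating the "trailing '-' plus alphanumeric data" idea.
_SUFFIX = re.compile(r"[a-zA-Z0-9]+")
_NUMBER = re.compile(r"[0-9]+")


def split_suffix(version):
    """Split 'base-suffix' at the last '-'; the suffix is None when absent."""
    base, sep, suffix = version.rpartition("-")
    if not sep:
        return version, None
    if not _SUFFIX.fullmatch(suffix):
        raise ValueError("suffix must be alphanumeric: %r" % suffix)
    return base, suffix


def compare_suffix(a, b):
    """Return -1, 0 or 1 for two suffixes.

    Both all-numeric: compare as numbers. Anything else (including
    numeric vs non-numeric): compare as text.
    """
    if _NUMBER.fullmatch(a) and _NUMBER.fullmatch(b):
        x, y = int(a), int(b)
    else:
        x, y = a, b
    return (x > y) - (x < y)


print(split_suffix("1.2.1-3"))                    # vendor id
print(compare_suffix("32471", "4000"))            # revisions compare numerically
print(compare_suffix("0c30df5c527b", "32471"))    # changeset vs revision: text
```

The sketch deliberately exposes a pairwise comparison rather than a sort key, since the message only states how two suffixes compare with each other.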
I repeat - I don't have any experience in this area. But surely the people who do should be able to explain to me *why* they need something more complex than this?
(Note: I'm assuming that the point of setting a standard is that people move to it - arguments like "Project XXX uses version numbers like so" don't count unless there's a justification why switching version numbering format isn't an option when they are planning to switch version parsing code in the first place!)
Or I'm completely confused. (But I don't seem to be the only one).
Paul.
Paul Moore schrieb:
2009/6/5 Tarek Ziadé ziade.tarek@gmail.com:
But I acknowledge that I have no personal requirement for any of this, so the only interest I have is an aesthetic one of *not* seeing overcomplicated, difficult to understand, specifications become part of the Python stdlib.
Well if the specification is difficult to understand or overcomplicated it'll fail for sure.
But so far, besides that very specific case for the post-release dev tag, I don't find it complicated at all. Another win I can see is that it will help developers use better version numbers for their projects imho.
And yet, in spite of repeated requests for specific examples of projects using this post-dev stuff, which haven't been forthcoming, why is nobody saying "it's an unnecessary complication, we'll drop it"?
Not everybody is on this list. At the discussion at PyCon, one participant (IIRC it was Zooko) specifically asked for this option to be included.
Georg
2009/6/6 Georg Brandl g.brandl@gmx.net:
Paul Moore schrieb:
And yet, in spite of repeated requests for specific examples of projects using this post-dev stuff, which haven't been forthcoming, why is nobody saying "it's an unnecessary complication, we'll drop it"?
Not everybody is on this list. At the discussion at PyCon, one participant (IIRC it was Zooko) specifically asked for this option to be included.
FWIW, Zooko is on the list. If it wasn't him, someone who knows who it was needs to ask them to supply their justification for the list - given that there are a number of objections being raised here which need to be addressed.
I don't think "someone said they need it" is sufficient justification by itself. I'm not trying to criticise the work done at PyCon, but I think the discussions (and not just the conclusions) need to be publicised better.
Paul.
Paul Moore p.f.moore@gmail.com writes:
I don't think "someone said they need it" is sufficient justification by itself. I'm not trying to criticise the work done at PyCon, but I think the discussions (and not just the conclusions) need to be publicised better.
Yes, that's exactly what I'm asking for. Before making it more complex than it apparently needs to be, the justifications need to be subjected to public scrutiny and, if they don't survive, discarded.
That doesn't pre-judge the outcome of such scrutiny: precisely the opposite, I *don't* want such additional complexity pre-judged as acceptable.
On Sun, Jun 7, 2009 at 9:48 AM, Paul Moore p.f.moore@gmail.com wrote:
2009/6/6 Georg Brandl g.brandl@gmx.net:
Paul Moore schrieb:
And yet, in spite of repeated requests for specific examples of projects using this post-dev stuff, which haven't been forthcoming, why is nobody saying "it's an unnecessary complication, we'll drop it"?
Not everybody is on this list. At the discussion at PyCon, one participant (IIRC it was Zooko) specifically asked for this option to be included.
FWIW, Zooko is on the list. If it wasn't him, someone who knows who it was needs to ask them to supply their justification for the list - given that there are a number of objections being raised here which need to be addressed.
Yes, my memory as well is that it was Zooko. IIRC, at the time I asked for an example of a project that used "post-releases" [^1] and the example was Twisted.
I tried to look at the details of how Twisted's versioning works a little bit (I'm not a Twisted developer), but before I discuss that a little bit I think I should cover my understanding of why ".dev" and ".post" were deemed useful at the time.
(/me writes for too long). I'll send a separate email mostly attempting to give my recollections of the Pycon discussions in answer to:
[Paul Moore]
I'm not trying to criticise the work done at PyCon, but I think the discussions (and not just the conclusions) need to be publicised better.
My email only covers up to (but excluding) the ".dev" and ".post" parts of the version scheme. I'll try to get to those tomorrow night.
Trent
Tarek Ziadé ziade.tarek@gmail.com writes:
On Fri, Jun 5, 2009 at 2:43 PM, Paul Moore p.f.moore@gmail.com wrote:
I'd rather the PEP said "This is how version numbers work (simple, non-controversial spec here), and this is the standard API for manipulating them", and then projects that want to conform to the standard migrate if needed.
That's exactly the goal.
If so, that goal is being missed with all the attempts to wedge in complicated semantics for pre-, post-, “devpost”, and so on. Without all that special treatment requiring long explanation, the specification becomes much more simple, and the comparisons obvious:
N.N[.X]+
where ‘N’ is the set [0-9], and ‘X’ is the set [0-9A-Za-z].
    >>> from verlib import RationalVersion as V
    >>> (V('1.0')
    ...  < V('1.0.a1')
    ...  < V('1.0.a2')
    ...  < V('1.0.a2.1')
    ...  < V('1.0.b2')
    ...  < V('1.0.c1')
    ...  < V('1.0.dev456post623')
    ...  < V('1.0.post456'))
    True
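To make the proposed form concrete without depending on verlib, here is a minimal sketch. It assumes that "N.N[.X]+" means two numeric components followed by zero or more dot-separated alphanumeric components (that reading is a guess, chosen because '1.0' appears as valid in the example), and it compares components as plain strings. The names `VERSION_RE` and `parse` are made up for illustration.

```python
import re

# Assumed reading of "N.N[.X]+": digits '.' digits, then any number of
# '.'-separated components drawn from [0-9A-Za-z].
VERSION_RE = re.compile(r"[0-9]+\.[0-9]+(?:\.[0-9A-Za-z]+)*")


def parse(version):
    """Return the version as a tuple of string components."""
    if not VERSION_RE.fullmatch(version):
        raise ValueError("not in N.N[.X]+ form: %r" % version)
    return tuple(version.split("."))


chain = [
    "1.0", "1.0.a1", "1.0.a2", "1.0.a2.1", "1.0.b2",
    "1.0.c1", "1.0.dev456post623", "1.0.post456",
]
keys = [parse(v) for v in chain]

# Tuples compare component by component, each component as plain text.
print(all(a < b for a, b in zip(keys, keys[1:])))  # True
```

No component is given a special meaning here; the ordering in the example falls out of ordinary tuple and string comparison.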
Are there many packages out there that don't follow this scheme? Of course; that's trivially true whatever scheme is picked, which is precisely the problem being addressed by choosing a simple, obvious standard.
Choose anything *but* simple alphanumeric comparison for each component, and you've expressly chosen something less obvious and more complicated to describe.
To depart from simplicity of specification and obvious semantics, there has to be something pretty big to gain — big enough to convince the mass of packages out there that it's worth switching from a home-brew version scheme to something still complicated and non-obvious. What is that? I haven't seen it demonstrated, hence this thread.
Well if the specification is difficult to understand or overcomplicated it'll fail for sure.
Not only that, but I think the level of acceptance will be inversely proportional to how non-obvious and complicated the specification is.
But so far, besides that very specific case for the post-release dev tag, I don't find it complicated at all.
Then why not drop that part and stick to simple alphanumeric comparison of each component?
Ben Finney ben+python@benfinney.id.au writes:
    >>> from verlib import RationalVersion as V
    >>> (V('1.0')
    ...  < V('1.0.a1')
    ...  < V('1.0.a2')
    ...  < V('1.0.a2.1')
    ...  < V('1.0.b2')
    ...  < V('1.0.c1')
    ...  < V('1.0.dev456post623')
    ...  < V('1.0.post456'))
    True
+1
With a caveat (or maybe this is the way it already works?) that every series of adjacent digits in the version be compared as an integer occupying a single character's worth of string, regardless of how many digits compose the integer - which, I personally think, is how strings should always be compared, and is the first thing I'd fix if I could go back in time and improve something about Unix on the PDP in 1970 so that it would have followed as the default string comparison mechanism through the rest of time. The fact that "log.7" sorts after "log.10" is simply stupid, and trying to fix the problem with leading zeros is a silly hack that breaks as soon as you get more log files than you thought you'd need.
In other words:
- Take the strings you are comparing:
    "1.0.post45pre12"
    "1.0.post9pre12"
- Consider adjacent series of digits to be integers, not strings:
    (1, '.', 0, '.post', 45, 'pre', 12)
    (1, '.', 0, '.post', 9, 'pre', 12)
- Compare the results, making (in this case) the number 9 come before the number 45, correcting the normal problem that the string "9" is lexicographically after the string "45".
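The steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of verlib; `natural_key` is a hypothetical name. It relies on `re.split` with a capturing group, which yields alternating text and digit runs, so integers are only ever compared with integers and text with text.

```python
import re


def natural_key(text):
    """Split text into alternating (text, digits, text, ...) pieces and
    turn each digit run into an int so it compares numerically."""
    parts = re.split(r"([0-9]+)", text)
    # Odd positions hold the captured digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


a, b = "1.0.post45pre12", "1.0.post9pre12"

print(a < b)                            # True: plain text puts "45" before "9"
print(natural_key(b) < natural_key(a))  # True: 9 sorts before 45
print(sorted(["log.10", "log.7"], key=natural_key))  # ['log.7', 'log.10']
```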
Brandon Craig Rhodes brandon@rhodesmill.org writes:
With a caveat (or maybe this is the way it already works?) that every series of adjacent digits in the version be compared as an integer occupying a single character's worth of string, regardless of how many digits compose the integer
Right. This is what I meant, but failed to express in my drive to simplicity. Thanks for picking up on it.
On Fri, Jun 5, 2009 at 11:19 AM, Ben Finney ben+python@benfinney.id.au wrote:
Yet the discussion around these non-obvious semantics, trying to have components interpreted as “pre-release” and “post-release” and “development release” and so on, seems to underline the fact that they're *not* something that there's any consensus on. So why are they being foisted into a standard for version strings?
There's a consensus on this in most packaging systems out there, and the goal is to have a rational version system that is understandable by most packagers so they can work with Python project versions.
For instance, most packaging systems out there will reject your project if you use "FooBar" as its version number, then "ZooBar" for its second release.
Rather than trying to force non-alphanumeric comparison semantics for alphabetic sequences, why not simply say that alphanumeric comparison semantics apply for components? That would, at a stroke, end all this turmoil and IMO futile seeking of some other consensus, when the best consensus is already what most would expect: alphanumeric comparison.
I don't think alphanumeric comparison is what most would expect.
For example, if you use dates for your version, it'll work perfectly with alphanumeric comparison but Fedora packagers will fail at sorting your versions properly.
Tarek
Tarek Ziadé ziade.tarek@gmail.com writes:
There's a consensus on this in most packaging systems out there, and the goal is to have a rational version system that is understandable by most packagers so they can work with Python project versions.
Have you actually read the version comparison specifications of other packaging systems? How do you think they compare to the current draft specification here?
Those that do specify are *much* more obvious and simple in their comparison of individual components than what is currently in the ‘distutilsversion’ specification. Usually it's a simple case of “once we've got the individual components of a version string, they compare alphanumerically”.
I don't think alphanumeric comparison is what most would expect.
I think it's far more likely to be what someone would expect than any other *specific* arbitrary ordering of words that is chosen. You make this point yourself in the current draft:
… it is much preferable if the versioning spec is such that a human can make a reasonable attempt at that sorting without having to run it against some code.
What evidence is there that some arbitrary ordering of specific sets of words is more likely to be correctly guessed by such a person than a simple “alphanumeric ordering” specification?
For example, if you use dates for your version, it'll work perfectly with alphanumeric comparison but Fedora packagers will fail at sorting your versions properly.
I don't see how this is relevant. I'm not talking about removing the need for dividing a version string into separately-comparable components. That's clearly useful, and anyway is pretty uncontroversial and widespread.
I'm arguing that, having got those components, and needing to compare them *within an individual component position* for ordering, the best candidate for “simple to describe and remember, and obvious to guess” is that the components are compared alphanumerically with no special exceptions.
On Fri, Jun 5, 2009 at 3:55 PM, Ben Finney ben+python@benfinney.id.au wrote:
Tarek Ziadé ziade.tarek@gmail.com writes:
There's a consensus on this in most packaging systems out there, and the goal is to have a rational version system that is understandable by most packagers so they can work with Python project versions.
Have you actually read the version comparison specifications of other packaging systems? How do you think they compare to the current draft specification here?
This is very upsetting ! This specification is not something I just pulled out of my hat.
We worked during two evenings during Pycon with people from Fedora and Ubuntu on that (Toshio and Matthias Klose). Those people are packaging Python projects for their systems and have problems because of the lack of proper versioning sometimes.
We were 10+ people the first night, brainstorming on that topic.
The proposal is something that everyone agreed was "good enough" there for packagers to work with.
The only complexity we (I, in fact) added later is the use case Philip brought up for the post-releases, and this is something we are currently discussing in detail on the ML.
But with your "N.N[.X]+" proposal, you are missing some things this PEP tries to address, such as dealing with alpha, beta and candidate, or development versions.
I don't see how this is relevant. I'm not talking about removing the need for dividing a version string into separately-comparable components. That's clearly useful, and anyway is pretty uncontroversial and widespread.
I'm arguing that, having got those components, and needing to compare them *within an individual component position* for ordering, the best candidate for “simple to describe and remember, and obvious to guess” is that the components are compared alphanumerically with no special exceptions.
The RationalVersion class takes your string and splits it into several components, does a bit of processing, and then compares the components alphanumerically.
But there are extra rules for some of those components (like the alpha/beta/candidate restriction).
Tarek Ziadé ziade.tarek@gmail.com writes:
On Fri, Jun 5, 2009 at 3:55 PM, Ben Finney ben+python@benfinney.id.au wrote:
Tarek Ziadé ziade.tarek@gmail.com writes:
There's a consensus on this in most packaging systems out there, and the goal is to have a rational version system that is understandable by most packagers so they can work with Python project versions.
Have you actually read the version comparison specifications of other packaging systems? How do you think they compare to the current draft specification here?
This is very upsetting ! This specification is not something I just pulled out of my hat.
I didn't intend to imply anything like that.
We worked during two evenings during Pycon with people from Fedora and Ubuntu on that (Toshio and Matthias Klose). Those people are packaging Python projects for their systems and have problems because of the lack of proper versioning sometimes.
This is the burden of conferences, and the danger of thinking that a room full of ten people can somehow determine the consensus of the whole community. Ideas that seem great when everyone is face to face still need to be justified at length to the larger community.
But with your "N.N[.X]+" proposal, you are missing some things this PEP tries to address, such as dealing with alpha, beta and candidate, or development versions.
Then please, point to (or marshal people to write convincingly) the justification for why these complicating factors are needed, and why what they buy is worth the price.
The specification and discussion here acknowledges that accommodating the weight of traditional version numbering schemes isn't a primary goal: specifying something useful, obvious, and simple to explain and remember is more important. I'm attempting to cast these special complicating exceptions in that harsh light and see why they should survive.
The RationalVersion class takes your string and splits it into several components, does a bit of processing, and then compares the components alphanumerically.
But there are extra rules for some of those components (like the alpha/beta/candidate restriction).
And it is exactly that extra baggage that is in question here.
On Fri, Jun 5, 2009 at 4:20 PM, Ben Finney ben+python@benfinney.id.au wrote:
We worked during two evenings during Pycon with people from Fedora and Ubuntu on that (Toshio and Matthias Klose). Those people are packaging Python projects for their systems and have problems because of the lack of proper versioning sometimes.
This is the burden of conferences, and the danger of thinking that a room full of ten people can somehow determine the consensus of the whole community. Ideas that seem great when everyone is face to face still need to be justified at length to the larger community.
I know I won't reach a consensus for the whole community. And there's no need at all to reach it. It's impossible anyway.
As a matter of fact, there are already version comparison systems in Distutils; you can read the PEP for their descriptions. I have extracted their docs.
I think the StrictVersion in Distutils is quite similar to what you are describing.
No one will force people to use the one we are defining, like no one forced people to use StrictVersion or LooseVersion.
But at some point, if your package is being distributed to a wide audience, you need an understandable versioning scheme, and this new one can help.
Not because it provides a better version scheme, but just because your consumers will say: "Hey, I know how to read this version number" without having to decrypt it every time, depending on the projects and the teams.
And the problem is that the one that currently exists in Distutils and Setuptools, or the one you have described, doesn't fill that requirement because they are not strict enough from an os-packager point of view (some reasons are mentioned in the PEP).
Let's state it differently: this PEP is a proposal for a *stricter* version comparison tool than what we already have, for easier understanding by packagers.
Maybe we need to write that "how to use version numbers" document first. Not to say yours is bad or mine is better, but just to write down how release cycles work and how version increments occur. Then how each proposal would fit in there. The need for development versions will eventually become obvious, for instance.
Regards Tarek
2009/6/5 Tarek Ziadé ziade.tarek@gmail.com:
No one will force people to use the one we are defining, like no one forced people to use StrictVersion or LooseVersion.
But it's being defined via a PEP (rather than hidden in the code, as with Strict/LooseVersion) so it has a higher level of visibility and authoritativeness. So it should be held to higher standards. Maybe I won't be forced to use it, but I suspect I will be *expected* to. And quite possibly disadvantaged if I don't.
Paul.
On Fri, Jun 5, 2009 at 4:59 PM, Paul Moore p.f.moore@gmail.com wrote:
2009/6/5 Tarek Ziadé ziade.tarek@gmail.com:
No one will force people to use the one we are defining, like no one forced people to use StrictVersion or LooseVersion.
But it's being defined via a PEP (rather than hidden in the code, as with Strict/LooseVersion) so it has a higher level of visibility and authoritativeness. So it should be held to higher standards. Maybe I won't be forced to use it, but I suspect I will be *expected* to. And quite possibly disadvantaged if I don't.
Probably so yes, if you use the install_requires field PEP 345 introduces, where you will define dependencies with their versions.
And if you don't use dev. flags for example, you will have to deal with them if they are present in other projects you depend on. It's a real need.
We could use the setuptools standard here, because it's the de-facto standard for this metadata today, (and that's what I have proposed first during the Pycon sessions), but some use cases were raised and the work+proposal you see in PEP 386 followed.
I see only advantages in having a strict, well-documented standard for version numbers that we know is good enough for non-Python packagers.
Let's see what is going to come out of the other threads (with Phillip and Trent on the edge case). I do believe we can have something that'll reach consensus at some point when these edge cases are resolved, because the whole thing will probably work with much simpler cases like you have shown.
At least I hope we all agree that : 2009.05.12 is not a good version number, and that we all want a major/minor scheme.
Regards Tarek
2009/6/5 Tarek Ziadé ziade.tarek@gmail.com:
At least I hope we all agree that : 2009.05.12 is not a good version number, and that we all want a major/minor scheme.
Well, Twisted and Ubuntu might both disagree :-) But it's no more of a problem than 1.7.5 - the numbers are just a bit bigger, and there are gaps.
Frankly, I prefer that to 1.2post5.dev6-r1234. Or is that 1.2dev6-post5_r1234??? :-)
Paul.
Tarek Ziadé ziade.tarek@gmail.com writes:
On Fri, Jun 5, 2009 at 4:20 PM, Ben Finney ben+python@benfinney.id.au wrote:
We worked during two evenings during Pycon with people from Fedora and Ubuntu on that (Toshio and Matthias Klose). Those people are packaging Python projects for their systems and have problems because of the lack of proper versioning sometimes.
This is the burden of conferences, and the danger of thinking that a room full of ten people can somehow determine the consensus of the whole community. Ideas that seem great when everyone is face to face still need to be justified at length to the larger community.
I know I won't reach a consensus for the whole community. And there's no need at all to reach it. It's impossible anyway.
You seem to think “consensus” means “unanimous agreement”; that's not what it means. For consensus, it's only necessary that the group as a whole acts on the belief that agreement has been reached; it doesn't need to extend to every individual member.
That doesn't make getting the consensus of the Python community *easy*; but it's certainly not impossible.
And the problem is that the [version comparison algorithm] that currently exists in Distutils and Setuptools, or the one you have described, doesn't fill that requirement because they are not strict enough from an os-packager point of view (some reasons are mentioned in the PEP)
The one I've described is exactly as strict as the one in the current draft (though I haven't formally described it). The difference is not in strictness, but simplicity.
Let's state it differently: this PEP is a proposal for a *stricter* version comparison tool than what we already have, for easier understanding by packagers.
Yes, I agree that strictness is essential to being able to apply such an algorithm automatically. That's not in dispute.
Ben Finney ben+python@benfinney.id.au writes:
Those that do specify are *much* more obvious and simple in their comparison of individual components than what is currently in the ‘distutilsversion’ specification. Usually it's a simple case of “once we've got the individual components of a version string, they compare alphanumerically”.
[…]
I'm arguing that, having got those components, and needing to compare them *within an individual component position* for ordering, the best candidate for “simple to describe and remember, and obvious to guess” is that the components are compared alphanumerically with no special exceptions.
As pointed out, I'm oversimplifying: the consensus I've seen includes the semantic that sequences of digits in any component are to be treated as integers and compared accordingly. I think that should be part of the specification (and it was so obvious in my mind that I forgot to express it :-)