[Python-Dev] PEP 407 / splitting the stdlib

Thu Jan 19 02:03:15 CET 2012

On Thu, Jan 19, 2012 at 7:31 AM, fwierzbicki at gmail.com
<fwierzbicki at gmail.com> wrote:
> On Wed, Jan 18, 2012 at 9:56 AM, Brett Cannon <brett at python.org> wrote:
>
>> Doing a release every 6 months that includes updates to the stdlib and
>> bugfixes to the language/VM also benefits other VMs by getting compatibility
>> fixes in faster. All of the other VM maintainers have told me that keeping
>> the stdlib non-CPython compliant is the biggest hurdle. This kind of switch
>> means they could release a VM that supports a release 6 months or a year
>> after a language change release (e.g. 1 to 2 releases in) so as to get
>> changes in faster and lower the need to keep their own fork.
> As one of the other VM maintainers I agree with everything Brett has
> said here. The proposal sounds very good to me from that perspective.

Yes, with the addition of the idea of a PEP 3003 style language change
moratorium for interim releases, I've been converted from an initial
opponent of the idea (since we don't want to give the wider community
whiplash) to a supporter (since some parts of the community,
especially web service developers that deploy to tightly controlled
environments, aren't well served by the standard library's inability
to keep up with externally maintained standards and recommended
development practices).

It means PEP 407 can end up serving two goals:

1. Speeding up the rate of release for the standard library, allowing
enhanced features to be made available to end users sooner.
2. Slowing down (slightly) the rate of release of changes to the core
language and builtins, providing more time for those changes to filter
out through the wider Python ecosystem.

Agreeing with those goals in principle then leaves two key questions
to be addressed:

1. How would we have to update our development practices to make such
a dual versioning scheme feasible?
2. How can we best communicate a new approach to versioning without
unduly confusing developers that have built up certain expectations
about Python's release cycle over the past 20+ years?

For the first point, I think having two active development branches
(one for stdlib updates, one for language updates) will prove to be
absolutely essential. Otherwise all language updates would have to be
landed in the 6 month window between the last stdlib release for a
given language version and the next language release, which seems to
me a crazy way to go about things. As a consequence, I think we'd be
obliged to do something to avoid conflicts on Misc/NEWS (this could be
as simple as splitting it out into NEWS and NEWS_STDLIB, but if we're
restructuring those files anyway, we may also want to do something
about the annoying conflicts between maintenance releases and
development releases).

That then leaves the question of how to best communicate such a change
to the rest of the Python community. This is more a political and
educational question than it is a technical one. A few different
approaches have already been suggested:

1. I believe the PEP currently proposes just taking the "no more than
9" limit off the minor version of the language. Feature releases would
just come out every 6 months, with every 4th release flagged as a
language release. This could even be conveyed programmatically by
offering "sys.lang_version" and "sys.lang_version_info" attributes
that define the *language* version of a given release - 3.3, 3.4, 3.5
and 3.6 would all have something like sys.lang_version == '3.3', and
then in 3.7 (the next language release) it would be updated to say
sys.lang_version == '3.7'.

This approach would require that some policies (such as the
deprecation cycle) be updated to refer to changes in the language
version (sys.lang_version) rather than changes in the stdlib version
(sys.version).

I don't like this scheme because it tries to use one number (the minor
version field) to cover two very different concepts (stdlib updates
and language updates). While technically feasible, this is
unnecessarily obscure and confusing for end users.
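
For concreteness, the mapping that scheme implies could be sketched
roughly as follows. This is only an illustration: lang_version_info()
is a hypothetical helper (no released Python has sys.lang_version or
sys.lang_version_info), and it assumes 3.3 is the first language
release and that every 4th feature release is a language release.

```python
# Hypothetical sketch of option 1: derive the *language* version from the
# feature release number. Assumes 3.3 is the first language release under
# the scheme and that every 4th feature release is a language release.
FIRST_LANG_MINOR = 3
RELEASES_PER_LANG_VERSION = 4


def lang_version_info(major, minor):
    """Return the (major, minor) language version for a feature release."""
    if major != 3 or minor < FIRST_LANG_MINOR:
        raise ValueError("scheme only sketched for 3.3 and later")
    # Number of stdlib-only releases since the last language release
    offset = (minor - FIRST_LANG_MINOR) % RELEASES_PER_LANG_VERSION
    return (major, minor - offset)


for minor in range(3, 8):
    print("3.%d -> language %d.%d" % ((minor,) + lang_version_info(3, minor)))
```

Note that the reader has to know both the starting point and the "every
4th release" rule to interpret a version number, which is the obscurity
complained about above.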

2. Brett's alternative proposal is that we switch to using the major
version for language releases and the minor version for stdlib
releases. We would then release 3.3, 3.4, 3.5 and 3.6 at 6 month
intervals, with 4.0 then being released in August 2014 as a new
language version.

Without taking recent history into account, I actually like this scheme
- it fits well with traditional usage of major.minor.micro version
numbering. However, I'm not confident that the "python" name will
refer to Python 3 on a majority of systems by 2014 and accessing
Python 4.0 through the "python3" name would just be odd.

It also means we lose our ability to signal to the community when we
plan to make a backwards incompatible language release (making the
assumption that we're never going to want to do that again would be
incredibly naive). On a related note, we'd also be setting ourselves
up to have to explain to everyone that "no, no, Python 3 -> 4 is like
upgrading from Python 3.2 -> 3.3, not 2.7 -> 3.2". I expect the
disruptions of the Python 3 transition will still be fresh enough in
everyone's minds at that point that we really shouldn't go there if we
don't have to.

3. Finally, we get to my proposal: that we just leave sys.version and
sys.version_info alone. They will still refer to Python language
versions, the micro release will be incremented every 6 months or so,
the minor release once every couple of years to indicate a language
update and the major release every decade or so (if absolutely
necessary) to indicate the introduction of backwards
incompatibilities.

All current intuitions and expectations regarding the meaning of
sys.version and sys.version_info remain completely intact.

However, we would still need *something* to indicate that the stdlib
has changed in the interim releases. This should be a monotonically
increasing value, but should also be clearly distinct from the
language version. Hence my proposal of a date based sys.stdlib_version
and sys.stdlib_version_info.

That way, nobody has to *unlearn* anything about current Python
development practices and policies. Instead, all people have to do is
*learn* that we now effectively have two release streams: a date-based
release stream that comes out every 6 months (described by
sys.stdlib_version) and an explicitly numbered release stream
(described by sys.version) that comes out every 24 months.
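
As a rough sketch of how library code might consume the second stream,
assuming sys.stdlib_version_info were a tuple like (12, 8) (that
attribute is only proposed here and does not exist in any released
Python, and stdlib_at_least() is a made-up helper name):

```python
import sys


def stdlib_at_least(year, month, info=None):
    """Check a (hypothetical) date-based stdlib version against a minimum.

    `info` defaults to sys.stdlib_version_info, which is only a proposal;
    on interpreters without it this returns False.
    """
    if info is None:
        info = getattr(sys, "stdlib_version_info", None)
    if info is None:
        return False  # no date-based stdlib version available
    # Tuple comparison keeps the ordering monotonic: (13, 2) > (12, 8)
    return tuple(info[:2]) >= (year, month)


# The language check is unchanged; only the stdlib check is new.
if sys.version_info >= (3, 3) and stdlib_at_least(13, 2):
    print("language 3.3+ with stdlib 13.02 or later")
```

The existing sys.version_info check keeps its current meaning, so code
that only cares about the language version needs no changes.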

So in August this year, we would release 3.3+12.08, followed by
3.3+13.02, 3.3+13.08, 3.3+14.02 at 6 month intervals, and then the
next language release as 3.4+14.08. If someone refers to just Python
3.3, then the "at least stdlib 12.08" is implied. If they refer to
Python stdlib 12.08, 13.02, 13.08 or 14.02, then it is the dependency
on "Python 3.3" that is implied.
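
The release sequence above can be generated mechanically. A minimal
sketch, where release_labels() and its parameters are illustrative
names only, assuming a 6 month cadence starting in August 2012 and a
language release on every 4th release:

```python
def release_labels(count, lang=(3, 3), start=(12, 8), per_lang=4):
    """Return `count` labels of the form 'X.Y+YY.MM' at 6 month intervals."""
    labels = []
    major, minor = lang
    year, month = start
    for i in range(count):
        # Every `per_lang`-th release after the first bumps the language version
        if i and i % per_lang == 0:
            minor += 1
        labels.append("%d.%d+%02d.%02d" % (major, minor, year, month))
        # Advance the date-based stdlib version by 6 months
        month += 6
        if month > 12:
            month -= 12
            year += 1
    return labels


print(", ".join(release_labels(5)))
```

The two counters advance independently, which is the whole point of
keeping the two numbers separate.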

Two different rates of release -> two different version numbers. Makes
sense to me.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia