
On 02:17 am, benjamin@python.org wrote:
Backwards compatibility seems to be an issue that arises a lot here. I think we all have an idea of what it is, but we need some hard place to point to. So here's my attempt:
PEP: 387
Title: Backwards Compatibility Policy
Thanks for getting started on this. This is indeed a bikeshed, and there would be less discussion around it if there were a clearer PEP to point to. However, this draft PEP points to the problem rather than the solution. To be really useful, it needs to answer a whole slew of very specific questions, because the real problem is that everybody thinks they know what these words mean, but in fact we all have slightly different definitions of them.
I've taken a stab at this myself in the past while defining a policy for Twisted, which is here: <http://twistedmatrix.com/trac/wiki/CompatibilityPolicy>. I think we might be a bit stricter than Python, but I believe our attempt to outline these terms specifically is worth a look. The questions that follow might seem satirical or parodic, but I am completely serious: each of these terms really does have a variable definition.
* The behavior of an API *must* not change between any two consecutive releases.
What is "behavior"? Software is composed of behavior. If no behavior changes, then nothing can ever happen.
What is an "API"? Are you talking about a public name in a Python module? A name covered in documentation? A name covered by a specific piece of documentation? Is "subclassing" considered an API, or just invoking? What about isinstance()? What about repr()?
Are string formats allowed to change? For example, certain warnings have caused the Twisted test suite to fail, because with the introduction of warnings at the C level it becomes progressively harder for a Python process to control its own stderr stream. I can't remember the specifics right now, but on more than one occasion a Python upgrade caused our test suite to fail because the expectation was that a process would be able to terminate successfully without putting anything on any FD but stdout. In order for any changes to be possible, there needs to be a clearly labeled minefield: don't touch or depend on these things in your Python application or it *will* break every time Python is released.
What are "consecutive" releases? If we have releases 1, 2, 3, 4, and behavior can't change 1->2, 2->3, or 3->4, then no change may ever be made.
What does "must not" mean here? My preference would be "once there is agreement that an incompatible change has occurred, there should be an immediate revert with no discussion". Unless, of course, the change has already been released.
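To make the stderr point concrete, here is a minimal sketch of how a newly added warning changes what a process writes, even though the "API" still returns the same value. The child program and its api() function are hypothetical, made up for illustration; this shows a Python-level warning only, not the C-level cases mentioned above.

```python
import subprocess
import sys

# Hypothetical child program: api() still returns "ok", but after an
# "upgrade" it also emits a DeprecationWarning.
CHILD = '''
import warnings
warnings.simplefilter("always")  # force the warning to be shown

def api():
    warnings.warn("api() is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"

print(api())
'''


def run_child(source):
    """Run source in a fresh interpreter and return (stdout, stderr) as text."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return proc.stdout, proc.stderr


out, err = run_child(CHILD)
print("stdout:", repr(out))
print("stderr is empty:", err == "")
```

A test that asserts the child wrote nothing to stderr passes before the warning is introduced and fails afterwards, without any change to the return value of api().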
* A feature cannot be removed without notice between any two consecutive releases.
Presumably removal of a feature is a change in behavior, so isn't this redundant with the previous point?
* Addition of a feature which breaks 3rd party libraries or applications should have a large benefit-to-breakage ratio, and/or the incompatibility should be trivial to fix in broken code.
How do you propose to measure the benefit to breakage ratio? How do you propose to define "trivial" to fix? Most projects, including Python, Twisted, Django, and Zope, have an ever-increasing bug count, which means that trivial issues fall off the radar pretty easily. One of the big problems here in detecting and fixing "trivial" changes is that they can occur in test code as well. So if the answer is "run your tests", that's not much help if you can't call the actual APIs whose behavior has changed. The standard library would need to start providing a lot more verified test stubs for things like I/O in order for this to be feasible.
Making Incompatible Changes
===========================
It's a fact: design mistakes happen. Thus it is important to be able to change APIs or remove misguided features. This is accomplished through a gradual process over several releases:
1. Discuss the change. Depending on the size of the incompatibility, this could be on the bug tracker, python-dev, python-list, or the appropriate SIG. A PEP or similar document may be written. Hopefully users of the affected API will pipe up to comment.
There are a very large number of users of Python, a large percentage of whom do not read python-dev. A posting on python-list is likely to provoke an unproductive pile-on. I suggest a dedicated list, ideally low-traffic, called "python-incompatible-announce" or something like that, so that *everyone* who maintains Python software can subscribe to it, even if they're not otherwise interested in Python development.
2. Add a warning [#warnings_]_. If behavior is changing, the API may gain a new function or method to perform the new behavior; old usage should raise the warning. If an API is being removed, simply warn whenever it is entered. DeprecationWarning is the usual warning category to use, but PendingDeprecationWarning may be used in special cases where the old and new versions of the API will coexist for many releases.
Why is PendingDeprecationWarning the special case? As outlined in the Twisted compatibility proposal, I'd prefer that every warning go through multiple phases (pending, deprecated, gone), so that careful developers could release new versions of their software to quash noisy warnings *before* the new version of Python that actually made them user-visible was released.
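As a rough sketch of what I mean by phases, assuming one phase per release: the staged_deprecation decorator, the phase names, and old_api()/new_api() below are all hypothetical names for illustration, not anything in the standard library or in Twisted. Only the warnings module calls and the two warning categories are real.

```python
import functools
import warnings

# Hypothetical phases, one per release: pending -> deprecated -> gone.
PHASES = {
    "pending": (PendingDeprecationWarning, "will be deprecated"),
    "deprecated": (DeprecationWarning, "is deprecated"),
}


def staged_deprecation(phase, replacement):
    """Wrap a function so that it warns (or fails) according to its phase."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if phase == "gone":
                # Final phase: the old API no longer works at all.
                raise AttributeError(
                    f"{func.__name__}() was removed; use {replacement}"
                )
            category, wording = PHASES[phase]
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(
                f"{func.__name__}() {wording}; use {replacement}",
                category,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def new_api():
    return "result"


@staged_deprecation("pending", "new_api()")
def old_api():
    return new_api()
```

A careful developer could then run their test suite with something like `python -W error::PendingDeprecationWarning` during the first phase and fix callers before the louder warning ever ships.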
3. Wait for a release.
How does this affect the parallel 2.x/3.x release cycle?
4. See if there's any feedback. Users not involved in the original discussions may comment now after seeing the warning. Perhaps reconsider.
At this point, many developers may already have been porting over to the new, non-deprecated API. I realize that there may be no way around that, but it's worth considering.
Thanks for tackling this thorny problem. Good luck!