[Twisted-Python] resolving release management conflict

Hello Twisted community! Recently we've had some emotional language around the release process. On https://github.com/twisted/twisted/pull/1464 <https://github.com/twisted/twisted/pull/1464>, Craig said several times that he found Adi's work and process changes "very aggressive". Obviously, to have suggested this, Craig would have found some things Adi said to be emotionally loaded as well. I think we need to resolve this conflict in order to move forward productively and avoid wasting more volunteer energy on negative feelings. In order to do that, I think we need to centralize the discussion here. At this point I think we need to all begin with a presumption of good faith. Adi, Thomas Grainger, Tom Most, Maarten, Povilas and Craig are all trying to address big problems in Twisted. Nobody's trying to be aggressive, even if it's coming across that way. (Personally, I should acknowledge that I don't see Adi's actions as aggressive, but I think we have to address Craig's perception that they are. Part of the presumption of good faith is presuming that everyone has valid reasons for their reactions.) Here's the status as I understand it: Adi, Thomas, and Maarten are trying to address release automation issues to make future releases go faster. They've merged several changes, including https://github.com/twisted/twisted/pull/1464 <https://github.com/twisted/twisted/pull/1464>, to that end. Craig is trying to get out a release according to the current process, which means addressing https://twistedmatrix.com/trac/ticket/10069 <https://twistedmatrix.com/trac/ticket/10069> as it is a release blocker. He has a fix for this, but that fix is blocked. While we've been waiting on #2, the changes in #1 proceed apace. Craig's also noted several times that I've given permission for him to do the release and so he feels that breaking things mid-way is reneging on a promise from the project. I believe the proximate cause of the conflict here is that we're dealing with this regression incorrectly. When a regression is introduced, there's a process for dealing with it, originally documented here: https://twistedmatrix.com/trac/wiki/ReviewProcess#Revertingachange <https://twistedmatrix.com/trac/wiki/ReviewProcess#Revertingachange>. [1] The principle we need to outline clearly is this: if we have a regression, the change that introduced it should be reverted as soon as possible. If we can't revert it directly because of changes to trunk in the meanwhile that introduce conflicts, ideally the change that we would make would be the closest possible thing to reverting it, even if that reintroduces a known bug that we have to reopen. In the case where a lot of work has happened in the interim, and it's less work to just "fix forward" on top of trunk, it's fine to do a small fix as an end run around this process. The goal of this process is to minimize the work required to restore trunk to a releasable state. Right now we have three competing fixes which have all been flagged as incorrect in various ways: from Tom: https://github.com/twisted/twisted/pull/1499 <https://github.com/twisted/twisted/pull/1499> Craig rejected this one by saying it will break "certain versions" of Python, but I don't understand why it was closed rather than reviewed / reassigned as normal. (Or for that matter which versions are the problems, or how it breaks them.) from Adi: https://github.com/twisted/twisted/pull/1501 <https://github.com/twisted/twisted/pull/1501> Craig rejected this one by saying "please stop working on this" but it also doesn't have any technical reason that this solution is incorrect. from Craig: https://github.com/twisted/twisted/pull/1502 <https://github.com/twisted/twisted/pull/1502> This was originally given a passing review by Povilas from the Buildbot project, but twm added an additional blocking review shortly thereafter explaining that it introduced an additional regression. It seems as though this one is now actively being worked on? Craig: you closed two of these issues on the basis that you were "already working on a fix", but that's not a valid reason to close a PR. Or, for that matter, to give it a failing review. In the future, please comment on PRs that you don't want merged by pointing out their technical insufficiency, so that we can take the opportunity to learn for the benefit of future maintenance. In this case, the underlying issue is clearly quite nuanced and requires expertise that touches platform differences in Python's core APIs, so having multiple people working on different fixes is totally fine; this effort is not wasted if everybody learns a little bit about the code in question, and it's definitely not wasted if we have to synthesize our approaches in the end anyway because they address different cases. If you are pressed for time and don't have the ability to enumerate a large number of subtle problems, it's fine to post a changes-required review saying something high-level, such as "this does not address a number of issues that I've taken into account in my fix in #NNNN; please see the code on that PR for more detail, I can answer specific questions later". All of these fixes have each been outstanding for weeks, which strongly suggests that this is way too hard to fix forward and we should have stopped trying a while ago. We should fall back to the previous one if in any way possible, then reintroduce the changes which caused it. If we're really just about to land 1502, then feel free to complete the work there, but in the future we should regard this kind of delay in reverting as a process failure. If there are too many conflicts to address in a revert, and reverting means we need to revert 3 or 4 tickets which have built on top of that work, then let's just do that and get those re-merged later as well. It's unfortunate to ship without features but it's better than multi-month release hold-ups. (Note: a straight revert doesn't require a full code review. If you can revert a merge and CI is happy with it, feel free to get an administrator to smash the red button to merge immediately.) I think a secondary problem is that we have not traditionally been very clear about the role of "release manager". My intent for the role was originally that it last 1-2 weeks at most, and the responsibility is supposed to begin just a bit before the first prerelease is uploaded, and end when the final release is completed and the release announcement is written. Given that it's the release manager's responsibility to ensure no regressions are present in the release, this can come along with a little bit of soft power where we should all promptly honor the current RM's request to revert things or to prioritize review of fixes for release blockers which might not be straightforward code regressions and therefore can't be fixed by a revert (for example, 3rd party services like readthedocs or github actions breaking our integrations or CI due to no change of our own). However the RM has no special authority to circumvent processes, land code, prevent or code from landing, and should ideally never be making changes to the process in-line with the manual process of making the release. The release manager really isn't supposed to do a bunch of development on trunk. This has been confounded in the past by the fact that in the past Amber has been both a prolific contributor and the release manager, often at the same time; and, as a way to make the manual process easier, she often would do a bunch of automation work right before a relase. But the role, inasmuch as it exists, ought to be about executing whatever manual steps remain after we've automated as much as we can. Similarly, Amber's faithful execution of this role for years was a great benefit to everyone working on the project, but it also obscured that the RM is supposed to be a hat that one person wears for one to two weeks. My personal feeling (it certainly would overstate this to call it a "process") is that if someone takes on the release manager role and can't get a prerelease out within a week, then they should step down and we should have a bigger community discussion about why we're stuck. My follow-up questions are: Can we revert the change that introduced 10069 so that we can follow the process to re-introduce it after a release has been made, and have the conversation about which PR to accept and why post-release? Do we need to revert the release process changes from https://github.com/twisted/twisted/pull/1464 <https://github.com/twisted/twisted/pull/1464> for this release? If so, what is the specific technical need this addresses? Is there some problem with the release process changes that ought to make this revert more permanent? They look really good to me, but I don't want to discount that possibility either. A reviewer can always request that code be pulled back out of trunk if there's a problem that was not forseen during the review process? -g [1]: I'm not sure if this has been propagated forward to our newer process docs in git; it would be good to delete most of this doc and replace it with a bunch of links so we don't have duplicate and outdated docs.

On Mon, Feb 1, 2021 at 1:00 PM Glyph <glyph@twistedmatrix.com> wrote:
Yes, that is correct. Over the summer, while investigating failures on Azure CI caused by to non-ASCII characters in a submitter's GitHub user ID. I spent a lot of time testing in a Windows environment on Python 3.5, 3.6, 3.7, and 3.8. I found that there were subtle differences in the default encoding on the console between each Python version, and that affected the logic in _checkProcessArgs(). The fix that I did over the summer addressed the problems and fixed all the CI issues. There were two corner cases that were uncovered. One by me, and one by Tom Most. CORNER CASE 1: In Buildbot, they patch sys.stdout with io.StringIO in their tests which affects their tests which call the Twisted Reactor. io.StringIO is not a 100% proper replacement for sys.stdout which is of type _io.TextIOWrapper on Python 3+. However, Buildbot's tests have been around for a long time, so I decided CORNER CASE 2: In a Windows GUI Application, sys.stdout is None. So you cannot get the encoding from sys.stdout.encoding In https://github.com/twisted/twisted/pull/1502, I have a very small patch which addresses those two corner cases. I need to write a good test for it, but I think it is good enough to go into the release. Post-release: (1) I would like to drop Python 3.5 (2) Look at removing the _checkProcessArgs() function. The complexity of bytes/unicode logic in there was more useful in the Python 2 days when dealing with the differences between how Python on Windows handled args/env vs. how Python on POSIX handled args/env. I think this function can completely go in a Python 3.6+ world, and will greatly simplify the code. However, this needs to be carefully tested and reviewed.
but I did a fix for this, as described above.
OK, thanks for the feedback, and sorry for any process violations or hurt feelings from my actions.
No, the fixes have been outstanding for weeks mainly due to the fact that certain personal circumstances came up in my life, such that I had to refocus my time away from Twisted for a few months, compared to the time I had over the summer of 2020. I now have more time, and am committed to getting this release out.
I would rather not do this. I would rather proceed with https://github.com/twisted/twisted/pull/1502 , which is minimal, and addresses the corner cases. Going through the whole revert, resubmit, retest is going to add more delay to the release.
I do not agree with the changes in PR 1464. I think a better solution would have been to have a separate release.yml workflow, which could be triggered. By clicking the "Create a New Release" link on GitHub. I had a working prototype here: https://github.com/twisted/twisted/pull/1423 , and I successfully pushed some releases to https://test.pypi.org However, even though I disagree with the approach in PR 1464, since multiple people have decided that this is what they want, I'll just go with this. -- Craig

On Mon, Feb 1, 2021 at 1:00 PM Glyph <glyph@twistedmatrix.com> wrote:
Yes, that is correct. Over the summer, while investigating failures on Azure CI caused by to non-ASCII characters in a submitter's GitHub user ID. I spent a lot of time testing in a Windows environment on Python 3.5, 3.6, 3.7, and 3.8. I found that there were subtle differences in the default encoding on the console between each Python version, and that affected the logic in _checkProcessArgs(). The fix that I did over the summer addressed the problems and fixed all the CI issues. There were two corner cases that were uncovered. One by me, and one by Tom Most. CORNER CASE 1: In Buildbot, they patch sys.stdout with io.StringIO in their tests which affects their tests which call the Twisted Reactor. io.StringIO is not a 100% proper replacement for sys.stdout which is of type _io.TextIOWrapper on Python 3+. However, Buildbot's tests have been around for a long time, so I decided CORNER CASE 2: In a Windows GUI Application, sys.stdout is None. So you cannot get the encoding from sys.stdout.encoding In https://github.com/twisted/twisted/pull/1502, I have a very small patch which addresses those two corner cases. I need to write a good test for it, but I think it is good enough to go into the release. Post-release: (1) I would like to drop Python 3.5 (2) Look at removing the _checkProcessArgs() function. The complexity of bytes/unicode logic in there was more useful in the Python 2 days when dealing with the differences between how Python on Windows handled args/env vs. how Python on POSIX handled args/env. I think this function can completely go in a Python 3.6+ world, and will greatly simplify the code. However, this needs to be carefully tested and reviewed.
but I did a fix for this, as described above.
OK, thanks for the feedback, and sorry for any process violations or hurt feelings from my actions.
No, the fixes have been outstanding for weeks mainly due to the fact that certain personal circumstances came up in my life, such that I had to refocus my time away from Twisted for a few months, compared to the time I had over the summer of 2020. I now have more time, and am committed to getting this release out.
I would rather not do this. I would rather proceed with https://github.com/twisted/twisted/pull/1502 , which is minimal, and addresses the corner cases. Going through the whole revert, resubmit, retest is going to add more delay to the release.
I do not agree with the changes in PR 1464. I think a better solution would have been to have a separate release.yml workflow, which could be triggered. By clicking the "Create a New Release" link on GitHub. I had a working prototype here: https://github.com/twisted/twisted/pull/1423 , and I successfully pushed some releases to https://test.pypi.org However, even though I disagree with the approach in PR 1464, since multiple people have decided that this is what they want, I'll just go with this. -- Craig
participants (2)
-
Craig Rodrigues
-
Glyph