The Night’s Watch is Fixing the CIs in the Darkness for You
Hi,
tl; dr Maybe the 1090 pending pull requests will have to be rebased on master to be able to be merged, unless we enhance our CI to always test PRs after rebasing them on the master branch. I'm not sure at this point.
--
FYI the Night’s Watch is still fixing the CIs in the darkness for you :-) In short, most CIs should now be green in all branches (especially CIs running on pull requests). I'm only aware of a Refleak issue on 3.x (test_threading, more about that above).
Last days were more busy than usual. Here are issues in a random order.
I had to update the OpenSSL version in the CI configuration which lives in the same Git repository than Python code base. Since our CI currently runs on unmodified PRs, maybe the 1090 pending pull requests must now be rebased manually on the master branch to get these CI configuration updates. Otherwise, the CI would fail which prevents to merge a PR. I'm not sure at this point.
At least, old PRs with a green CI can be merged. But problems are likely to arise if a reviewer requires changes and the CI is re-run on the update PR. The PR may pick the old CI configuration (old OpenSSL version) and fail badly...
Having to rebase 1090 PRs would be very unfortunate... Maybe one solution would be to redesign CIs running on PRs to always rebase them on the up-to-date master branch. There are services like https://mergify.io/ which implement such workflow. Currently, we test "outdated" master: there is a risk that tests fail once the change is rebased on top of the up-to-date master, whereas tests passed on the old master...
Note: fixing multitestssl.py to update the URL of old OpenSSL tarballs doesn't work: again, old PRs don't get magically fixes pushed to master.
--
concurrent.futures no longer uses daemon threads. It causes a reference leak in test_asyncio. Kyle Stanley managed to fix it in test_asyncio: https://bugs.python.org/issue40115
--
Adding a module state to the _abc module triggered a reference leak in test_threading: https://bugs.python.org/issue40149
This leak is very unlikely to be a new bug. It's just that previously it was hidden by the fact that all interpreters used the same module instance. regrtest -R 3:3 ignores the 18550 references that Python leaks at exit: it only checks if the total references number changes before/after a test run. When we *fix* modules to get isolated module instances, we start to see old leaks as "new" leaks.
I have a pending workaround, but I'm first trying to investigate why two garbage collections are needed to fix the leak...
--
The URL of the OpenSSL 1.1.1d moved which broke GitHub actions jobs, Azure Pipelines and Travis CI. Not all CIs failed at once because of different caches with different lifetime.
- https://bugs.python.org/issue40146 Azure
- https://bugs.python.org/issue40162 Travis CI
- https://bugs.python.org/issue40125 multissltests
I also proposed to modify multissltests to attempt to get the tarball from a different URL. Christian Heimes proposed a fix: https://bugs.python.org/issue40163
--
I enhanced tests to detect when a child process does crash. Previously, only FreeBSD *indirectly* detected such crash by the creation of coredump files.
test_builtin started to fail on Solaris because the main test_builtin process was killed by SIGHUP. I already fixed a similar bug in test_pty last December by registering a signal handle to ignore SIGHUP. I copied the fix to test_builtin: https://bugs.python.org/issue40140
The commit also fix an AIX issue in the same test.
--
After Python 3.9.0 alpha 5 release, I fixed a reference leak in test_importlib which was introduced by the recent work on converting C extension modules to multiphase initialization (PEP 489): https://bugs.python.org/issue40050
It's a subtle issue between importlib and _weakref. See the issue for the technical analysis.
I also fixed os.getgrouplist(): https://bugs.python.org/issue40014
The macOS job on GitHub PRs failed because of this issue. In fact, the bug existed for a long time, but macOS 10.15 (Catalina) added more system groups to users which triggered the bug.
os.getgrouplist() only failed if the user has more than ... 17 groups :-) Funny issue.
--
To be able to backport security fixes to Pyhon 3.5 and Python 3.6, I had to fix the "Codecov patch" job which failed: https://bugs.python.org/issue40156
I copied the Codecov configuration from master.
Sadly, all 3.5 and 3.6 PRs have to be rebased to retrieve these fixes, otherwise, they cannot be merged.
--
To finish, you can find general info about Python CIs at: https://pythondev.readthedocs.io/ci.html
Victor
Night gathers, and now my watch begins. It shall not end until my death.
On Fri, Apr 3, 2020 at 11:32 AM Victor Stinner <vstinner@python.org> wrote:
I had to update the OpenSSL version in the CI configuration which lives in the same Git repository than Python code base. Since our CI currently runs on unmodified PRs, maybe the 1090 pending pull requests must now be rebased manually on the master branch to get these CI configuration updates. Otherwise, the CI would fail which prevents to merge a PR. I'm not sure at this point.
In general CI systems run on merge(PR-head, target-branch-head), rather than directly on PR-head. So this may not be an issue at all.
-n
-- Nathaniel J. Smith -- https://vorpus.org
Le 03/04/2020 à 20:48, Nathaniel Smith a écrit :
On Fri, Apr 3, 2020 at 11:32 AM Victor Stinner <vstinner@python.org> wrote:
I had to update the OpenSSL version in the CI configuration which lives in the same Git repository than Python code base. Since our CI currently runs on unmodified PRs, maybe the 1090 pending pull requests must now be rebased manually on the master branch to get these CI configuration updates. Otherwise, the CI would fail which prevents to merge a PR. I'm not sure at this point.
In general CI systems run on merge(PR-head, target-branch-head), rather than directly on PR-head. So this may not be an issue at all.
That was my impression as well.
Regards
Antoine.
On 03Apr2020 1948, Nathaniel Smith wrote:
On Fri, Apr 3, 2020 at 11:32 AM Victor Stinner <vstinner@python.org> wrote:
I had to update the OpenSSL version in the CI configuration which lives in the same Git repository than Python code base. Since our CI currently runs on unmodified PRs, maybe the 1090 pending pull requests must now be rebased manually on the master branch to get these CI configuration updates. Otherwise, the CI would fail which prevents to merge a PR. I'm not sure at this point.
In general CI systems run on merge(PR-head, target-branch-head), rather than directly on PR-head. So this may not be an issue at all.
The CI configuration is loaded from the PR head, at least on GitHub Actions (where you can validate config changes in PR) and Azure Pipelines. I just validated this by re-running an old PR and watched it choose OpenSSL 1.1.1d instead of 1.1.1f (and then fail to check out a reference - see the logs at https://github.com/python/cpython/pull/18909/checks?check_run_id=564079685)
So I think it's an issue, and probably does require PRs to merge in master to get back in sync. But probably the best we can do is post a message in older PRs warning that they need to do it. Maybe we close them at the same time?
We could also move some of these CI configuration variables out of the CI files and into separate scripts - many already are. This is annoying for anyone who forks and runs their own build, but it should mean that updates like the OpenSSL version get merged in during CI even for older PRs.
Cheers, Steve
Le lun. 6 avr. 2020 à 13:31, Steve Dower <steve.dower@python.org> a écrit :
The CI configuration is loaded from the PR head, at least on GitHub Actions (where you can validate config changes in PR) and Azure Pipelines. I just validated this by re-running an old PR and watched it choose OpenSSL 1.1.1d instead of 1.1.1f (and then fail to check out a reference - see the logs at https://github.com/python/cpython/pull/18909/checks?check_run_id=564079685)
So I think it's an issue, and probably does require PRs to merge in master to get back in sync. But probably the best we can do is post a message in older PRs warning that they need to do it. Maybe we close them at the same time?
That's bad :-(
We could also move some of these CI configuration variables out of the CI files and into separate scripts - many already are. This is annoying for anyone who forks and runs their own build, but it should mean that updates like the OpenSSL version get merged in during CI even for older PRs.
That sounds like a sane plan.
The OpenSSL version is just one example of our CI flaw. Yet another example: new Sphinx breaks Python 3.8 CI. https://bugs.python.org/issue40204
The Sphinx version is not pinned in 3.8 (it is in master).
Victor
Night gathers, and now my watch begins. It shall not end until my death.
On 06Apr2020 1750, Victor Stinner wrote:
Le lun. 6 avr. 2020 à 13:31, Steve Dower <steve.dower@python.org> a écrit :
The CI configuration is loaded from the PR head, at least on GitHub Actions (where you can validate config changes in PR) and Azure Pipelines. I just validated this by re-running an old PR and watched it choose OpenSSL 1.1.1d instead of 1.1.1f (and then fail to check out a reference - see the logs at https://github.com/python/cpython/pull/18909/checks?check_run_id=564079685)
So I think it's an issue, and probably does require PRs to merge in master to get back in sync. But probably the best we can do is post a message in older PRs warning that they need to do it. Maybe we close them at the same time?
That's bad :-(
So is leaving them there when we know they can't be merged :(
We could also move some of these CI configuration variables out of the CI files and into separate scripts - many already are. This is annoying for anyone who forks and runs their own build, but it should mean that updates like the OpenSSL version get merged in during CI even for older PRs.
That sounds like a sane plan.
Provided there are options/environment variables available to override these without having to modify source files, I agree that it's sane.
Though we then need an explicit policy in our CI configurations to not put information in those - I've already noticed that people contribute changes to them that don't follow the existing conventions, so we'll have to be more explicit about it.
Cheers, Steve
On 4/3/2020 2:32 PM, Victor Stinner wrote:
Hi,
tl; dr Maybe the 1090 pending pull requests will have to be rebased on master to be able to be merged, unless we enhance our CI to always test PRs after rebasing them on the master branch.
I hope you mean doing an update merge rather than a rebase.
I am otherwise not sure what you are saying. Sometimes when I request an up-to-date webpage for a PR, it has a notice that there is a merge conflict and that the PR cannot be merged even though all tests, when last run, were green. The tests do not have to be rerun for this to happen.
The latest occurrence was for a PR that modified a blurb file that was deleted after the blurb was moved to the master log for 3.9.0a5. (This was not nice to fix, so I will avoid in the future by not commit such edits until otherwise reade to merge.) So my impression is that github tries to merge the PR into master on every page request.
I very much appreciate the work being done to prevent spurious CI failures.
participants (5)
-
Antoine Pitrou
-
Nathaniel Smith
-
Steve Dower
-
Terry Reedy
-
Victor Stinner