[python-committers] Wrongly stopping merges discourages merging.

Victor Stinner vstinner at redhat.com
Sun Jun 3 17:46:30 EDT 2018


2018-06-03 22:23 GMT+02:00 Terry Reedy <tjreedy at udel.edu>:
> Exhibit 1. For at least a couple of weeks in May, faults in the asyncio test
> (and another) caused the asyncio test to randomly fail about half the time.
> With one retest, each CI bot failed about 1/4 the time.  At least one bot of
> the two bots failed about 1/2 the time.  The AppVeyor queue ballooned.

I only told a few core developers: in February, I burned out and
stopped everything related to Python for 3 months. I only slowly
resumed contributing to Python in May. I moved to a new team inside
Red Hat, and my job is now to maintain Python for Red Hat.

When I saw that Ned Deily had trouble getting a release out two weeks
ago, I looked at the status of the CI. As I expected, the CIs had
become very unstable again. A CI is like a puppy: if nobody takes care
of it, it slowly dies :-( I'm sure that many core devs fixed dozens of
bugs, but the annoying part is the tests which only fail randomly.
They are hard to spot and hard to debug. If you have a single flaky
test, it's fine. When you have two or three of them, the failure rate
slowly becomes larger than 25% and then 50%...
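
To put rough numbers on this, here is a minimal sketch, assuming the
flaky tests fail independently of each other. The function name and
the 10% per-test rate are made up for illustration; they are not
measurements from our CIs.

```python
def failure_rate(rates, retries=0):
    """Chance that a run fails even after `retries` full re-runs.

    `rates` holds the failure probability of each independent flaky
    component (a test, or a whole CI bot). Independence is an
    assumption of this sketch, not something measured.
    """
    pass_once = 1.0
    for rate in rates:
        pass_once *= 1.0 - rate      # every component must pass
    fail_once = 1.0 - pass_once      # at least one component failed
    # The run only counts as failed if every attempt fails.
    return fail_once ** (retries + 1)


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, "flaky test(s):", failure_rate([0.10] * count))
```

With three tests that each fail 10% of the time, failure_rate([0.10] * 3)
is already above 25%. The same function reproduces Terry's figures:
failure_rate([0.5], retries=1) gives 1/4 per bot, and
failure_rate([0.25, 0.25]) gives a bit under 1/2 for "at least one of
the two bots failed".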


> One could decrease the frustration and time to success (but only partly)  by
> only re-starting the bot that failed.  Doing so for Travis is fairly easy.
> Doing so with AppVeyor is obscure and error prone.

Ah? From a GitHub PR, I click on the failed AppVeyor job (click on
"details"): at the top, there is a "[Re-build PR]" button.

I logged into AppVeyor one or two weeks ago; thanks to a cookie,
AppVeyor now remembers me :-)

It's just two clicks once you are logged in, no?

Are you confused by the [New build] button? I never used that one :-)


> I twice requested that the randomly failing tests be disabled.  Victor said
> he wanted to keep monitoring what they did.  I think he overly discounted
> the pain and frustration of having good merges blocked.  I think either 1)
> bad tests should be disabled, or 2) the CI code should be able to ignore
> failures by bad tests, or 3) responsible core devs should be able to.

My rationale is that once a test is disabled, everybody quickly
forgets about it and the test stays disabled for the next 5 years.

Just one example: I skipped test_ftplib.test_check_hostname() 5 months
ago... Ok, who looked at this *failing* test?

There is an open issue, right:
https://bugs.python.org/issue32706

But who cares about this issue?
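
For readers who have not seen one, a skip of this kind can be sketched
as below. This is an illustration, not the actual test_ftplib code:
the class name and the test body are hypothetical, only the test name
and the issue URL come from above.

```python
import unittest


class FtplibTLSTests(unittest.TestCase):  # hypothetical class name
    # The skip reason points at the tracker issue, so the skip stays
    # discoverable. Nothing forces anyone to revisit it, though.
    @unittest.skip("flaky, see https://bugs.python.org/issue32706")
    def test_check_hostname(self):
        self.fail("never runs while the skip decorator is present")


def run_and_count_skips():
    """Run the test case and return (number skipped, overall success)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FtplibTLSTests)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped), result.wasSuccessful()
```

The run is reported as successful even though the test body never
executes, which is exactly why a skipped test is so easy to forget.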

Ok, let's come back to asyncio: one asyncio test started to fail,
right. Yury Selivanov, Andrew Svetlov and I spent a lot of time
looking at these tests. The sendfile tests were unstable and have been
fixed. But the SSL test was weird. In fact, it wasn't a bug in the
test, but in asyncio itself!

https://bugs.python.org/issue33674

So the test helped us to spot a very tricky race condition. I prefer
that we suffer for a few days rather than a user hitting such a bug in
production...


> Exhibit 2. AppVeyor is badly broken.
>
> This morning Cheryl Sabella submitted a nice patch fixing an annoying
> behavior of IDLE's editor/shell/output windows.  The CI tests passed, the
> patch worked great, it only needed expansion of the placeholder blurb.  I
> was really excited.
>
> With some trepidation, I made the edit.  Unfortunately, both CI bots rerun
> the code tests even when the code is unchanged.  Blurb edits should be
> treated as doc-only changes, with only the blurb rechecked.
>
> My trepidation turned out to be well-founded.  My excitement is gone. After
> an error, AppVeyor just quit without reporting any failure cause.
> https://ci.appveyor.com/project/python/cpython/build/3.8build16869

This is the first time I have seen such weird behaviour on AppVeyor.

Would you mind opening a new issue to track it, please?


> Guido once asked what is off-putting about being a core developer.  This is one thing.

There are different options:

* Make AppVeyor faster: can we pay to get parallel builds? can we just
ask to get more parallel builds? run fewer tests? (e.g., as Zach wrote,
disable the "largefile" resource)
* Make AppVeyor non-blocking on pull requests
* Remove AppVeyor

If possible, I would prefer to keep a blocking CI on pull requests.
Each time we disabled AppVeyor, Windows quickly became broken.

VSTS might be a solution here, but I simply ignored this CI, since I
was busy enough fixing bugs on the other CIs.

Fixing CI failures is not a fun job and it's not rewarding. But it's
the price to pay to get excellent quality.

If you ask my opinion, I would prefer that everybody stop working
until the CI is repaired. Skipped tests only slowly reduce quality.

Victor

