On Jun 30, 2016, at 17:36, Jean-Paul Calderone <exarkun@twistedmatrix.com> wrote:

On Thu, Jun 30, 2016 at 6:25 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:

On Jun 30, 2016, at 04:13, Jean-Paul Calderone <exarkun@twistedmatrix.com> wrote:

On Thu, Jun 30, 2016 at 6:43 AM, Adi Roiban <adi@roiban.ro> wrote:

Recently we have introduced a hard check of 100% coverage for all changes.
This is done via coverage + codecov + github protected branches.

Now, if your patch is not 100% covered github will not let you merge it.

The errback is there to help with test failures ... but the test should never fail, so that errback is never called... and that line is not covered.

It doesn't always make sense to require 100% execution of all test code.  It's not at all uncommon to only have code in a test suite that runs when a test fails.  Historically, Twisted has never had a requirement of 100% execution of test code.  The only test suite coverage requirements that have commonly been requested or enforced is for coverage of implementation code.

I'd suggest removing the coverage enforcement for test suite code.

I am inclined to disagree, albeit mildly.

When one is writing a deliberately un-covered path in test code, presumably, one is writing either a test helper - a mock, fake, or utility for setting up a real implementation - or an assertion method.  Historically, I believe that when we've invested more heavily in making these utilities "real" themselves, and not just throwaway stuff inline in a test method or module, the benefits have far outweighed the costs.  In fact the availability of proto_helpers is one of the selling points of Twisted as opposed to other competing engines.

I mostly agree with this.  However, I was thinking of a slightly different pattern when I wrote my earlier email.  Here's are a couple (fictional) examples of that pattern one might find in unit tests for application code (and there's nothing Twisted-specific here):

if foo:

    self.fail("Foo :(")

Hm.  This pattern is exactly what I was thinking of though - as you point out, these examples did get generalized :-).

In principle, I agree that a one-off example like this does not benefit from extensive refactoring to facilitate general use.  But... in practice, every example of this that I've seen in a long-lived codebase eventually metastasizes into a repeated copy/paste pattern, or a big gross mixin that all tests practically need.  Forcing everyone to deal with the problem sooner rather than later seems to have been a win on the few projects where I've gotten to practice it.

It's not exactly that this can't be code that's executed in a passing run of the test suite.  It's more a question of what the right balance point is.  If someone wants to generalize logic like this (and, fortunately, someone did generalize these particular examples - they're assertFalse and assertRaises, respectively) then that's great and the result is a higher level of confidence resulting from a successful run of the test suite.  I'd suggest that if tests like these exercise all of the implementation code (on a successful run), though, then you've still achieved a pretty high level of test coverage and maybe further efforts are more productively directed elsewhere (increasing the coverage level of other implementation code in Twisted, for example :).

Speaking only from my direct experience here, adding good test helpers is a net reduction in effort very quickly; they pay for themselves on only the third test you write with them :).  So I don't feel like this is much of a burden, but I'd be interested in hearing others' feedback here.

If folks want a higher bar than this, I'm not going to argue (at least not much, at least not now).  The bar hasn't been this high in the past though (and there are many many such cases to be found in Twisted's test suite right now and I don't have the impression this has ever been much of a source of problems).

I wish it were easier to search my own reviews, because I think I've prodded people to factor this kind of stuff out, but I can't find any handy examples.

But, I can see that right after this, I'm going to have to answer Craig's email, so perhaps we'll get to see a specific example :).

Therefore, I think that asking folks to add independent test coverage to verify their fakes and ensure that the failure-reporting of their assertion messages are helpful in the event a test fails is a generally good idea, and we should keep the requirement for 100% coverage on both test and implementation coverage.

However, if there is contention around this, I'd much rather get a ratchet in place for implementation code that's reliable and everyone is happy with, so I'm OK with disabling coverage reporting for our *.test.* packages as a step towards that.

I completely agree that fakes should be verified.  So much so that I'm not even sure I believe in fakes in general anymore.  Instead, you should just have easy to use interfaces and ship inexpensive implementations alongside whatever other implementations you also need.


And all those implementations should have great test coverage.  I also completely agree that when tests fail, they should do so in a meaningful way.  I suspect slightly the implication that automated test coverage for the failure case demonstrates the failure is reported meaningfully, though. :)  I think we're still stuck with relying on humans (authors, reviewers) to verify that property.

Sure, rich and meaningful semantic error reporting is a human's job.  But "doesn't just raise an exception and lose all the information about what happened due to an attribute spelling error on the rare race-y test that fails 1/10000 times" is a good first approximation, and tests are good at that sort of thing ;-).