On Nov 26, 2017, at 4:48 PM, Jean-Paul Calderone <exarkun@twistedmatrix.com> wrote:

On Sun, Nov 26, 2017 at 5:30 PM, Glyph <glyph@twistedmatrix.com> wrote:

Quite often—as I believe is the case for the patch you’re referring to here—patches are adjusting behaviors which would be difficult to cleanly integration-test on the platform in question anyway, and the appropriate thing to do is to make some mock match the observed behavior of the platform in question.

Just because there's enough misunderstanding about how to do this kind of thing and the words that describe it, I wanted to point out that this isn't really a "mock" - and it's certainly not something you would do with a library with "mock" in its name.

Martin Fowler has the authoritative glossary on this - https://www.martinfowler.com/bliki/TestDouble.html. I have my own entry in this genre which echoes and expands upon some of his definitions: https://thoughtstreams.io/glyph/test-patterns/.

However, I eventually acquiesced to the colloquial use of "mock" to mean "test double"; in most contexts where people are talking about "mocks" they're really talking about fakes, and genuine mock-module-style mocks are increasingly understood to be technical debt.

For those who don't know (not Glyph), a mock is an object that somehow records an exact sequence of operations (like "my foo method was called with bar" or "my baz attribute was accessed"). These tend to result in very fragile tests because they operate at such a specific level. Quite often entirely legitimate implementation changes invalidate the expectations of some mock and force corresponding test suite maintenance.
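
To make that fragility concrete, here is a minimal sketch using the standard library's unittest.mock. The names greet_v1, greet_v2, RecordingTransport, bytes_delivered and mock_expectation_holds are hypothetical, invented for this illustration, and are not taken from Twisted.

```python
from unittest import mock


def greet_v1(transport):
    # Original implementation: two separate writes.
    transport.write(b"hello ")
    transport.write(b"world")


def greet_v2(transport):
    # Legitimate refactoring: the same bytes, delivered in one write.
    transport.write(b"hello world")


class RecordingTransport:
    """A trivial hand-written fake (hypothetical) that only accumulates bytes."""

    def __init__(self):
        self.data = b""

    def write(self, data):
        self.data += data


def bytes_delivered(greet):
    # Checks the observable outcome: which bytes ended up in the transport.
    transport = RecordingTransport()
    greet(transport)
    return transport.data


def mock_expectation_holds(greet):
    # Checks the exact sequence of calls, which is what a mock records.
    transport = mock.Mock()
    greet(transport)
    expected = [mock.call(b"hello "), mock.call(b"world")]
    return transport.write.call_args_list == expected
```

Both implementations deliver identical bytes, so a check based on bytes_delivered accepts either one. The call-sequence expectation holds only for greet_v1, so switching to greet_v2 would force a test change even though nothing observable changed.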

(As far as I know, we don't use the 'mock' module anywhere in Twisted, and we shouldn't start :).)

A better approach is to make sure the platform-specific interface being tested has been reduced to its smallest possible scope and then create a verified fake which implements that interface. The verification is performed by a test suite which can run against both implementations (the fake and the real - with the real version only being tested in the necessary environment). All other related tests can use the verified fake instead of the real implementation and will run on all platforms.
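
As a rough sketch of what that can look like, assuming a deliberately tiny and hypothetical "store" interface with just put() and get(): RealStore, FakeStore and StoreTestsMixin below are invented names for illustration, and the filesystem stands in for whatever platform-specific API is really being wrapped.

```python
import os
import tempfile
import unittest


class RealStore:
    """Hypothetical real implementation, backed by files in a directory."""

    def __init__(self, directory):
        self._directory = directory

    def put(self, name, data):
        with open(os.path.join(self._directory, name), "wb") as f:
            f.write(data)

    def get(self, name):
        try:
            with open(os.path.join(self._directory, name), "rb") as f:
                return f.read()
        except FileNotFoundError:
            raise KeyError(name)


class FakeStore:
    """In-memory fake implementing the same two-method interface."""

    def __init__(self):
        self._data = {}

    def put(self, name, data):
        self._data[name] = bytes(data)

    def get(self, name):
        return self._data[name]


class StoreTestsMixin:
    """Verification suite: every implementation must pass these tests."""

    def test_round_trip(self):
        store = self.make_store()
        store.put("a", b"1")
        self.assertEqual(store.get("a"), b"1")

    def test_overwrite(self):
        store = self.make_store()
        store.put("a", b"1")
        store.put("a", b"2")
        self.assertEqual(store.get("a"), b"2")

    def test_missing_name(self):
        with self.assertRaises(KeyError):
            self.make_store().get("missing")


class FakeStoreTests(StoreTestsMixin, unittest.TestCase):
    def make_store(self):
        return FakeStore()


class RealStoreTests(StoreTestsMixin, unittest.TestCase):
    # In a real project this class would be skipped (for example with
    # unittest.skipUnless) on platforms lacking the real implementation.
    def make_store(self):
        directory = tempfile.TemporaryDirectory()
        self.addCleanup(directory.cleanup)
        return RealStore(directory.name)
```

The mixin is the verification: any behavior it asserts is known to hold for both implementations, so tests elsewhere can rely on FakeStore for exactly those behaviors and no others.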

Completely verified fakes are unfortunately almost impossible to build. If the faked API has good enough testing support that you can programmatically provoke all of its failure modes, then the only reason to build a fake is performance, which is rarely our most pressing problem :-). So quite often even the best verified fakes will only be partially verified. Even partial verification of things like kernel APIs can be a major challenge. So it's often necessary to accept a fake which is not directly verified against the real API.

However, ad-hoc one-off fakes are at extremely high risk of drift. If you are trying to verify the system under test's response to behavior X in a test, and you build a fake that implements X, Y, and Z for the sake of the test's set-up, naturally that new fake's Y and Z will be flaky and half-implemented at best (and its X may be quite half-baked as well). Moreover, if the real X, Y or Z changes in the future, it's highly unlikely anyone will come along and update the fake for just that one test.

I believe the most important principle here is that test code is just code like any other code, and following generally good code-organization principles like well-defined interfaces, planning for maintenance over time, "don't repeat yourself", clear locus of responsibility, and so on will prevent test suites and their attendant doubles from decaying into a sprawling and unmaintainable mess. We have generally been making good progress on this over time, but it's important for everybody involved with Twisted to continue trying to reduce duplication in fakes, and improve the verification of the fakes we have, as we maintain our test suite.