[Tutor] OT: "Your tests are only as good as your mocks." Comments?

Sun Jul 25 23:14:09 EDT 2021

Fantastic write up on the philosophy of work where plan C remains
unplanned. 👏

On Mon, Jul 26, 2021, 8:33 AM dn via Tutor <tutor at python.org> wrote:

> On 26/07/2021 05.14, boB Stepp wrote:
> > From
> > https://swizec.com/blog/what-i-learned-from-software-engineering-at-google/#stubs-and-mocks-make-bad-tests
> >
> > The author of this article notes an example from his practice where his
> > mock database that he used in his tests passed his tests when the actual
> > code in production no longer had a database column that was in his mock
> > database.  As I have begun to play around with databases recently and how
> > to test code relying on them, this really caught my attention.
> >
> > The overall article itself is a recap of what he read in a book about how
> > Google does things ("Software Engineering at Google").  In this situation
> > Google advocates for using "fakes" in place of mocks, where these fakes
> > are simplified implementations of the real thing maintained by the same
> > team to ensure API parity.  How would the development and maintaining of
> > these fakes be done so that the fakes don't drift from coding reality like
> > the mocks might?  It is not clear to me exactly what is going on here.
> > And a more Python-specific question:  Does the Python ecosystem provide
> > tools for creating and managing fakes?
>
>
> It's an amusing story, and one which the author identified as
> particularly relevant in larger organisations - but (almost) irrelevant
> in a one-man band.
>
> Thus, there seem to be two components to the question(s) and the
> thinking behind them. Firstly, the way teams and corporations operate,
> and secondly Python tools which support something recommended by an
> organisation (which may or may not be used by their Python teams). Faking is
> newer than the more established techniques of stubs and mocks.
> Accordingly, it is generating a lot of light, but we have yet to see if
> there will be much heat! We'll get to that, but start with your interest
> in moving beyond the sole-coder into a professional dev.team environment:-
>
>
> Issue 1: Pride (as in, "...goeth before...")
> There is a mystique to working for a "FAANG company" (per article), that
> somehow translates into a Queen sound-track ("We are the champions"). In
> the ?good, old, days we referred to "Blue Chip companies" and thought
> them good, eye-catching content for one's resume (been there, done that,
> t-shirt too ragged to wear). However, the reality is, they (and their
> work-methods) are indeed unlike most others'. Whether they are better,
> or not, is up for debate... (further assumption: that all components of
> 'the organisation' are equal - and magnificent. The reality is that
> departments/projects differ widely from each-other - ranging from those
> which do shine-brightly, to those which challenge the proverbial pig-sty
> for churned-up mud and olfactory discomfort) Just because their
> employees think they're 'great' doesn't mean that their approach will
> suit any/all of the rest of us.
>
> Issue 2: Arrogance
> Within an organisation certain team-leaders attempt to build 'unity'
> through a them-and-us strategy. Which like the above, tends to engender
> a 'we are better than them' attitude. This in-turn amplifies any point
> of difference, often to the point of interfering with or preventing
> inter-communication. These days I'd probably be pilloried (are HR
> allowed to use 'cruel and unusual punishment'?) for it, but (with my
> Project Rescue PM-hat on (Project Manager)) I have walked into situations
> like this; and, all other efforts failing, mandated that teams get
> themselves into the same room and 'hash things out' (professionally), or
> ... I would start "banging heads together" (unprofessionally) - or worse...
> - and yes, I've suffered through scenarios involving the DB-team not
> speaking with the dev.teams attempting to 'read' or 'write'. Sigh! (in
> fact thinking of the wasted time/money: BIG SIGH!) See also the author's
> comment about "Hyrum's Law" - which can only be said to be magnified
> when team inter-communication dwindles.
>
> Issue 3: Metrics
> There is a rule of human nature, that if some measurement is being used,
> work-practice will adapt to maximise on that point. Some of us will
> remember the idea that 'good programmers' wrote more LoC ("Lines of
> Code") per day, than others. If you were being measured on that, would
> it be better to write a three~five-line for-loop block or a single-line
> list-comprehension? How many of us really think that shipping our
> work "minutely" (per the article) is even remotely a good-idea?
> Perhaps we value our reputations? Tell me again: who claims to "do no
> harm"? Is there an emphasis on ensuring and assuring tests (at all
> levels) if there is a rush to 'production'? (this attitude to testing
> has been a problem, in many and varied forms, for as long as there has
> been programming)
>
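To make the LoC point concrete, here is a minimal sketch of the same job written both ways. The function names are made up for illustration; the two versions return identical results, yet the loop version "scores" several times higher on a lines-per-day metric.

```python
def squares_loop(values):
    """Four-line loop version: more LoC, same behaviour."""
    result = []
    for value in values:
        result.append(value * value)
    return result


def squares_comprehension(values):
    """Single-line list-comprehension version."""
    return [value * value for value in values]
```

Both give the same list for the same input, so the metric rewards verbosity rather than correctness.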
> Issue 4: the Future is a rush!
> Whilst it is undeniably exciting, the problem with racing-forwards is
> that it will be difficult to anticipate where (future) problems lie. It
> is fair to say: no-one can predict the future (least-wise not with
> '20/20 vision'). However, Santayana's aphorism also applies: "Those who
> cannot remember the past are condemned to repeat it". In this case, "The
> Mythical Man-Month" (Brooks). That lesson recounts how adding more
> personnel to a 'late' project actually had the opposite effect to that
> intended. Which applies to the author's descriptions of adding too many
> new staff (to anything) in an uncontrolled, indeed uncontrollable,
> fashion. People don't know each other, responsibilities keep shifting,
> communication fractures and, becoming difficult/impossible, dries up! Is
> this a technical problem or a management failing?
>
> Issue 5: Solving social problems with 'technical solutions'
> Which neatly ties-together much of the above: yes, we've probably all
> experienced the 'please upgrade' issue, and the laggards' 'can I upgrade
> from so-many-versions-ago to the-new-version all at-once?' plea.
> However, just as the author comments (earlier in the article) about
> 'engineers losing control', so too will users! People's feelings
> represent a huge proportion of their willingness/decision to install,
> use, and continue to use, an application. There are ways to talk to
> people - nay, to make the point: "ways to talk WITH people"!
> Accordingly, 'here' at Python, we have a current version of 3.9... yet
> cheerfully engage with folk using earlier releases - even (albeit with
> some alarm) those who are somehow compelled to stay with Python 2!
>
>
> With such critique in-mind, let's look at practicalities:-
>
> You (and the author) are quite right, such faults will not be discovered
> in what is often 'the normal course' of a team's/an individual's
> work-flow! However, remember that one should not (unit) test to
> stubs/mocks/interfaces; but test to values! If your code is to divide
> two numbers, it had better test for 'that zero problem'; but who-cares
> from where the data has been drawn?
>
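As a rough illustration of "test to values", here is a minimal sketch. The names divide() and check_divide() are hypothetical, and raising ValueError on a zero denominator is just one possible design choice. Note that the checks care only about the values passed in, not where they came from.

```python
def divide(numerator, denominator):
    """Return numerator / denominator, rejecting a zero denominator explicitly."""
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return numerator / denominator


def check_divide():
    """Exercise an ordinary value and 'that zero problem'."""
    # Ordinary case: the origin of the numbers (DB, file, user) is irrelevant.
    assert divide(6, 3) == 2
    # Edge case: zero must be tested deliberately, not left to chance.
    try:
        divide(1, 0)
    except ValueError:
        return True
    return False
```

In a real project these checks would more likely live in a unittest or pytest test function; the plain-function form is only to keep the sketch self-contained.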
> If I happen to be doing the DB work, using 'my' (project's) Git repo;
> and you are writing app.code within 'your' Git repo, we can both unit
> test "until the cows come home" - and never, ever find such an 'error'
> as the author described. Unit tests must, by definition, come up short.
> Such investigation is (a part of) the province of "Integration Testing".
> Who manages that part of the author's CI/CD process? Answer: not the
> DB-team, not the app.devs, ... Who then? Oops!
>
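A small sketch of that gap, under stated assumptions: the users table, its email column, and fetch_email() are invented for illustration, and SQLite's in-memory database (from the standard library) stands in for whatever the production DB really is. The mock happily answers a query for a column that the "current" schema no longer has; only the check that touches a real schema notices.

```python
import sqlite3
from unittest import mock


def fetch_email(conn, user_id):
    """Application code under test: reads the (hypothetical) users.email column."""
    row = conn.execute(
        "SELECT email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def passes_with_mock():
    """Unit-style check: the mock accepts any SQL, so it cannot see schema drift."""
    conn = mock.Mock()
    conn.execute.return_value.fetchone.return_value = ("bob@example.com",)
    return fetch_email(conn, 1) == "bob@example.com"


def fails_against_current_schema():
    """Integration-style check: the schema below no longer has an 'email' column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'bob')")
    try:
        fetch_email(conn, 1)
    except sqlite3.OperationalError:
        # SQLite reports the missing column at query time.
        return True
    finally:
        conn.close()
    return False
```

Both functions return True: the first because the mock agrees with whatever it is asked, the second because the real engine refuses the query. Neither team's unit tests alone would have shown the second result.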
>
> FYI At one time I considered a technical answer to this issue and
> thought that MySQL's in-memory DB-engine might serve, ie by putting
> tables/table-stubs into memory, by which the speed-increase might
> counter the disadvantage of using a 'real' DB. It wasn't suitable,
> largely because of the limitations on which data-types can be
> handled - and thus it failed to enable a realistic replacement of the
> 'real data'.
> (https://dev.mysql.com/doc/refman/8.0/en/memory-storage-engine.html)
>
>
> There is always going to be a problem with modelling - you'd think that
> as we do this all-day, every-day, we-computer-people would consider
> this. Do we? Adequately?
>
> A mock is a mimic - similar but not the same, and quite possibly even
> over-emphasising certain aspects (even at the risk of minimising others).
>
> A stub is an abbreviated form - 'stuff' has, by definition, been left-out.
>
> These are 'judgement calls'. Do we sometimes get these 'wrong'?
>
> Remember that should you stub your toe there may be others prepared to
> mock your clumsiness. (Yuk!)
>
> Any tool can be assumed to be offering us 'more' than it really is - it
> is easy to assume that we have 'everything covered' when we don't -
> after all, isn't that our fate: that no matter how much testing we
> perform, there will always be one user who can find some combination of
> circumstances we did not foresee...
>
> Finally, in this cynical observation of 'real life', there was talk of
> "fakes are simplified implementations of the real thing maintained by
> the same team to ensure API parity". Which is "the same team"? The
> DB-guys who are only interested in their work, and who work to no metric
> which involves helping 'you'? Your team - the ones who have no idea that
> the DB-team have 'shifted the goal-posts'? Whither "API parity"? It may
> be that instead of discovering that the information used to build the
> mock is now out-of-date, all that happens is that you (belatedly)
> discover that the "fake" is too-fake...  The deck-chairs have been
> rearranged and given new labels ("mock", "fake"), but they are still on
> SS Titanic!
>
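One commonly described way to keep a fake honest is a shared "contract" check that is run against both the real implementation and the fake, so a change to one that is not mirrored in the other fails loudly. The sketch below is an assumption about how that could look, not a description of what Google or the book actually does. The class and method names are hypothetical, and SQLite in memory stands in for the real database.

```python
import sqlite3


class SqliteUserStore:
    """The 'real' implementation (SQLite standing in for a production DB)."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, user_id, name):
        self._conn.execute(
            "INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name)
        )

    def get_name(self, user_id):
        row = self._conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


class FakeUserStore:
    """The fake: a dict-backed simplification exposing the same public methods."""

    def __init__(self):
        self._users = {}

    def add(self, user_id, name):
        self._users[user_id] = name

    def get_name(self, user_id):
        return self._users.get(user_id)


def check_contract(store):
    """Behaviour that every implementation, real or fake, must satisfy."""
    store.add(1, "bob")
    assert store.get_name(1) == "bob"
    assert store.get_name(99) is None
    return True


def run_contract_everywhere():
    """Run the identical contract against the real store and the fake."""
    return [check_contract(cls()) for cls in (SqliteUserStore, FakeUserStore)]
```

This only helps if whoever changes the real store also owns, and runs, the contract check, which is exactly the "which team?" question raised above.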
>
> So, then, what is the answer? (I'm not sure about the "the"!)
>
> Once again, back in the ?good, old, days (oh, no, here he goes again...)
> - which included the "waterfall approach" to systems development, we
> called it "Change Control". No-one was allowed to make 'changes' to a
> system without appropriate documentation to provide (all-concerned) notice!
>
> Today, imagine a stand-up SCRUM/meeting, and I (ever so grandly)
> announce that today I shall be coding a change to the database, adding a
> new field, and it will all be so-wonderful. You prick-up your ears and
> ask for more info - will it affect your application-code? We agree to
> 'meet afterwards', and the change is reviewed, and either reversed or
> accommodated.
>
> No nasty surprise AFTER we both thought 'job done'! How well we work
> together!
>
>
> Some people think of (unit- and integration-) testing as 'extra work'.
> After all, it is the application-code which 'gets the job done'! One of
> the contributions of TDD is that testing is integral to development, to
> proving, and to maintenance/refactoring. Accordingly, as much care
> should be invested in the testing routines as in the application's!
>
> Whether we use mocks, stubs, fakes, or you-name-it, there are no
> guarantees. Each must be used with care. Nothing can be taken for-granted!
>
> A tool being used at one 'layer' of the testing process cannot really be
> expected to cross-layers - even if some 'higher' layer of testing uses
> the same tool. The objectives of testing one layer are quite different from
> those of testing a higher/lower layer!
>
> NB I don't see one of these as 'the tool to rule them all'. Each has its
> place - use 'the best tool for the job'.
>
>
> A number of references/allusions have been included here, because I know
> the OP likes to 'read around' and consider issues wider than mere syntax.
>
> Having survived this far, if you enjoy our favorite language's
> occasional allusion to the Monty Python series, you will likely also
> enjoy Terry Pratchett's (fictional) writings. Herewith a couple of
> pertinent quotes, the first about 'planning' (planning testing?), and
> the second referencing our industry's habit of making great promises
> (and assumptions) but being a little short on delivery against others'
> (users') expectations (as per the article?):-
>
> "Plan A hadn't worked. Plan B had failed.  Everything depended on Plan
> C, and there was one drawback to this: he had only ever planned as far
> as B."
>
> "Crowley had been extremely impressed with the warranties offered by the
> computer industry, and had in fact sent a bundle Below to the department
> that drew up the Immortal Soul agreements, with a yellow memo form
> attached just saying: 'Learn, guys.'"
> (both from "Good Omens")
> --
> Regards,
> =dn