[Tutor] Test Drive Development, DocTest, UnitTest

Steven D'Aprano steve at pearwood.info
Thu Sep 23 00:48:13 CEST 2010


On Wed, 22 Sep 2010 09:44:17 pm Tino Dai wrote:
> > The *primary* purpose of doctests is to be executable examples.
> > When you write documentation, including example code is the most
> > natural thing in the world. doctest lets you execute those
> > examples, to ensure that they work. They're certainly not meant
> > as an exhaustive test of every single feature in the program, but
> > as *documentation* that happens to also work as tests.
> >
> > Unit tests can be a little heavyweight, but they're designed for
> > exhaustive tests of the *entire* program, not just the parts with
> > user-documentation. You should write whitebox tests, not just
> > blackbox tests. That means don't just write tests for the
> > published interface, but write tests for the unpublished internal
> > details as well.
>
> So, the gist is write tests for everything and the "external
> testing" should be handled by unit tests and the "internal testing"
> by doctests. Is that correct?

I'm not sure I understand what you mean by "external" and "internal", 
but if I do understand, you should swap them around.

What I mean is this:

Functions and classes have (or at least should have) a published 
interface, the API, which tells the user what they can expect the 
function to do and what arguments they need to provide. This is the 
external interface. It could be as simple as:

    parrot.colour
    Read-only attribute giving the colour of the parrot. The 
    default colour is "blue".

or it could be thirty pages of tightly-written documentation. Either 
way, a small number of examples are useful, and having those examples 
be executable is even better:

    >>> p = Parrot()
    >>> p.colour
    'blue'

This should go in the function's __doc__ string, where it is easily 
discoverable by users using the interactive help() function, as well 
as documentation discovery tools.
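
To make that concrete, here is a minimal sketch of a docstring carrying 
the example above. The body of the Parrot class is invented purely for 
illustration (the constructor argument and the way the colour is 
normalised are assumptions, not a real library). In a real module you 
would normally just call doctest.testmod(), or run "python -m doctest" 
on the file; the sketch drives the doctest parser directly only so that 
it runs the same way wherever you paste it.

```python
import doctest


class Parrot:
    """A parrot (hypothetical example class).

    >>> p = Parrot()
    >>> p.colour
    'blue'
    """

    def __init__(self, colour=None):
        # Assumed behaviour: trim whitespace, lowercase, fall back to blue.
        self._colour = (colour or "").strip().lower() or "blue"

    @property
    def colour(self):
        """Read-only attribute giving the colour of the parrot."""
        return self._colour


# Pull the examples out of the class docstring and execute them.
parser = doctest.DocTestParser()
test = parser.get_doctest(Parrot.__doc__, {"Parrot": Parrot}, "Parrot", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)

# results is a (failed, attempted) pair.
print(results)
```

The same docstring is what help(Parrot) displays, so the example the 
user reads is exactly the example that gets tested.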

But there's no need to give an example of *every* tiny facet of 
behaviour, even if your documentation specifically mentions it. 
There's very little gained from having *both* these additional tests, 
as far as documentation is concerned:

    >>> Parrot('light green').colour
    'light green'
    >>> Parrot('   LIGHT GREEN   ').colour
    'light green'

If you find yourself saying "Yes, yes, I get the point!" while reading 
documentation, then you've got too many examples and some tests need 
to be removed.

But you do need to have those tests *somewhere*, otherwise, how do you 
know they work as expected? You need to verify the promises you have 
made, that is, test the external interface. This is "blackbox 
testing": you can't see inside the function, only the outside 
behaviour:

    >>> Parrot().colour
    'blue'
    >>> Parrot(None).colour
    'blue'
    >>> Parrot('').colour
    'blue'
    >>> Parrot('BLUE').colour
    'blue'
    >>> Parrot(' \v\r  red   \n\t   ').colour
    'red'
    >>> Parrot('rEd').colour
    'red'
    >>> Parrot('yellow').colour
    'yellow'
    >>> p = Parrot()
    >>> p.resting = False
    >>> p.colour
    'blue'
    >>> p.resting = True
    >>> p.colour
    'blue'
    >>> Parrot(state='transparent').colour
    Traceback (most recent call last):
      ...
    ParrotError: invisible parrots don't have colour
    >>> Parrot(species='Norwegian Blue').colour  # are we bored yet?
    'blue'
    >>> Parrot(species='Giant Man-eating Kakapo').colour
    'green'


Nobody needs to see *all* of that in a docstring, but it does need to 
be tested somewhere. You can use doctest for that, in an external 
file rather than the __doc__ string, but often unit tests give you 
more power and flexibility.
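
As a rough sketch of what the unit test version might look like, here 
are some of the blackbox tests above rewritten with unittest. The 
Parrot and ParrotError definitions are hypothetical stand-ins so the 
example runs on its own; only the published behaviour (what .colour 
returns or raises) is being checked.

```python
import unittest


class ParrotError(Exception):
    """Hypothetical error type, standing in for the real one."""


class Parrot:
    # Hypothetical stand-in implementation, just enough to test against.
    def __init__(self, colour=None, state=None):
        self._state = state
        self._colour = (colour or "").strip().lower() or "blue"

    @property
    def colour(self):
        if self._state == "transparent":
            raise ParrotError("invisible parrots don't have colour")
        return self._colour


class ParrotColourTest(unittest.TestCase):
    def test_default_colour(self):
        self.assertEqual(Parrot().colour, "blue")
        for arg in (None, ""):
            with self.subTest(arg=arg):
                self.assertEqual(Parrot(arg).colour, "blue")

    def test_colour_is_normalised(self):
        cases = [
            ("BLUE", "blue"),
            (" \v\r  red   \n\t   ", "red"),
            ("rEd", "red"),
            ("yellow", "yellow"),
        ]
        for given, expected in cases:
            with self.subTest(given=given):
                self.assertEqual(Parrot(given).colour, expected)

    def test_transparent_parrot_has_no_colour(self):
        with self.assertRaises(ParrotError):
            Parrot(state="transparent").colour


# Build and run the suite explicitly, rather than via unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParrotColourTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Looping over a table of cases, with subTest reporting each failure 
separately, is the sort of flexibility that is awkward to get from 
doctest.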

Blackbox testing is good, but it's not complete. You should also use 
whitebox testing, where you are expected to consider all the paths 
the code might take, and ensure that each and every one of those is 
tested. (At least that's the ideal; in practice sometimes you can't 
test *every* corner of the code. But you should try.) This is testing 
the internal implementation, and it certainly doesn't belong in the 
external documentation!

Real world example: Python's list.sort() method and sorted() function 
use a custom-made, high-performance sort algorithm written by Tim 
Peters, but it only kicks in for sufficiently large lists. For small 
lists, they use a simpler sort algorithm. Whitebox testing needs to 
have tests for both cases, but you don't want to make the cutoff 
between the two part of your published external interface:

    >>> # test a small list that uses algorithm 1
    ... sorted([1, 5, 3, 5, 9])
    [1, 3, 5, 5, 9]
    >>> # and test a big list that uses algorithm 2
    ... sorted([1, 5, 3, 5, 9, 8])
    [1, 3, 5, 5, 8, 9]


What if the cutoff changes? You will have to revise your manuals! Why 
do you think the users of sort() need to know exactly where the 
cutoff is? Nobody cares what the exact value is. (It's not 5 items, 
by the way.)
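
One way to exercise both algorithms without writing the cutoff into 
your tests is to sort lists of many different sizes and check 
properties of the result. The sketch below assumes that a couple of 
hundred items is comfortably past the cutoff; the range, the seed and 
the helper name is_sorted are my own choices for illustration.

```python
import random
from collections import Counter


def is_sorted(items):
    """Return True if each item is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))


rng = random.Random(12345)  # fixed seed so any failure is repeatable
checked = 0
for size in range(200):
    data = [rng.randint(-1000, 1000) for _ in range(size)]
    result = sorted(data)
    # The output must be in order...
    assert is_sorted(result)
    # ...and must contain exactly the same items as the input.
    assert Counter(result) == Counter(data)
    checked += 1

print("checked", checked, "list sizes")
```

This belongs in the test suite, not the manual: it covers both sides 
of the cutoff, wherever that happens to be, and says nothing to the 
user about it.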

For whitebox testing, doctests are often too limiting, and they're 
certainly too noisy to include in the documentation, unless a 
specific implementation is part of the published specification. For 
every if...else in your code, you should (in an ideal world) have a 
test which takes the if branch and a second test which takes the else 
branch. This gets huge, quick, but you should at least test the major 
paths in the code, and ensure that all the subroutines are tested,  
special values are tested, etc. Again, unit tests are often better 
suited for this.

Finally, regression tests -- every time you find a bug, you write a 
test that demonstrates that bug. That test will fail. Now fix the 
bug, and the test will pass. If your code ever accidentally 
regresses, allowing the bug to return, your regression tests will 
start failing and you will know it immediately. It goes without 
saying that these shouldn't go in your external documentation!
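
Here is a small sketch of that cycle. The bug is invented for the 
example: a whitespace-only colour slips through as an empty string 
instead of falling back to the default. Both normalise functions and 
the helper names are hypothetical; the same test is run against the 
buggy version (it fails) and the fixed version (it passes).

```python
import io
import unittest


def normalise_buggy(colour=None):
    # Bug: "   " is truthy, so the default is skipped and we return "".
    return (colour or "blue").strip().lower()


def normalise_fixed(colour=None):
    # Fix: apply the default *after* stripping the whitespace.
    return (colour or "").strip().lower() or "blue"


def make_regression_test(func):
    """Build the regression test case for a given implementation."""

    class WhitespaceColourRegression(unittest.TestCase):
        def test_whitespace_only_colour_defaults_to_blue(self):
            self.assertEqual(func("   "), "blue")

    return WhitespaceColourRegression


def run(case):
    """Run one test case quietly and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


before = run(make_regression_test(normalise_buggy))
after = run(make_regression_test(normalise_fixed))
print("before fix passed:", before.wasSuccessful())
print("after fix passed:", after.wasSuccessful())
```

In practice you would keep only the fixed code plus the test; the 
buggy copy is shown here just to demonstrate that the test really does 
catch the bug.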

The lines between doctests, blackbox testing, whitebox testing, and 
regression testing are blurry. People may legitimately disagree on 
whether a specific test is documentation, testing the interface, 
testing the implementation, or all three.



-- 
Steven D'Aprano 
Operations Manager 
Cybersource Pty Ltd, ABN 13 053 904 082 
Level 1, 130-132 Stawell St, Richmond VIC 3121 
Tel: +61 3 9428 6922   Fax: +61 3 9428 6944 
Web: http://www.cybersource.com.au 

