Data-driven testing

Terry Reedy tjreedy at
Fri Apr 25 17:46:34 CEST 2003

"Max M" <maxm at> wrote in message
news:058qa.30797$y3.2515229 at
> Actually I write a LOT of "throwaway" scripts, that I know I will
> need again, or that if I will use them again, I will just rewrite
> It is far more important for me that these scripts are up and
> fast than that they are correct.
> I use them for massaging large amount of text. If ie. I need to make
> website for a customer, and I have a large amount of similar texts.
> could be something like a list of employees that is keept in a flat
> file, text document, excell sheet or similar.

This sort of thing is what I first learned Python for also.
In my case, it has been 'flat', rectangular data table files.

> Usually these files are pretty similar, but I know they are full of
> errors, misspellings etc. So I have to edit them manually anyway.

Example throwaway scripts for error discovery:

for line in file('path.ext', 'r').readlines():
    if len(line) != N: print line # N = expected line length
    if line[M] not in ' mf': print line # M = 'sex' data column

This is followed by manual editing to try to fix offending lines.
Such a script *is* a test script.  Should I write a test for the test?*

(Having said that, if I were to pull such snippets together to write a
module, having checked that none exists, I should and probably now
would write test data with errors and corresponding unit tests.)
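
A minimal sketch of what such a module and its tests might look like,
written in present-day Python 3 rather than the 2.2 of this thread.
The record layout (10 characters wide, 'sex' code in column 8), the
names find_bad_lines and record, and the sample data are all
hypothetical, chosen only for illustration. Unlike the snippet above,
this version strips the newline before measuring the length.

```python
import unittest

# Hypothetical layout, for illustration only.
EXPECTED_LEN = 10   # plays the role of N (newline not counted here)
SEX_COL = 8         # plays the role of M


def find_bad_lines(lines, expected_len=EXPECTED_LEN, sex_col=SEX_COL):
    """Return (line_number, reason) pairs for records that fail a check."""
    problems = []
    for number, line in enumerate(lines, 1):
        record_text = line.rstrip('\n')
        if len(record_text) != expected_len:
            problems.append((number, 'length'))
        elif record_text[sex_col] not in ' mf':
            # Only reached when the length is right, so the index is safe.
            problems.append((number, 'sex'))
    return problems


def record(name, sex, code, width=8):
    """Build one fixed-width test record; a wrong width plants a length error."""
    return name.ljust(width) + sex + code + '\n'


class FindBadLinesTest(unittest.TestCase):
    def test_clean_data_reports_nothing(self):
        data = [record('Smith', 'm', '1'), record('Doe', ' ', '5')]
        self.assertEqual(find_bad_lines(data), [])

    def test_planted_errors_are_reported(self):
        data = [
            record('Smith', 'm', '1'),         # good
            record('Brown', 'x', '3'),         # bad sex code
            record('Lee', 'm', '4', width=7),  # one character short
        ]
        self.assertEqual(find_bad_lines(data), [(2, 'sex'), (3, 'length')])
```

Saved as a module, the tests can be run with "python -m unittest".
The test data is the part that carries the known errors; the tests
only assert that each planted error is reported on the right line.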

> I write a small script that can save me a lot of copy-pasting,
> files etc. by automating most of the process. These scripts only
> have to
> be "good enough" as it is faster to manually edit minor errors in
> resulting output than to rewrite the script.

Or the fix may require human judgement that is hard to program, like
finding the extra/missing space to delete/add or it may require
physical action like referring back to original data sheet or even
making a phone call to correct a mis-entered value.

*To paraphrase Socrates/Plato: "Who tests the testers?"

My Python2.2/Lib/test directory seems *not* to have a
'' file!  -- though perhaps it should -- and could (see
Kent Beck's book).  In any case, there are no automated tests of all
the other test files.  Test bugs and omissions are
found, when they are, the old-fashioned way -- re-inspection, perhaps
by another pair of eyes,  or investigation of anomalous experience.

My point: test code is useful when it is easier to verify than the
payoff code it is verifying.  It is less useful or even a nuisance
when it is not.  Expected revision frequency of payoff code versus
test code factors into 'easier'.

Terry J. Reedy
