[SciPy-Dev] How to get "correct" values for unit tests

Wed May 30 13:54:56 EDT 2012

30.05.2012 18:46, Andreas Hilboll wrote:
> while working on the tests for the new
> scipy.interpolate.{Smooth,LSQ}SphereBivariateSpline classes, I'm
> wondering how to come up with sensible TRUE example values to test against.
> 
> In the case mentioned (see https://github.com/scipy/scipy/pull/192), I
> simply wrapped a routine (sphere.f) from FITPACK. So I could write a
> direct FORTRAN program using sphere.f to calculate some "TRUE" values.
> However, that would just check that the wrapping actually works.
>
> Is this considered enough? Ultimately, I would like a test to assure
> that the results are correct. But for that, wouldn't it be "better"
> (whatever that means) to use a different library to calculate the TRUE
> results?

I think a useful philosophy for the tests is to ensure that the code,
as a whole, does what is promised. (Not all decades-old Fortran code is
reliable...) So, more in the direction of functional tests than unit
tests, which also works as a QA step...

Testing interpolation is a bit more difficult than testing other types
of code, since what counts as a "good" result is fuzzier there, and the
"correct" results are not fully well-defined.

If I had to manually verify that the interpolation on a sphere works,
what I'd try at first would be: generate a random dataset (with fixed
random seed) and check (plot) that the result looks reasonable:

- interpolant at the data points reproduces the original data values

- continuity across "edges" of the sphere

- checks for the flat derivative options

- that the interpolant is "nice" in some sense

The first three can be converted to assert_allclose style tests with
some amount of work. The last relies on the eyeball-norm, but I could
just pick a few data points out from a plot I think is reasonable, and
write a small test that checks against those (as a statement that
someone actually looked at the output).
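
As a rough sketch, the first two checks (reproducing the data, and
continuity across the "edges") might look like the following. Everything
named here is an assumption made for illustration: `to_xyz`,
`make_interpolant` and the `check_*` functions are hypothetical, and a
simple inverse-distance-weighting interpolant stands in for the new
spline classes so that the sketch runs without them. Plain
`math.isclose` is used in place of `assert_allclose` to avoid any
dependencies. The flat derivative options are not covered.

```python
import math
import random


def to_xyz(theta, phi):
    """Colatitude theta in [0, pi], longitude phi in [0, 2*pi] -> unit vector."""
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))


def make_interpolant(thetas, phis, values, power=2.0):
    """Hypothetical stand-in for the object under test.

    Inverse-distance weighting on the 3-D chord distance; NOT the
    FITPACK algorithm, just something with the same calling shape.
    """
    nodes = [to_xyz(t, p) for t, p in zip(thetas, phis)]

    def interp(theta, phi):
        x = to_xyz(theta, phi)
        num = den = 0.0
        for node, v in zip(nodes, values):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, node)))
            if d < 1e-14:  # evaluation point coincides with a data point
                return v
            w = d ** -power
            num += w * v
            den += w
        return num / den

    return interp


# Random dataset with a fixed seed, so failures are reproducible.
rng = random.Random(1234)
n = 50
thetas = [math.acos(rng.uniform(-1.0, 1.0)) for _ in range(n)]
phis = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
values = [rng.uniform(-1.0, 1.0) for _ in range(n)]
interp = make_interpolant(thetas, phis, values)


def check_reproduces_data():
    # The interpolant evaluated at the data points returns the data values.
    for t, p, v in zip(thetas, phis, values):
        assert math.isclose(interp(t, p), v, rel_tol=1e-12, abs_tol=1e-12)


def check_seam_continuity():
    # phi = 0 and phi = 2*pi describe the same points on the sphere.
    for k in range(1, 10):
        t = k * math.pi / 10
        a, b = interp(t, 0.0), interp(t, 2.0 * math.pi)
        assert math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)


def check_poles_single_valued():
    # At a pole the value must not depend on phi.
    for pole in (0.0, math.pi):
        ref = interp(pole, 0.0)
        for k in range(8):
            val = interp(pole, k * math.pi / 4)
            assert math.isclose(val, ref, rel_tol=1e-9, abs_tol=1e-9)


check_reproduces_data()
check_seam_continuity()
check_poles_single_valued()
```

To test the real classes, `make_interpolant` would be replaced by
construction of the spline object; the checks themselves only rely on
being able to evaluate the result at (theta, phi).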

I'd guess the above would also catch essentially all possible problems
in the wrapping.

IMO testing just the wrapper is not very useful --- the above sort of
tests are not much more difficult to write, and should catch a wider
range of problems.

	Pauli