[SciPy-User] peer review of scientific software

Matt Newville newville at cars.uchicago.edu
Tue May 28 16:58:44 EDT 2013


Hi,

As others have said, I find the low average programming skill level
among scientists frustrating,  but I also found this article quite
frustrating.

From my perspective, the authors' main complaint seems to be that there
is not enough independent checking of specialized scientific software
written by scientists.  They seem particularly unhappy about the
tendency to use existing packages written by other scientists based on
"trust", "reputation", "previous citations" and without independent
checking.  They also say:

      A "well-respected" end-user developer will almost certainly have
      earned that respect through scientific breakthroughs, perhaps not
      for their software engineering skills (although agreement on what
      constitutes "appropriate" scientific software engineering
      standards is still under debate).

On this point in particular, and indeed in this whole line of
argument, I think the authors are misguided, perhaps even to the point
of fatally damaging their whole argument.   I believe the much more
common case is for the "well-respected" end-user developer to be known
for the programs written and supported, and less so for the scientific
breakthroughs (unless you count new programs as new instrumentation,
and so, well, breakthroughs, but it's pretty clear that the authors
are making a distinction).    It's too often the case that spending
any significant time on such programs is career suicide, as it takes
time and attention away from such breakthroughs.   It's perfectly
believable that the programming skills of such a scientific developer
may be incomplete, but I think it's fair to say that most supported
and well-used programs are likely the effort of people with
above-average programming skills and the interest and intent to
support such programs.   Indeed, I would argue that instead of being
unhappy about the reliance on trusted programs and developers, the
authors would better serve the scientific community by arguing that
the authors of such programs should be better supported, and given
access to tools and resources (i.e., fund them) to improve their work
rather than being treated as untrustworthy programmers.

I should admit to being one such author of a "well-respected" and
"trusted" package for a very small scientific discipline, and with the
proverbial "many citations etc" because of this.  So I would admit to
being just the sort of person the authors are unhappy about.  I
suspect many people on this mailing list are in the same category.   I
would like to think the trust and respect for certain packages have
been earned, and that people use such packages because they are "known
to work", both in the sense of actually having been tested on
idealized cases, and in producing verifiable results in real cases
(where "testing" would not always be possible).   Indeed, the small,
decentralized group of scientific programmers that I work with (mostly
trained as physicists, who learned to program in Fortran -- some of
us still use mostly Fortran, in fact) do test and verify such codes,
precisely because we know other people use them.   Of course errors
occur, and of course testing is important.   Modern techniques like
distributed version control and unit testing are very good tools to
use.   I agree they should be used more thoroughly, and that one
should always be willing to question the results of a computer
program.

Then again, when was the last time I tested the correctness of results
from my handheld HP calculator?    Hmm, a very, very long time ago.
That's software.  I tend to believe the messages I read in my inbox
are actually the messages sent, and hardly ever do a checksum on them.
But that's software.  Indeed, all science is a social enterprise and
so "trust", "reputation", and reliance on the literature (aka "past
experience") are not merely unfortunate outcomes of laziness, but an
important part of the process.

I certainly am happy to support the notion that "more scientists
should be able to program better", so I am not going to say the
entire article is wrong, and I don't disagree with their main
conclusions.  But I think they have a fatal flaw in their assumptions
and arguments.

--Matt Newville


