[SciPy-User] peer review of scientific software

Bill Carithers wccarithers at lbl.gov
Tue May 28 18:44:05 EDT 2013


As a scientist who spends the majority of my time writing code to analyze
data, I've found this discussion fascinating. Early in my career, I actually
coded in assembly language (remember index registers?), then Fortran for a
couple of decades, before biting the bullet and moving to object-oriented
languages (Java, Python, Objective C). Now I use mostly Python. I hope the
following comments from this perspective will be useful.

1. There is no "one size fits all". Sometimes I use Python as a BASIC-style
calculator, sometimes as a procedural language like Fortran, and most of the
time as a fully OO language. The level of documentation, testing, and version
control needs to be tailored to the problem.

2. In terms of getting scientists into the "modern world" of writing
maintainable, reusable code, I think the most useful tool is a really good
IDE. Then much of the documentation, version control, and debugging tooling
is seamlessly there. I don't think I could have written acceptable Java
without Eclipse, and I'm absolutely positive that I couldn't write
Objective C without Xcode. I use IDLE for Python, but it is nowhere near the
level of these others.

Hope these help and keep up the good work,
Bill


On 5/28/13 3:05 PM, "Matthew Brett" <matthew.brett at gmail.com> wrote:

> Hi,
> 
> On Tue, May 28, 2013 at 2:52 PM, John Hassler <hasslerjc at comcast.net> wrote:
>> 
>> On 5/28/2013 4:58 PM, Matt Newville wrote:
> 
> <snip>
> 
>>> I certainly am happy to support the notion that "more scientists
>>> should be able to program better", so  I am not going to say the
>>> entire article is wrong, and I don't disagree with their main
>>> conclusions.  But I think they have a fatal flaw in their assumptions
>>> and arguments.
>>> 
>>> --Matt Newville
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-use
>> 
>> Exactly!   There is actually a question here that hasn't been made
>> explicit.  For whom is this advice intended?  There are all levels of
>> programming/programmers in STEM.  Some of my colleagues use Excel for
>> everything.  (As in, EVERYTHING.)  Some fewer use Matlab.  Still fewer
>> use C/Fortran/Java/C#/whatever.  So far as I know, I'm the one lone
>> Pythonista.  Each group uses programming differently.
>> 
>> I've been programming for more than 50 years.  I've taught programming
>> to engineers in several contexts over the years.  For a time, I really
>> wanted to 'do it right.'  (I even taught 'structured programming' and
>> 'Warnier-Orr' at one point, but realized that it was worse than useless
>> for the particular audience.)  I've come to realize that most engineers
>> just want an answer.  They are not interested in how gracefully the
>> answer was arrived at.  MOST programs written by MOST engineers are
>> small, short, simple, and intended to solve one problem one time.  (The
>> deficiency I've most often seen is the lack of error checking for the
>> answer, and better programming techniques would not generally help much.)
>> 
>> The problem is that nobody sets out to write a "well respected"
>> program.  Someone sets out to scratch a particular itch ('one problem
>> one time').  It expands.  Others find it useful.  It becomes widely
>> used.  The original author, however, was solving his/her own particular
>> problem, and was not at all interested in "proper" programming.  So, I
>> guess my question is, how do we find that person who is going to write
>> the "well respected" program and convince him/her to take time out and
>> learn proper programming first? Because we are certainly not going to
>> convince everybody to do it.
> 
> You might find this reference interesting:
> 
> Basili, Victor R., et al. "Understanding the
> High-Performance-Computing Community." (2008).
> 
> I found it from the Joppa article:
> http://blog.nipy.org/science-joins-software.html
> 
> The take-home message seems to be - "we tell scientists to use our
> fancy stuff, they tell us no, and now we realize they were often
> right".
> 
> That article is about high-level programming tools, but it must be
> entirely different for version control, testing, code review, in
> particular.  I believe these tools are very fundamental in controlling
> error.
> 
> The point about error is the central one, for me.  As I proceed further
> down my scientific career, I slowly begin to realize the number of
> errors we make, and how easy we find it to miss them:
> 
> http://blog.nipy.org/unscientific-programming.html
> 
> That, for me, is the key argument - we will make fewer mistakes and do
> better science if we use the basic tools to help us control error and
> to help others find our errors.
> 
> Most scientists (myself included) tend to believe this error is not
> very important.
> 
> I believe that is wrong, but as scientists we don't believe
> everything we think, and so we need data.  I wonder how we should get
> it...
> 
> Cheers,
> 
> Matthew




