"Strong typing vs. strong testing"

Thu Sep 30 15:45:46 EDT 2010

In article <7xr5gbxfry.fsf at ruckus.brouhaha.com>,
 Paul Rubin <no.email at nospam.invalid> wrote:

> RG <rNOSPAMon at flownet.com> writes:
> > Yes, I know I could have used lint.  But that misses the point.  For any 
> > static analyzer, because of the halting problem, I can construct a 
> > program that either contains an error that the analyzer will not catch, 
> > or for which the analyzer will produce a false positive.
> 
> Can you describe any plausible real-world programs where the effort of
> complicated static is justified, and for which the halting problem gets
> in the way of analysis?  By "real world", I meanI wouldn't consider
> searching for counterexamples to the Collatz conjecture to qualify as
> sufficiently real-world and sufficiently complex for fancy static
> analysis.  And even if it did, the static analyzer could deliver a
> partial result, like "this function either returns a counterexample to
> the Collatz conjecture or else it doesn't return".  
> 
> D. Turner wrote a famous paper arguing something like the above, saying
> basically that Turing completeness of programming languages is
> overrated:
> 
>   http://www.jucs.org/jucs_10_7/total_functional_programming
> 
> The main example of a sensible program that can't be written in a
> non-complete language is an interpreter for a Turing-complete language.
> But presumably a high-assurance application should never contain such a
> thing, since the interpreted programs themselves then wouldn't have
> static assurance.

There are only two possibilities: either you have a finite-state 
machine, or you have a Turning machine.  (Well, OK, you could have a 
pushdown automaton, but there are no programming languages that model a 
PDA.  Well, OK, there's Forth, but AFAIK there are no static type 
checkers for Forth.  Besides, who uses Forth? ;-)

If you have a finite state machine everything is trivial.  If you have a 
Turing machine everything is generally impossible.  This is an 
oversimplification but not far from the fundamental underlying truth.

My favorite practical example is the square root function.  The standard 
C library defines a square root function on floats (actually on 
doubles), which is to say, over a finite-state model with 2^64 states.  
The model is not well defined over half of that range (negative 
numbers), which the static type checker cannot catch because there is no 
such thing as an unsigned double.

But the fun doesn't stop there.  Doubles >= 0.0 are not the only thing 
one might reasonably want to take a square root of, and indeed C++ 
overloads sqrt to work on complex and valarray types in addition to 
floats of various lengths (though you still have to manually keep track 
of whether or not the argument to sqrt might be a negative real).  But 
what if I want an exact integer square root?  Or a square root of a data 
type that represents irrational numbers not as floating point 
approximations but as exact symbolic representations?  I haven't worked 
out the details, but I'd be surprised if none of these variations turned 
out to be Turing complete.

The Turner paper is right on point: there's a fundamental distinction 
between the (known) finite and the (potentially) infinite.  In my 
experience most of the cool interesting stuff has been found in the 
latter domain, and trying to shoehorn the latter into the former is more 
trouble then it's worth.

rg