Hardware take on software testing.
Donald 'Paddy' McCarthy
paddy3118 at blueyonder.co.ukNOTthisBIT
Sun Jun 8 12:17:23 EDT 2003
Tom Verbeure wrote:
>>We can find hard to get at bugs with this technique in HW design,
>>that's why I'd also like it added to the software arsenal.
>
>
> There are some major differences between hardware and software development
> that make directed random testing (DRT) much less efficient for software.
>
> Conceptually, a software program is basically one HUGE state machine where
> every variable in the data memory influences the progress of the state
> machine (even if you don't usually think of it that way.)
>
> In hardware, you also have a whole bunch of explicitly designed state
> machines that work in parallel.
>
> This technique is not very useful for software, especially if it's single
> threaded: if the software isn't interactive, there isn't much chance of
> problems due to arbitration between parallel events, since there aren't any.
> :-) (A lot of DRTs are written to find arbitration errors between
> state machines.)
Random testing is also good at exploring a large space of input-state
combinations.
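For instance, a throwaway Python sketch (clip here is just a made-up
unit under test, not anything real):

import random

def clip(x, lo, hi):
    # Hypothetical unit under test: clamp x into the range [lo, hi].
    return max(lo, min(x, hi))

for trial in range(10000):
    # Draw each input independently; random combinations soon hit
    # corner cases (lo > hi, x on a boundary, negatives) that a short
    # list of hand-picked directed tests can easily miss.
    x, lo, hi = [random.randint(-10, 10) for _ in range(3)]
    out = clip(x, lo, hi)
    if lo <= hi:
        assert lo <= out <= hi, (x, lo, hi, out)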
>
> In addition, to make DRT really useful, you need a golden model to verify
> that the results generated from randomly generated data are correct. This
> means you need to write a certain piece of software twice, with exactly the
> same higher-level behaviour.
> How would you verify, say, a Python interpreter with random testing? By
> writing one in C and another one in Java, throwing random instructions at
> them and checking the behaviour, right? Well, you just created 2 highly
> complex multi-man-year projects. :-)
I know this is done when developing assemblers together with the RTL
for a CPU. You set up a constrained random generator of assembly code.
The constraints might, for example, ensure that the same register is
never used twice in a MUL instruction, or that division by zero is
generated only very rarely.
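The generator itself needn't be clever to be useful. A rough Python
sketch of the kind of constrained generation I mean, over an entirely
made-up toy ISA:

import random

REGS = ['r0', 'r1', 'r2', 'r3']

def rand_instruction():
    # Emit one line of assembly for a made-up toy ISA, applying the
    # kinds of constraints mentioned above.
    op = random.choice(['ADD', 'MUL', 'DIV'])
    dst = random.choice(REGS)
    if op == 'MUL':
        # Constraint: the two source registers must differ.
        src1, src2 = random.sample(REGS, 2)
    elif op == 'DIV':
        src1 = random.choice(REGS)
        # Constraint: divide by zero only very rarely.
        src2 = '#0' if random.random() < 0.01 else '#%d' % random.randint(1, 255)
    else:
        src1, src2 = random.choice(REGS), random.choice(REGS)
    return '%s %s, %s, %s' % (op, dst, src1, src2)

# A 20-instruction random test program.
print('\n'.join(rand_instruction() for _ in range(20)))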
In Python you could randomly generate valid programs, where the
constraints ensure, for example, that a set of known generators of
exceptions are always trapped.
You could then leave the generator to produce thousands of Python
statements that should not throw an untrapped exception. If one is
thrown, then either there is a problem in the interpreter or the
generator needs further constraining. Over time, the random program
generator would get better at generating correct Python programs, and
the Python interpreter would get less buggy.
Yes, the random generator would evolve into a complex program in
itself, but you could be finding bugs along the way even with the
partially constrained version.
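Crudely, the harness could look something like this (a minimal sketch
only; here the known exception generator, division by zero, is avoided
with a guard rather than trapped, and a real harness would generate far
richer statements):

import random

def rand_expr(depth=0):
    # Random integer expression, depth-limited so statements stay small.
    if depth >= 3 or random.random() < 0.4:
        return str(random.randint(-9, 9))
    op = random.choice(['+', '-', '*', '/'])
    left, right = rand_expr(depth + 1), rand_expr(depth + 1)
    if op == '/':
        # Constraint: division is a known generator of exceptions, so
        # guard the divisor; the statement should then never raise.
        right = '((%s) or 1)' % right
    return '(%s %s %s)' % (left, op, right)

failures = 0
for trial in range(10000):
    stmt = 'result = %s' % rand_expr()
    try:
        exec(stmt)   # feed the statement to the interpreter under test
    except Exception as e:
        # Either an interpreter bug or a missing constraint in the
        # generator; each failure tightens one or the other.
        failures += 1
        print('%s raised %r' % (stmt, e))
print('%d untrapped exceptions in 10000 statements' % failures)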
>
> Due to the huge up front costs and turn around time, it makes sense for
> hardware to make a golden model.
>
> In other words, for software, it's not cost effective, since it's usually
> quite easy to make bugfixes in later revisions and tell a customer not to
> use a particular feature for a while or quickly ship a patch. It's much
> harder to tell the user of an IP router chip to avoid packets that contain a
> certain sequence of bytes and to wait 6 months for a new revision of the
> chip that can then be replaced on the PCB.
Can't argue against that. Every second, someone opens a shrink-wrapped
CD of software, passing over the license that says 'Oh, by the way, if
it don't work correctly: tough-titty!', and then goes on with 'We would
like you to pay twice for the features that didn't work by paying for
the (so-called) upgrade'.
As long as we dopes keep buying it, software developers have no
incentive to improve.
>
>
>>I don't doubt that XP and TDD can be an advance on previous methods,
>>I'm just wary of the lack of metrics. And I'm unsure of how you
>>alleviate the problem of only testing how one person or close team
>>think it should work.
>>Is TDD enough?
>
>
> Apparently it is good enough for most people, since virtually nobody will go
> farther than that.
>
> Tom