[Tutor] Machine Vs. Human Parsing (Was: XML: Expletive Deleted) (Way OT!)

Carroll, Barry Barry.Carroll at psc.com
Fri Jun 16 19:33:37 CEST 2006


Greetings:

> -----Original Message-----
> Date: Fri, 16 Jun 2006 00:05:41 +0100
> From: "Alan Gauld" <alan.gauld at btinternet.com>
> Subject: Re: [Tutor] XML: Expletive Deleted (OT)
> To: tutor at python.org
> Message-ID: <e6sp4f$5df$1 at sea.gmane.org>
> 
> Just picked this up after being out for most of the week...
> 
> "Carroll, Barry" <Barry.Carroll at psc.com> wrote in message
> 
> > One reason for choosing a human-readable format is the desire to
> > visually confirm the correctness of the stored data and format.
> 
> That's a very dangerous assumption: how do you detect unprintable
> characters, tabs instead of spaces, trailing spaces on a line, etc.?
> While text representations are helpful, you should never rely on the
> human eye to validate a data file.
> 
> > can be invaluable when troubleshooting a bug involving stored data.
> > If
> > there is a tool between the user and the data, one must then rely
> > upon
> > the correctness of the tool to determine the correctness of the
> > data.
> 
> Or the correctness of the eye. I know which one I prefer - a tested
> tool.
> The human eye is not a data parser, but it flatters to deceive by being
> nearly good enough.
> 
> > In a case like this, nothing beats the evidence of one's eyes, IMHO.
> 
> Almost anything beats the human eye IME :-)
> Actually if you must use eyes do so on a hex dump of the file, that
> is usually reliable enough if you can read hex...
> 
<<snip>>
> 
> Alan g.

If I gave the impression that the human eye is the only useful means of
examining and verifying stored data, I apologize.  I intended to say
that the human eye, and the brain that goes with it, is an invaluable
tool in evaluating data.  I stand by that statement.  

The most sophisticated tool is only as good as the developer(s) who made
it.  Since software is ultimately written by humans, it is fallible.  It
contains mistakes, gaps, holes, imperfections.  When the tool gives bad,
or erroneous, or incomplete results, what do you do?  You look at the
data.  

A case in point.  I used to test audio subsystems on PC motherboards.
The codec vendor released a new version of their chip with new driver
software.  Suddenly our tests began to fail.  The vendor insisted their
HW and SW were correct.  The test owner insisted his SW was correct.
Somebody was mistaken.  

I used an audio editing program to display the waveform of the data in
the capture buffer.  The first several hundred samples were not a
waveform, but apparently random noise.  I now knew why the test was
failing, but why was there noise in the buffer? It wasn't there before.
The editing SW was no help there.  So I switched tools, and displayed
the capture buffer with a simple file dump program.  The first 2K of the
buffer was filled with text!
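
For anyone who has not used one, here is a minimal sketch of the kind of
simple file dump program I mean, written in Python since this is the
Tutor list.  The function name hex_dump and the sample bytes are made up
for illustration; this is not the tool or the data from that
investigation.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return an offset / hex / ASCII dump of data, one row per `width` bytes."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; every other byte becomes '.'
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical buffer: stray text followed by a few sample bytes
    sample = b"some stray text\x00\x01\x7f\x80\xff"
    print(hex_dump(sample))
```

Text sitting where samples should be stands out in the right-hand column
of a dump like this, while a waveform display can only draw it as noise.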

The story goes on, but that's enough to illustrate my point.  Neither
the audio driver, nor the test SW, nor the editing tool could show the
real problem.  All of them were 'tested tools', but when presented with
data they were not designed to handle, they produced incorrect or
incomplete results.  I could cite other examples from other disciplines,
but this one suffices:  no SW tool should be relied upon to be correct
in all cases.  

I trust my eyes to see things tools can't, not because they can detect
nonprintable characters in a hex dump (I can read hex dumps, and binary
when necessary, but I usually just print out '\t', '[SP]', etc.) but
because they are not bound by anyone else's rules as to what is correct
and incorrect.  The programmer's two most valuable tools are her/his
eyes and brain.  They are always useful, sometimes indispensable.  
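
As a rough sketch of what I mean by printing out '\t' and '[SP]', the
following Python function rewrites a line so that whitespace and
nonprintable characters become visible.  The name show_whitespace and
the exact markers are my own choices for illustration, not a standard
tool.

```python
def show_whitespace(line: str) -> str:
    """Return line with spaces, tabs and nonprintable characters made visible."""
    out = []
    for ch in line:
        if ch == " ":
            out.append("[SP]")
        elif ch == "\t":
            out.append("\\t")
        elif ch.isprintable():
            out.append(ch)
        elif ord(ch) < 0x100:
            # Control characters and other nonprintable single-byte values
            out.append(f"\\x{ord(ch):02x}")
        else:
            # Nonprintable characters outside Latin-1, e.g. zero-width space
            out.append(f"\\u{ord(ch):04x}")
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical line with a tab, a trailing space and a NUL byte
    print(show_whitespace("key\tvalue \x00"))
```

The tool only makes the characters visible; deciding whether a tab or a
trailing space is actually wrong is still up to the person reading the
output.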

Regards,
 
Barry
barry.carroll at psc.com
541-302-1107
________________________
We who cut mere stones must always be envisioning cathedrals.

-Quarry worker's creed




