[Tutor] the art of testing

Wed Nov 25 04:41:32 CET 2009

Serdar Tumgoren wrote:
>> That's a good start.  You're missing one requirement that I think needs to
>> be explicit.  Presumably you're requiring that the XML be well-formed.  This
>> refers to things like matching <xxx>  and </xxx> nodes, and proper use of
>> quotes and escaping within strings.  Most DOM parsers won't even give you a
>> tree if the file isn't well-formed.
>>
>
> I actually hadn't been checking for well-formedness on the assumption
> that ElementTree's parse method did that behind the scenes. Is that
> not correct?
>
> (I didn't see any specifics on that subject in the docs:
> http://docs.python.org/library/xml.etree.elementtree.html)
>
I would also assume that ElementTree does the check.  But the point is
that well-formedness is part of the spec, so it needs to be handled
explicitly in your list of errors:
     file xxxxyyy.xml  was rejected because .....

I am not saying you need to test for it separately in your validator,
but it is effectively the second test you'll be doing.  (The first is
that the file exists and is readable.)
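
To make those first two tests concrete, here is a rough sketch.  The
name check_file and the exact wording of the messages are my own
inventions for illustration, not part of your spec.  It also assumes a
Python recent enough that ElementTree raises ParseError on malformed
input; older versions raise an expat error instead, so adjust to taste.

```python
import xml.etree.ElementTree as ET


def check_file(path):
    """Return (root, None) on success, or (None, reason) on rejection.

    check_file is a hypothetical helper, not an ElementTree function.
    """
    # Test 1: the file exists and is readable.
    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except (IOError, OSError) as exc:
        return None, "file %s was rejected because it could not be read: %s" % (
            path, exc)

    # Test 2: the XML is well-formed.  The parser does the real work;
    # we only turn its exception into an entry for the error list.
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        return None, "file %s was rejected because it is not well-formed: %s" % (
            path, exc)

    return root, None
```

The validator never inspects the markup itself.  It just makes sure a
parser failure ends up in the list of errors rather than as a traceback.
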
>> But most importantly, you can divide the rules into two kinds: "if the
>> data looks like XXXX, the file is rejected" versus "if the data looks
>> like YYYY, we'll pretend it's actually ZZZZ and keep going."  An example
>> of the latter might be what to do if somebody specifies March 35.  You
>> might just pretend it's March 31, and keep going.
>>
>
> Ok, so if I'm understanding -- I should convert invalid data to
> sensible defaults where possible (like setting blank fields to 0);
> otherwise if the data is clearly invalid and the default is
> unknowable, I should flag the field for editing, deletion or some
> other type of handling.
>

Exactly.  As you said in one of your other messages, human intervention
is required.  Then the humans may decide to modify the spec to reduce the
number of cases needing human intervention.  So I see the spec and the
validator as a matched pair that will evolve.
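
Here is one way the two kinds of rules might look in code.  This is
only a sketch: the names Rejected, clean_count and clean_day are made
up, and which cases get a default and which go to a human is for your
spec to decide, not me.

```python
import calendar


class Rejected(Exception):
    """Invalid value with no knowable default; a human has to look at it."""


def clean_count(text):
    # "Pretend and keep going" rule: a blank field is treated as 0.
    if text is None or text.strip() == "":
        return 0
    try:
        return int(text)
    except ValueError:
        # "Reject" rule: no sensible default exists for this value.
        raise Rejected("count %r is not a number" % (text,))


def clean_day(year, month, day):
    # "Reject" rule: we cannot guess which month or day was meant.
    if not 1 <= month <= 12:
        raise Rejected("month %r is out of range" % (month,))
    if day < 1:
        raise Rejected("day %r is out of range" % (day,))
    # "Pretend and keep going" rule: a day past the end of the month
    # is clamped to the last day, so March 35 becomes March 31.
    last_day = calendar.monthrange(year, month)[1]
    return min(day, last_day)
```

Each raise corresponds to a case needing human intervention.  When the
spec changes to supply a default, the raise becomes a return, which is
what I mean by the pair evolving together.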

Note that none of this says anything about testing your code.  You'll 
need a controlled suite of test data to help with that.  The word "test" 
is heavily overloaded (and heavily underdone) in our industry.
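
By a controlled suite I mean something like the sketch below: a fixed
set of inputs, each paired with the outcome the spec demands, run
against the validator every time it changes.  The function
is_well_formed is a stand-in I made up so the example runs by itself.
You would call your real validator there, and your cases would come
from your own spec.

```python
import unittest
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    # Hypothetical stand-in for the validator under test.
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Controlled test data: (description, input, expected result).
CASES = [
    ("matching tags", "<a><b>1</b></a>", True),
    ("mismatched tags", "<a><b>1</a>", False),
    ("unquoted attribute", "<a x=1/>", False),
    ("unescaped ampersand", "<a>fish & chips</a>", False),
    ("escaped ampersand", "<a>fish &amp; chips</a>", True),
]


class WellFormedTests(unittest.TestCase):
    def test_cases(self):
        for description, xml_text, expected in CASES:
            # The description is shown if the assertion fails.
            self.assertEqual(is_well_formed(xml_text), expected, description)
```

Saved as a module, this runs with "python -m unittest" followed by the
module name.  Whenever the validator mishandles a real file, add that
input to the list so the same mistake cannot come back unnoticed.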

DaveA