detecting containers during object introspection

Tue Jul 22 00:00:35 EDT 2003

Steven Taschuk wrote:

> Quoth David C. Fox:
> 
>>Is there a reasonable way to tell whether a variable is a container 
>>(either a mapping or a sequence) and whether it is a mapping or a 
>>sequence?  [...]
> 
> 
> Depends what you mean by "reasonable".
> 
> In the usual scenario, there's a specific facility which you want
> to make use of, and you can either (a) just try to use it, and
> catch the relevant exception (called Easier to Ask Forgiveness
> than Permission, or EAFP), or (b) do some kind of check beforehand
> for whether the facility exists (called Look Before You Leap, or
> LBYL).
> 
> Your example of trying iter(foo) and catching the possible
> TypeError is an instance of the first approach (EAFP).
> 
>   [...]
> 
>>Wouldn't it be helpful to have an abstract base class Sequence which 
>>didn't add any actual attributes or methods, but which someone writing a 
>>sequence class could include as a base class, as a sort of unenforced 
>>promise that the sequence operators were supported?
> 
> 
> Consider a DNS name -> address mapping whose __getitem__ actually
> queries the DNS.  (Ignore the fact that this could be better done
> with a function -- assume we have a function which expects a
> dict-like object and we want to use it on the DNS.)  Such an
> object could not reasonably implement __len__.  Should it falsely
> imply that it does (by inheriting from an abstract class Mapping)
> or falsely imply that it's not a mapping (by not inheriting from
> Mapping)?
> 
> Such cases -- those in which for conceptual or pragmatic reasons
> it is infeasible or undesirable to implement all of an interface
> -- are not uncommon.

Thanks, that makes sense. I'll just wrap the actual operations I use in
try...except.
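
For the record, here is a rough sketch of what that might look like. The
function name classify and the decision to treat strings as atomic values
are my own assumptions, and the syntax is modern Python 3 rather than
anything from this thread. Note that the second test accepts any iterable,
so sets and generators are reported as sequences too.

```python
def classify(obj):
    """Guess whether obj is a mapping, a sequence, or neither (EAFP style)."""
    # Assumption: strings are iterable, but we want them treated as leaves.
    if isinstance(obj, (str, bytes)):
        return "other"
    try:
        obj.keys()          # the mapping operation we actually rely on
    except AttributeError:
        pass
    else:
        return "mapping"
    try:
        iter(obj)           # raises TypeError if obj is not iterable
    except TypeError:
        return "other"
    return "sequence"


print(classify({"a": 1}))   # mapping
print(classify([1, 2, 3]))  # sequence
print(classify(42))         # other
```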

>>Why am I asking this question?
>>
>>I'm trying to write a regression test for a class which pickles 
>>dictionaries of attributes to store and retrieve instances.  The 
>>dictionary includes a version number, and the class has a system for 
>>updating out-of-date instances to the current version.
>>
>>As part of the test, I need to be able to compare the structure of the 
>>two dictionaries to see if the developer has modified their structure 
>>but forgotten to increment the current version.  Values in the 
>>dictionary may be unknown objects, in which case I just want to compare 
>>their types.  However, the values may also be sequences or mappings 
>>themselves, in which case I want to recursively compare their 
>>elements/values.
> 
> 
> Interesting.  So you have an example of a standard dictionary for
> a given version number, and you want to compare the structure of
> the actual dictionary to the standard example dictionary.  Is that
> right?

Yes.  The particular scenario I'm most worried about is if a developer 
makes a change to the structure of the dictionary, but fails to 
increment the current version number.  In that case, users with existing 
stored dictionaries who update to a new version of the program will end 
up with incorrectly constructed objects, and are likely to have all 
sorts of strange and hard-to-debug problems.
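
To make the comparison concrete, something along these lines is what I
have in mind. It is only a sketch: the name same_structure is made up, it
recognises only real dicts, lists and tuples, and it assumes that sequence
lengths are part of the structure, which may be too strict for lists that
legitimately grow.

```python
def same_structure(a, b):
    """Return True if a and b have the same nested shape and leaf types."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Same set of keys, and matching structure under each key.
        if a.keys() != b.keys():
            return False
        return all(same_structure(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        # Assumption: container type and length are part of the structure.
        if type(a) is not type(b) or len(a) != len(b):
            return False
        return all(same_structure(x, y) for x, y in zip(a, b))
    # Unknown objects: compare types only.
    return type(a) is type(b)


standard = {"version": 2, "name": "x", "tags": ["a", "b"], "opts": {"depth": 1}}
stored = {"version": 2, "name": "y", "tags": ["c", "d"], "opts": {"depth": 5}}
changed = {"version": 2, "name": "y", "tags": ["c", "d"], "opts": {"depth": "5"}}

print(same_structure(standard, stored))   # True
print(same_structure(standard, changed))  # False
```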

> Are your developers actually using dict-like and list-like objects
> instead of real dicts and lists?  If not, isn't the type-check for
> unknown objects enough?  (If they do start using work-alikes
> instead of the real thing, the test will fail, but YAGNI, right?)

Yes to the first*, and yes (good points) to the second and third.

[*well, not exactly - I recently replaced a dictionary with a trie data 
structure which effectively maps from sequences of strings to values, 
but it doesn't currently use keys() or __getitem__(self, key).]

However, the real problem occurs if the developer makes a change like 
this (or even just adds a new attribute which is a non-standard 
container), and does increment the version number.  Because the version 
number was incremented, the regression test would *expect* some of the 
dictionary elements to change type or structure, and would simply update 
the standard example dictionary.  Then, any *subsequent* changes to the 
structure of the values in that container would go undetected (unless 
the developer had also updated the recursive comparison operation to 
take into account that the unknown object was a container).
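
A small illustration of that blind spot, with a hypothetical PrefixStore
class standing in for something like my trie (the class and the
types_match helper are invented for this example). Because the comparison
sees an unknown object, it checks only the type, and a change to what is
stored inside goes unnoticed.

```python
class PrefixStore:
    """Hypothetical custom container with no keys() or __getitem__."""

    def __init__(self):
        self._root = {}

    def insert(self, path, value):
        node = self._root
        for part in path:
            node = node.setdefault(part, {})
        node[None] = value  # None marks the value stored at this node


def types_match(a, b):
    # What the regression test falls back to for unknown objects.
    return type(a) is type(b)


old = PrefixStore()
old.insert(("a", "b"), 1)

new = PrefixStore()
new.insert(("a", "b"), "one")  # stored value changed from int to str

print(types_match(old, new))  # True, although the contents differ
```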

But this scenario requires several mistakes, so YAGNI is probably still 
good advice.  Besides, since there isn't any way to detect containers 
that don't use the standard methods, I guess the automated regression 
testing approach will never be foolproof, and we'll have to fall back on:

(1) giving stern instructions (and good documentation) to our 
developers, and

(2) making sure a developer who is aware of these issues looks over 
their shoulders and their code.  Yet another responsibility - just what 
I need.  Not to mention the sore neck I'm going to get looking over my 
own shoulder ;-)

David