Inefficiency of __getattr__

Alex Martelli aleaxit at yahoo.com
Thu Oct 5 04:10:55 EDT 2000


"Kragen Sitaker" <kragen at dnaco.net> wrote in message
news:m2UC5.3538$q9.167352 at news-west.usenetserver.com...
> In article <8rf3fm030n4 at news2.newsguy.com>,
> Alex Martelli <aleaxit at yahoo.com> wrote:
> >"Kragen Sitaker" <kragen at dnaco.net> wrote in message
> >news:NpzC5.1605$l52.81889 at news-west.usenetserver.com...
> >    [snip]
> >> So what's the advantage of static typing over dynamic typing?  Does it
> >> help you catch bugs, and if so, which ones?
> >
> >It helps catch some type-related bugs -- particularly in areas
> >of the code that it's hard to ensure are wholly tested because
> >they deal with rare/exceptional/abstruse conditions.  Even for
> >bugs one would catch anyway in testing, it does help by letting
> >you do the "catching" earlier -- costs are lower the earlier
> >you catch the bugs.
>
> I run my regression tests after the compile.  For functional stuff, it
> tends to be pretty fast --- a second or less per thousand lines of
> code.  My web stuff is slower.  That's not much of an earliness
> advantage ;)

Aggressive, extensive, pervasive unit-testing and regression testing
is good.  Just don't try to make it into a "magic bullet" that makes
other error-catching helpers irrelevant!

You may be blessed with only writing units/components that can be
effectively and exhaustively tested in a few seconds.  That is very,
very far from applying to all of us.  When a component's job in life
is extracting information from some intrinsically huge and complex
database (or, even worse, updating said DB -- in which case every
test needs to start from a known/restored DB state, and reloading/
restoring the DB can itself be highly time-consuming), for example,
testing gets intrinsically slow.  Similar considerations apply when
a component performs very exhaustive computations (three-dimensional
modeling has many such tasks), or is user-interaction-heavy (one of
the worst situations, since you can't easily run batch tests).

Sure, one tries to model things on a reduced scale for test purposes,
but even with a thorough design-for-testability strategy (lots and
lots of test-harnesses, stubs and skeletons and drivers, simulated
DB interaction and UI interaction, etc, as well as a lot of attention
paid to potential execution paths' interactions), there is no real
"magic bullet".  When you have, say, ten significant decision points
in the component (where different execution paths are possible),
and an average branching factor of four at each point, you need over
a million passes for exhaustive coverage, even assuming your design
ensured each combination of execution paths can be forced by the
test-harness; that can easily mean an hour's elapsed time even for
rather speedy single passes.  And please note I'm not describing a
component that is really *complicated* at all!  These combinatorial
explosions kill you even on "highly-atomized" component-design.
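
To make the arithmetic concrete, here is a quick sketch in Python.
The per-pass time of 3.5 milliseconds is purely an assumed figure,
picked for illustration; it is not a measurement of anything.

```python
# Back-of-the-envelope count of execution-path combinations.
decision_points = 10     # significant decision points in the component
branching_factor = 4     # average number of alternatives at each point

# One pass per combination of choices: 4**10 = 1048576.
passes = branching_factor ** decision_points

# ASSUMPTION: cost of one "rather speedy" pass, an illustrative guess.
seconds_per_pass = 0.0035

total_minutes = passes * seconds_per_pass / 60
print(passes, "passes, roughly", round(total_minutes), "minutes")
```

Adding just one more decision point with the same branching factor
multiplies the total by four again.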

So, one compromises, and only runs tests through some subset of the
possible combinations of execution paths.  But then, inevitably,
some potential errors are left uncaught during development, and may
be caught only (e.g.) when running more exhaustive tests at night,
or on weekends, or, sometimes, much later.

Explicit assertions of type, just as any other explicit assertion
that it may be possible for the compiler to check statically, help
you catch some fraction of these nasty little bugs much earlier.
The "earliness advantage" can thus become quite significant.

It's not a panacea.  It IS a help.  I would add that its helpfulness
is rather obvious, at least to anybody who has actually developed
a significant amount of code in languages both with and without some
measure of explicit-type-assertions; _denying_ it is just as extreme
a position as making a fetish out of it (as some proponents of
static typing at all costs do!).  One should be aware of both its
benefits and costs, in order to make a serious engineering tradeoff
about it in a language (or, if using a language allowing optional
type-annotation, in a specific program's coding).
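
As a small illustration of the kind of bug I mean, consider the
sketch below.  It assumes a Python that allows optional type
annotations on arguments, plus some external checker that reads
them; the function describe_balance and its error path are made up
for the example and come from no real code.

```python
def describe_balance(balance: float, overdraft_limit: float) -> str:
    """Hypothetical function with a rarely-taken error branch."""
    if balance >= -overdraft_limit:
        # The common path: every ordinary test exercises this.
        return "balance ok: %.2f" % balance
    # The rare path: str + float is a type error, but it only shows
    # up at run time if a test actually drives the balance this low.
    return "over limit by " + (-overdraft_limit - balance)


print(describe_balance(10.0, 100.0))       # common path works fine

try:
    describe_balance(-500.0, 100.0)        # rare path
except TypeError as exc:
    print("found only when the rare branch runs:", exc)
```

A checker that understands the annotations could be expected to
flag the last return statement without running any test at all,
which is exactly the "earliness advantage" at issue.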


> >It also helps your compiler generate better code.  And, your
> >source can be more expressive, as it states certain things that
> >you know about your code, in a precise and unambiguous way.
>
> I agree with generating better code; static typing definitely makes
> writing good compilers easier.

Yes, although it can be interesting to see what can be done for
a language without explicit type assertion, just by a very clever
compiler.  I'm only aware of Lisp-ish languages, particularly
Scheme, being subject to such treatment; a particularly interesting
thing to study is Bigloo, which is able to compile both Scheme
(without explicit/static typing) and Caml (a language largely
designed expressly to allow type-inference, and based on static
typing just-about-exclusively).  I believe Bigloo's code quality
(at least on toy-sized examples) is quite comparable for both.

But, of course, stating something outright does make it easier
even if one "should" be able to deduce/infer it.  It surely does
for the program's human reader; and, for a compiler, it does
at least under separate-compilation rather than whole-program
analysis.


> It *does* state certain things that you know about your code in a
> precise and unambiguous way.  The problem is that they are things that
> are minimally related to the problem at hand.  It makes your source

Not at all.  The *abstract* typing of (e.g.) "the arguments I can
accept" is very much "related to the problem at hand" -- it says a
lot about how I have analyzed the situation and proceeded with
designing my solution (particularly when typing is taken to include
design-by-contract issues).

> code "more expressive" in the same way that assembler is "more
> expressive" than C: assembler is five times as long, and precisely and
> unambiguously describes which registers you are using and what order
> you are computing operations in.

Not at all: you're confusing implementation with design.  Assembler
forces me to detail the implementation, and thereby may well obscure
the design.  A higher-level language, by letting me hide implementation
details, is thus more expressive of design-aspects.

But the higher-level, more abstract language may be lacking in not
letting me express, "declaratively" rather than "imperatively",
certain things I do know about my problem (my analysis of it, my
design about it) and which it might be helpful to share with human
readers and with the compiler.  Natural language (in comments,
docstrings, etc) is no real substitute for an expressive language,
as the latter is unambiguous, doesn't "fall out of sync" with the
way things really are, etc.

Consider, for example, the way Python's excellent docs often
mention that "a sequence" is needed in a certain position.  At
times they say "a list" and it's not really true (any more), as
any sequence is in fact accepted.  But even when they say "a
sequence", it's often not clear at all what they mean.  When
talking about, e.g., the 'for' statement, they imply an object
that will respond to __getitem__ for increasing natural numbers and
normally raise an IndexError when 'exhausted'; but in other
contexts, they ALSO want the object to have a __len__ method
they can call.  E.g.:

>>> class numbers:
 def __getitem__(self, i):
  return i
>>> for i, c in zip(numbers(), "peep"):
 print i,c


0 p
1 e
2 e
3 p
>>> for i, c in map(None, numbers(), "peep"):
 print i,c


Traceback (innermost last):
  File "<pyshell#14>", line 1, in ?
    for i, c in map(None, numbers(), "peep"):
AttributeError: 'numbers' instance has no attribute '__len__'
>>>
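
The same two levels of requirement can be sketched without leaning
on how one particular release implements zip and map.  This is a
minimal sketch, assuming a Python 3 interpreter; the class is
renamed Numbers here and is otherwise the one from the session
above.

```python
class Numbers:
    """Endless 'sequence': supports indexing but has no __len__."""
    def __getitem__(self, i):
        return i


# Iteration falls back on __getitem__ with 0, 1, 2, ... when the
# object defines no __iter__, so pairing needs no length at all.
# zip stops as soon as "peep" runs out.
pairs = list(zip(Numbers(), "peep"))
print(pairs)

# Anything that asks for the length makes the stricter demand.
try:
    len(Numbers())
except TypeError as exc:
    print("no length:", exc)
```

The first use needs only indexing; the second needs a __len__ as
well, and nothing in the word "sequence" says which one applies.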


Can you deduce this difference in behaviour between zip and
map from just reading the docs?  Maybe, but it's going to be
a useless burden on you.  Why not have a language that lets
the designer of zip and map *state outright* the typing
characteristics he or she certainly knows about these two
functions?  Having an abstract-interface (or typeclass, or
whatever you want to call it) "Sequence", which only needs
the __getitem__ part; a derived interface "Length
Limited Sequence", which adds the constraint of having a
__len__; and a way in the language to annotate the args
(optionally) with the needed abstract-interface; would
make it all smooth and wonderful!
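
Here is a rough sketch of what I have in mind, assuming a Python
that offers typing.Protocol (3.8 or later).  The names Sequence and
LengthLimitedSequence are the hypothetical interfaces just
described, defined locally (this is not collections.abc.Sequence),
and pair_up and pad_and_pair are invented stand-ins for the
zip-like and map-like behaviours, not the real builtins.

```python
from typing import Any, List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class Sequence(Protocol):
    """Hypothetical interface: only indexing is required."""
    def __getitem__(self, index: int) -> Any: ...


@runtime_checkable
class LengthLimitedSequence(Sequence, Protocol):
    """Hypothetical derived interface: adds the __len__ constraint."""
    def __len__(self) -> int: ...


class Numbers:
    def __getitem__(self, index: int) -> int:
        return index


def pair_up(first: Sequence, second: Sequence) -> List[Tuple[Any, Any]]:
    """zip-like: index both until either raises IndexError."""
    result = []
    i = 0
    while True:
        try:
            result.append((first[i], second[i]))
        except IndexError:
            return result
        i += 1


def pad_and_pair(first: LengthLimitedSequence,
                 second: LengthLimitedSequence) -> List[Tuple[Any, Any]]:
    """map(None, ...)-like: needs lengths to pad the shorter with None."""
    longest = max(len(first), len(second))

    def item(seq: LengthLimitedSequence, i: int) -> Any:
        return seq[i] if i < len(seq) else None

    return [(item(first, i), item(second, i)) for i in range(longest)]


print(isinstance(Numbers(), Sequence))               # has __getitem__
print(isinstance(Numbers(), LengthLimitedSequence))  # lacks __len__
print(pair_up(Numbers(), "peep"))
print(pad_and_pair("ab", "peep"))
```

With the requirement stated in the signature, a reader (or a
checker) can see at a glance that Numbers is acceptable to pair_up
but not to pad_and_pair, with no need to dig through the docs.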


> But you have convinced me that it is useful, at least, to describe what
> interface you require of your parameters.  STL seems to have gone
> furthest in this direction.

Not really, IMHO, since STL has had to use natural language
(docs/comments/etc) for this purpose.  I believe the Haskell 98
standard (particularly the part about libraries) may be a better
example.  Haskell's typeclasses and generic-polymorphism are a
very elegant and suitable approach for this, again IMHO.  The
language itself may not go far enough for many uses -- it has
no 'dynamic' typing at all, just the generic polymorphism.  But
the mechanisms and approaches it uses for the static/generic
part are well worth studying, I think.


Alex





