variable initialization and loop forms (was Re: why no "do : until"?)

Alex Martelli aleaxit at yahoo.com
Wed Jan 3 07:40:38 EST 2001


"Steve Lamb" <grey at despair.rpglink.com> wrote in message
news:slrn94shbe.150.grey at teleute.rpglink.com...
> On Sat, 30 Dec 2000 18:56:12 GMT, Fredrik Lundh <fredrik at effbot.org> wrote:
> >from what world do you come from where initializing variables to arbitrary
> >values just to get your program flow right is a good thing?
>
>     I have never heard of a single programming book or course that says, "Oh,
> don't init variables."  In this case I feel you are squarely in the wrong
> with, oh, the entire CS community.

Initializing variables *to correct values* is a great thing -- but using
*arbitrary* values for the initialization isn't optimal style.

Considerations vary among languages; but, in Python, by just assigning some
arbitrary value to a variable whose 'correct' initial value is unknown, you
lose an important runtime error indication -- the exception (e.g.
UnboundLocalError, or NameError for a global) that you get if you
inappropriately use the uninitialized variable disappears (and you get
subtler bugs) if you've just "assigned any old value" to it, for purposes
unrelated to your actual application's semantics.
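
To make the lost error indication concrete, here is a minimal sketch.  The
two function names are invented for illustration and come from nowhere in
this thread; the only assumption is a Python recent enough to have
UnboundLocalError:

```python
def latest_reading_unbound(samples):
    # T gets bound only if at least one sample actually arrives.
    for sample in samples:
        T = sample
    return T  # raises UnboundLocalError when samples is empty


def latest_reading_arbitrary(samples):
    T = 0.0  # arbitrary initial value, unrelated to any real measurement
    for sample in samples:
        T = sample
    return T  # quietly returns the made-up 0.0 when samples is empty


try:
    latest_reading_unbound([])
except UnboundLocalError as err:
    print("caught:", err)

# No exception here: the arbitrary value flows on as if it were a measurement.
print(latest_reading_arbitrary([]))
```

The first function fails loudly at the exact point of misuse; the second
hands a fictitious temperature to whoever called it.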


Consider a slight abstraction of the example under discussion:

# wait until warm enough
while 1:
    T = getCurrentTemperature()
    if T > threshold: break

Here, T satisfies a very simple invariant: IF it's set at all, then
it is set to the latest temperature value as returned from the function
whose job is to measure "current temperature".

If our stylistic qualms convince us to code this differently:

# wait until warm enough
T = threshold
while T <= threshold:
    T = getCurrentTemperature()

then the invariant becomes less simple, and trickier: T is _either_ set
to 'latest measurement', OR _in an arbitrary way_ to 'threshold'; further,
there is no easy way to distinguish, if T is indeed set to "threshold",
whether this is a "true" measurement, or just our arbitrary initial
setting for it (while, in the previous form, it was very easy to use
try/except if such a test was needed).

This matters!  For example, the getCurrentTemperature function *might*
smooth actual point-measurements (which 'wobble') by averaging with
the previously-measured value; T (as a global variable) is usable for
that in the first form, but not in the second one.
If we want to show the temperature as it gets measured, again the
second form provides no way to ward against a wholly artificial initial
"spike", while the first form lets us use try/except.

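A sketch of the smoothing idea, under stated assumptions: everything is
wrapped in a hypothetical function (wait_until_warm, not from the original
post) so that it is self-contained, which means T is a variable of the
enclosing function standing in for the global; the readings and threshold
are made-up numbers.

```python
def wait_until_warm(raw_readings, threshold):
    readings = iter(raw_readings)

    def getCurrentTemperature():
        raw = next(readings)
        try:
            previous = T          # unbound until the loop below first assigns it
        except NameError:
            return raw            # very first measurement: nothing to average with
        return (previous + raw) / 2.0

    shown = []                    # temperatures "displayed" as they get measured
    while 1:
        T = getCurrentTemperature()
        shown.append(T)
        if T > threshold: break
    return shown


print(wait_until_warm([10.0, 14.0, 22.0], 15.0))
```

Because T is never given an artificial starting value, the try/except can
tell "no measurement yet" apart from every real reading, and nothing
spurious is averaged in or displayed.
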
But, there's more!  When we "busy-loop" this way, we may not want
to suck up 100% of the CPU while waiting for something as slow-moving
as temperature to adjust; the typical mod would be to sleep a while
after an unsatisfactory measurement.

The first form offers a perfectly satisfactory avenue to that:

# wait until warm enough
while 1:
    T = getCurrentTemperature()
    if T > threshold: break
    else: time.sleep(aWhile)

but the second one is more fragile -- we end up having to repeat
the termination-test, or using additional flags, etc:

# wait until warm enough
T = threshold
while T <= threshold:
    T = getCurrentTemperature()
    if T <= threshold: time.sleep(aWhile)

It's always unpleasant to repeat code -- because it implies a
commitment to keep several pieces of code 'in sync' when the
exact specification changes.  E.g., suppose that we determined
after some testing that we must deem it "warm enough" when T
is _equal_ to the threshold, not just _greater_.  The first form
shows its robustness: a small change in specs equates to a
correspondingly small change in code:

# wait until warm enough, take 2
while 1:
    T = getCurrentTemperature()
    if T >= threshold: break
    else: time.sleep(aWhile)

A 'greater than' becomes 'greater-equal' in the specs => the
exact same thing happens in the code; perfect!  THIS is the
kind of thing that gives Python its well-deserved fame as
"executable pseudo-code"!-)

But the second form is not as resilient to spec changes:

# wait until warm enough, take 2
T = threshold - epsilon
while T < threshold:
    T = getCurrentTemperature()
    if T < threshold: time.sleep(aWhile)

_TWO_ '<=' must become '<', *and* we need to change our
"arbitrary initialization" as well, accordingly.

Or suppose the simple test for 'warm enough' becomes
more complex -- enough to make it worth our while to
move it to a separate function, which takes all sorts
of other things into account to determine whether a
given temperature is "warm enough" (wind speed, % of
humidity, whatever).  Again, the first form shines:

# wait until warm enough, take 3
while 1:
    T = getCurrentTemperature()
    if warmEnough(T): break
    else: time.sleep(aWhile)

Isn't it wonderful?  Simple spec change -> identically
simple code change.  But, the second form...:

# wait until warm enough, take 3
T = ???
while not warmEnough(T):
    T = getCurrentTemperature()
    if not warmEnough(T): time.sleep(aWhile)

*Eeeep*!  What a ROUT!  *What* shall we initialize T to,
now -- do we need to know the exact algorithm warmEnough
will use?!  And, the two separate calls to warmEnough with
some delay between them -- how do we assure ourselves that
their semantics are indeed what we want?  If some cruel
taskmaster forbade us the shining clarity of form 1,
we would no doubt have to introduce a temporary flag
variable at this point, and forget about initializing T
to some 'appropriate' (???) value...:

# wait until warm enough, take 3
mustLoopAgain = 1
while mustLoopAgain:
    T = getCurrentTemperature()
    mustLoopAgain = not warmEnough(T)
    if mustLoopAgain: time.sleep(aWhile)

Better, but nowhere near as simple and clear as form 1.  The
auxiliary flag, whose only reason for existence is to
avoid using a perfectly good Python idiom, is another
little piece of deadweight our code carries to no good
or justifiable purpose.
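
For what it's worth, form 1 also packages up neatly as a small helper.
This is only a sketch: wait_until, measure, good_enough and pause are
names invented here (nothing from this thread or from the standard
library), and the readings are made up.  In real use pause would wrap
something like time.sleep(aWhile).

```python
def wait_until(measure, good_enough, pause):
    """Form 1 as a helper: measure, test once, pause only after a miss."""
    while True:
        value = measure()
        if good_enough(value):
            return value
        pause()


readings = iter([3.0, 8.0, 12.0, 20.0])   # made-up measurements
pauses = []                               # records each pause instead of sleeping

T = wait_until(lambda: next(readings),
               lambda t: t >= 10.0,
               lambda: pauses.append(1))

print(T, len(pauses))
```

The test is written exactly once, no flag is needed, and T is only ever
bound to a genuine measurement.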


> >next you gonna tell us that it's better to use flags than to use break and/or
> >exceptions...
>
>     No, but I don't think that when you can place the test into the while that
> one should use a while 1:break construct.

*IF AND WHEN* there are NO costs in testing in the while
statement, sure.  But often there ARE some costs: when a
loop WANTS to be an "N-and-a-half times loop", the general
form, then coding it as such (which in Python is expressed
by while 1/...break) produces simpler and therefore better
code than making believe the idiom isn't available.  And it's
often the case that a loop that appears to be a do/until one
now will actually generalize to N-and-a-half with the smallest
spec change.
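
A tiny illustration of such an "N-and-a-half times" loop, reading lines up
to a blank-line sentinel.  The input text and the name collected are
made up for the example:

```python
import io

stream = io.StringIO("alpha\nbeta\n\ngamma\n")   # made-up input
collected = []

while True:
    line = stream.readline()          # first half: runs N+1 times
    if not line.strip(): break        # the exit test sits in the middle
    collected.append(line.strip())    # second half: runs N times

print(collected)
```

Moving the test up into the while header would mean either reading once
before the loop and again at its bottom, or inventing a dummy initial
value for line.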

This underlines, I believe, one of Tim Peters' Python
principles, about syntax mattering in often unexpected
ways.  If the generic Python loop was spelled a LITTLE
differently, say:
    loop [<condition1>]:
        <suite1>
    while <condition2>:
        <suite2>
(with either of the loop or while clauses omitted if
the attached condition and/or suite allows it), nobody
would presumably suggest that it not be used.  Since
it is spelled as in fact it is...:
    while <condition1>:
        <suite1>
        if not <condition2>: break
        <suite2>
we keep having these debates.  Don't let the very minor
syntax sugar issue of how 'loop' and 'while' are spelled,
or the indentation of the latter clause, affect your
judgment to such an extent...!-)


Alex





