[Python-Dev] Is core dump always a bug? Advice requested

Wed May 12 23:13:15 EDT 2004

At 02:15 PM 5/13/04 +1200, Greg Ewing wrote:
>Jeremy Hylton <jeremy at alum.mit.edu>:
>
> > I think it would only need to check that any path through the loop has a no
> > net effect on stack depth -- before branching back to the beginning of the
> > loop, it must pop everything it pushed.
>
>If the branching is sufficiently spaghettoid, it might have
>trouble identifying any coherent loops or paths.

As I think Tim has already mentioned, all you do is track the stack depth 
for each instruction, and ensure it's consistent.  If there is an 
inconsistency in the predicted stack level for an instruction, as compared 
with any of the instructions that jump to it (no matter where they jump 
from), you have a problem.

Or, to look at it another way:

maxdepth = 0
startpoints = [(0,0)]
stackdepth = [None] * len(bytecode)

while startpoints:
     location,depth = startpoints.pop()

     while True:
        if location >= len(bytecode):
            raise VerificationError("Unexpected end of bytecode at",location)

        if stackdepth[location] is None:
            stackdepth[location] = depth
        else:
            if stackdepth[location]<>depth:
                raise VerificationError("Inconsistent stack depth at",location)
            else:
                break    # we've already processed this part of the code

        instruction = fetch(bytecode,location)
        instruction.verify_for(code_object,location)  # Check operand validity

        depth += instruction.effectOnStack()
        maxdepth = max(depth,maxdepth)

        if depth<0:
            raise VerificationError("Stack underflow at",location)

        if instruction.isterminal:  (e.g. return, raise)
            break        # Block is over, handle the next one if any

        elif instruction.isconditional:
            startpoints.append((instruction.alt_location(location),depth))

        location = instruction.next_location(location)

This algorithm should readily handle all forward and backward jumps, while 
ensuring that no stack growth or shrinkage can take place, because every 
instruction must have a single known stack depth, no matter how many 
forward or backward jumps it takes to get there.

Notice that the algorithm will "follow" every branch, but will not retrace 
the code at the target location once the stack depth is seen to be 
consistent.  It also tackles underflow and "falling off the end" of the 
bytecode, while simultaneously computing the maximum stack depth 
used.  Each instruction is fetched and processed for its depth effects and 
other characteristics exactly once, so the timing is O(n).  And, it has a 
nice place to put opcode-specific operand validation for opcodes that use 
locals, freevars, names, or constants.

Anyway, I think we can safely halt discussion of the halting problem 
now.  :)  And, since I've now reduced this from an interesting theoretical 
idea to talk about, into something resembling work (God forbid), anybody 
who wants to actually *implement* this crazy idea is now free to do so, 
without even having to read the paper Tim linked to.  :)

(Disclaimer: I didn't read the paper either, so there may be other 
interesting/applicable verifications there that I didn't sketch 
above.  This is just a spur-of-the-moment off-the-top-of-my-head draft thingy.)