Hi,<div><br></div><div>I need a bit of help sorting this out...</div><div><br></div><div>I have a memory test program written in C and compiled. The test itself can only ever return an exit code of 0 or 1; this is explicitly coded and there are no other options.</div>
<div><br></div><div>I also have a wrapper test script that calls the C program and should likewise only return 0 or 1 on completion.</div><div><br></div><div>The problem I'm encountering, however, involves the return code when subprocess.poll() is called against the running memory test process. The current code in my wrapper program looks like this:</div>
<div><br></div><div><div>def run_processes(self, number, command):</div><div> passed = True</div><div> pipe = []</div><div> for i in range(number):</div><div> pipe.append(self._command(command))</div>
<div> print "Started: process %u pid %u: %s" % (i, pipe[i].pid, command)</div><div> sys.stdout.flush()</div><div> waiting = True</div><div> while waiting:</div><div> waiting = False</div>
<div> for i in range(number):</div><div> if pipe[i]:</div><div> line = pipe[i].communicate()[0]</div><div> if line and len(line) > 1:</div><div> print "process %u pid %u: %s" % (i, pipe[i].pid, line)</div>
<div> sys.stdout.flush()</div><div> if pipe[i].poll() == -1:</div><div> waiting = True</div><div> else:</div><div> return_value = pipe[i].poll()</div>
<div> if return_value != 0:</div><div> print "Error: process %u pid %u retuned %u" % (i, pipe[i].pid, return_value)</div><div> passed = False</div>
<div> print "process %u pid %u returned success" % (i, pipe[i].pid)</div><div> pipe[i] = None</div><div> sys.stdout.flush()</div><div> return passed</div>
</div><div><br></div><div>So what happens here is that, in the waiting loop, if pipe[i].poll() returns -1 we keep waiting; if it returns anything OTHER than -1, we treat the process as finished and check its return code.</div><div><br>
</div>
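<div>For comparison, here is a minimal sketch of the same waiting pattern, assuming the child processes are plain subprocess.Popen objects (what self._command actually returns is not shown above, so that part is a guess). In the subprocess module, poll() returns None, not -1, while the child is still running, and the exit code once it has finished. The names wait_for_all and demo_commands are made up for illustration, and the sketch uses Python 3 syntax rather than the Python 2 print statements in my wrapper.</div>

```python
import subprocess
import sys
import time


def wait_for_all(commands, interval=0.05):
    """Start every command, then poll until each one has exited.

    Returns the exit codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    codes = [None] * len(procs)
    while any(code is None for code in codes):
        for i, proc in enumerate(procs):
            if codes[i] is None:
                # poll() gives None while the child is still running and
                # the exit code once the child has terminated.
                codes[i] = proc.poll()
        time.sleep(interval)
    return codes


# Stand-in children (hypothetical): Python one-liners exiting with 0 and 1,
# used here in place of the real memory test binary.
demo_commands = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import time, sys; time.sleep(0.2); sys.exit(1)"],
]

exit_codes = wait_for_all(demo_commands)
print(exit_codes)
```

<div>The loop compares against None rather than -1, so a process only counts as finished once poll() hands back an integer exit code.</div>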
<div>BUT in some cases I'm getting a return code of 127, which the memory test program itself cannot produce.</div><div><br></div><div>The output from this bit of code looks like this when it fails:</div>
<div><div>Error: process 0 pid 2187 retuned 127</div><div>process 0 pid 2187 returned success</div><div>Error: process 1 pid 2188 retuned 127</div><div>process 1 pid 2188 returned success</div></div><div><br></div><div>I'm thinking that I'm hitting some sort of race here where the kernel is reporting -1 while the process is running, then returns 127 or some other status when the process is being killed and then finally 0 or 1 after the process has completely closed out. I "think" that the poll picks up this intermediate exit status and immediately exits the loop, instead of waiting for a 0 or 1.</div>
<div><br></div><div>I have a modified version, which someone is testing for me now, that changes this</div><div><br></div><div><div> if pipe[i].poll() == -1:</div><div> waiting = True</div></div><div><br></div>
<div>to this</div><div><br></div><div>if pipe[i].poll() not in [0,1]:</div><div> waiting = True</div><div><br></div><div>So my real question is: am I on the right track here, and am I correct in my guess that the kernel is reporting different status codes to subprocess.poll() during the shutdown of the polled process?</div>
<div><br></div>
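<div>One other possibility that may be worth ruling out, sketched under the assumption (not confirmed by the code above) that the command is launched through a shell: a POSIX shell conventionally exits with status 127 when it cannot find the command it was asked to run. In that case the 127 would come from the shell and not from the memory test program. The command name and the helper shell_exit_code below are hypothetical, and the sketch uses Python 3 syntax on a POSIX system.</div>

```python
import subprocess

# Hypothetical command name, chosen so that it should not exist on any PATH.
MISSING = "no_such_memtest_binary_xyz"


def shell_exit_code(command_line):
    """Run a command line through the shell and return its exit status."""
    proc = subprocess.Popen(
        command_line,
        shell=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    proc.communicate()  # waits for the shell to exit
    return proc.returncode


# Status reported when the shell cannot find the command at all.
missing_rc = shell_exit_code(MISSING)
# Status reported when the command runs and exits with 1 on its own.
ok_rc = shell_exit_code("exit 1")
print(missing_rc, ok_rc)
```

<div>If the wrapper sees 127 only on some machines or only under some conditions, comparing the command string and PATH in those cases against a working run might show whether the binary is being found at all.</div>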