[Twisted-Python] [andrea@cpushare.com: Re: error after launching cpushare client]

Hello,

a CPUShare user reported a failure in the process protocol processEnded method. The status parameter passed to the processEnded callback is like this:

    status.value.signal == 11
    status.value.status == 139

My server code was validating the sigsegv status returned by the client, and it noticed it wasn't 11. But status.value.signal == 11. See the debugging patch and the output it produced here:

    http://www.cpushare.com/hypermail/cpushare-discuss/05/08/0018.html

How can it be that the same status has .value.signal == 11 and .value.status == 139 at the same time? I suspect this is a twisted bug.

Thanks for any help!

----- Forwarded message from Andrea Arcangeli <andrea@cpushare.com> -----

Date: Wed, 3 Aug 2005 16:35:37 +0200
From: Andrea Arcangeli <andrea@cpushare.com>
To: cpushare-discuss@cpushare.com
Subject: Re: error after launching cpushare client

On Wed, Aug 03, 2005 at 03:26:39PM +0200, Loïc Le Loarer wrote:
Yes it helps, since it verified that for some reason twisted reports an exit status of 139, which collides with an exit status of 11 (it wasn't a communication error between client and server). I think it's a twisted bug and not a mistake on my part. I'll ask on the twisted lists to be sure. What distro/twisted are you using?

In the meantime this will work around it so you can start earning CPUCoins ;)

Index: cpushare/proto.py
===================================================================
RCS file: /home/andrea/crypto/cvs/cpushare/client/cpushare/cpushare/proto.py,v
retrieving revision 1.49
diff -u -p -r1.49 proto.py
--- cpushare/proto.py 6 Jul 2005 23:19:02 -0000 1.49
+++ cpushare/proto.py 3 Aug 2005 14:34:34 -0000
@@ -82,7 +82,7 @@ class state_machine(object):
             self.protocol.sendString(PROTO_SECCOMP_SUCCESS)
         def end_failure(failure):
             end_common()
-            self.protocol.sendString(PROTO_SECCOMP_FAILURE + struct.pack('!i', failure.value.status))
+            self.protocol.sendString(PROTO_SECCOMP_FAILURE + struct.pack('!i', failure.value.signal))
         def started(result):
             self.protocol.sendString(PROTO_SECCOMP_RUN)

Thanks!

----- End forwarded message -----

On Wed, Aug 03, 2005 at 10:16:55PM +0300, Tommi Virtanen wrote:
> .status is the _raw_ status.
> You probably mean .exitCode instead.

Ok but then do you have an idea where the 139 comes from? I'd like to understand what's going on, to me that 139 number comes out of the blue.

> Actual .status values are unportable, thus transferring them raw over the network is not a good idea.

status should be the same as what waitpid returns. From the docs:

    "return a tuple containing its pid and exit status indication: a 16-bit number, whose low byte is the signal number that killed the process, and whose high byte is the exit status (if the signal number is zero); the high bit of the low byte is set if a core file was produced. Availability: Unix."

Now I will ask to use exitCode but still I'd like to understand how status is connected with exitCode.

BTW, I was using exitCode already, but I thought it would be set only if a signal wasn't delivered. In fact I wrote code like this:

    if status.value.exitCode or status.value.signal:
        if status.value.exitCode == 4:
            print 'Failure in setting the stack size to %d bytes.' % self.seccomp.stack
        if status.value.signal == signal.SIGKILL:
            print 'Seccomp task gracefully killed by seccomp.'
        elif status.value.signal == signal.SIGSEGV:
            print 'Seccomp task gracefully killed by sigsegv, status %r.' % status.value.status
        elif status.value.signal == signal.SIGQUIT:
            print 'Seccomp task killed by sigquit - should never happen.'
        self.d_end.errback(status)
    else:
        print 'Seccomp task completed successfully.'
        self.d_end.callback(None)

(and in the above code status.value.signal == signal.SIGSEGV but status.value.status == 139 ;)
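
For readers following along, the layout quoted from the docs can be applied to 139 by hand. The snippet below is only a sketch in Python 3 syntax: the name split_wait_status is made up for illustration, and the bit positions are the ones the quoted docs describe, which is not a portable guarantee.

```python
def split_wait_status(status):
    """Split a 16-bit wait status using the layout quoted from the docs.

    Assumption: low 7 bits = signal, bit 7 = core dump flag,
    high byte = exit status. Real code should not hard-code this.
    """
    signal_number = status & 0x7F        # low byte without its high bit
    core_dumped = bool(status & 0x80)    # high bit of the low byte
    exit_status = (status >> 8) & 0xFF   # high byte
    return signal_number, core_dumped, exit_status


print(split_wait_status(139))  # (11, True, 0)
```

Read this way, 139 is signal 11 with the core dump bit (128) set, so .signal == 11 and .status == 139 describe the same event.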

Andrea Arcangeli wrote:
> Ok but then do you have an idea where the 139 comes from? I'd like to understand what's going on, to me that 139 number comes out of the blue.

139 == 128 + 11. One way to set up the numbering is that exit codes are 0..127, and signals etc. have the high bit set. Naturally, all real access should go through the macros WIFEXITED etc., but that's how the number ranges are classically set up.
I think you are reading about os.wait and twisted is using os.waitpid. Otherwise, the document is lying to you. The status _may_ be laid out like that on _some_ platform, but unless python does some readjustment, the only portable way to access it is WIFEXITED and friends.

cat >crash.c <<EOF
int main(void) {
	/* comment out the next line if you want a normal exit */
	*(char*)0 = 42;
	return 34;
}
EOF

cat >run.py <<EOF
#!/usr/bin/python
import os
pid = os.fork()
if pid:
    # parent
    pid, status = os.waitpid(pid, 0)
    print pid, status
    if os.WIFEXITED(status):
        print 'exited', os.WEXITSTATUS(status)
    elif os.WIFSIGNALED(status):
        print 'signaled', os.WTERMSIG(status)
        print 'coredump', os.WCOREDUMP(status)
    elif os.WIFSTOPPED(status):
        print 'stopped', os.WSTOPSIG(status)
    elif os.WIFCONTINUED(status):
        print 'continued'
    else:
        print 'unknown'
else:
    # child
    os.execv('./a.out', ['a.out'])
    raise RuntimeError, "exec failed"
EOF

chmod a+x run.py
gcc -Wall crash.c
./run.py
> Now I will ask to use exitCode but still I'd like to understand how status is connected with exitCode.

exitCode and signal are decoded from status.

> BTW, I was using exitCode already, but I thought it would be set only if a signal wasn't delivered. In fact I wrote code like this:

Exactly. If a process exits due to a signal, there is no exit code in the sense of calling _exit(2).
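
The decoding described in this message can be sketched with the os.W* helpers. This is an illustration in Python 3 syntax, not a copy of Twisted's implementation; decode_status is a hypothetical name, and the sample values in the last two lines assume the Linux status layout.

```python
import os


def decode_status(status):
    """Return (exit_code, signal_number) for a status from os.waitpid().

    For a terminated child exactly one of the two is not None, which is
    why an exit code and a signal never show up together.
    """
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status), None
    if os.WIFSIGNALED(status):
        return None, os.WTERMSIG(status)
    return None, None  # stopped or continued, not terminated


print(decode_status(139))      # on Linux: (None, 11)
print(decode_status(34 << 8))  # on Linux: (34, None)
```

Going through the helpers keeps the raw number opaque, so the core dump bit that turns 11 into 139 never reaches the caller.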

On Thu, Aug 04, 2005 at 08:47:39AM +0300, Tommi Virtanen wrote:
Ah, I think I got why he gets 139: that's because core dumping was enabled.

Yes, the status is the same in wait and waitpid, and it's the same in C as well (I doubt that python does any mangling of the C value).

It's not like I need huge portability, because seccomp is currently only available on linux, but I'll follow your suggestion and try to make it more portable.

Ok, thanks a lot for the example.

> Exactly. If a process exits due to a signal, there is no exit code in the sense of calling _exit(2).

Ok, same as with C.
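
One way to follow the portability suggestion is to decode the status on the client and send only the decoded result. The sketch below is in Python 3 syntax and is not CPUShare's actual protocol: the KIND_* tags, the 5-byte '!Bi' payload and both function names are assumptions made for illustration. Signal numbers can themselves differ between platforms, so this only removes the dependence on the raw status layout.

```python
import os
import struct

# Hypothetical tags for illustration; not part of any real protocol.
KIND_EXITED = 0
KIND_SIGNALED = 1


def pack_termination(status):
    """Encode a waitpid() status as (kind, value) in network byte order."""
    if os.WIFSIGNALED(status):
        return struct.pack('!Bi', KIND_SIGNALED, os.WTERMSIG(status))
    if os.WIFEXITED(status):
        return struct.pack('!Bi', KIND_EXITED, os.WEXITSTATUS(status))
    raise ValueError('process has not terminated: %r' % (status,))


def unpack_termination(payload):
    """Decode the 5-byte payload back into (kind, value)."""
    kind, value = struct.unpack('!Bi', payload)
    return kind, value
```

With this shape the server compares the value against signal.SIGSEGV only when the kind is KIND_SIGNALED, and whether a core file was written no longer affects the comparison.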

participants (2)
- Andrea Arcangeli
- Tommi Virtanen