User + sys time bigger than real time, in case of no real parallelism
I am really sorry for the OT :-( I asked elsewhere but without any answer :-( I can not figure out why in this short example the user+sys time is bigger than real time. The example executes the task() functions twice, with each execution in a separate thread. The task() just increment 10**6 times a global int: $ cat foo.py from threading import Thread, Lock result = 0 lock = Lock() def task(): global result for i in range(10**6): lock.acquire() result += 1 lock.release() if __name__ == '__main__': t1, t2 = Thread(target=task), Thread(target=task) t1.start() t2.start() t1.join() t2.join() print('result:', result) When I execute it (Python 3.6), I get a sys+user time bigger than the real time: $ time python foo.py result: 2000000 real 0m7.088s user 0m6.597s sys 0m5.043s That is usually what I can expect in case of tasks executed in parallel on different CPUs. But my example should not be the case, due to the GIL. What am I missing? Thank you very much, and sorry again for the OT :( -- Marco Buttu INAF-Osservatorio Astronomico di Cagliari Via della Scienza n. 5, 09047 Selargius (CA) Phone: 070 711 80 217 Email: mbuttu@oa-cagliari.inaf.it
That is usually what I can expect in case of tasks executed in parallel on different CPUs. But my example should not be the case, due to the GIL. What am I missing? Thank you very much, and sorry again for the OT :(
With such finely intermingled thread activity, there might be a fair bit of spinlock spinning involved. Additionally, I suspect that the kernel does not track CPU time at microsecond precision and will tend to round the used times up. Obviously, this is not a reasonable way to use threads. The example is only effective at producing lots of overhead.
On Sonntag, 5. Februar 2017 07:59:04 Marco Buttu wrote:
I am really sorry for the OT :-( I asked elsewhere but without any answer :-( I can not figure out why in this short example the user+sys time is bigger than real time. The example executes the task() functions twice, with each execution in a separate thread. The task() just increment 10**6 times a global int:
$ cat foo.py from threading import Thread, Lock
result = 0 lock = Lock()
def task(): global result for i in range(10**6): lock.acquire() result += 1 lock.release()
if __name__ == '__main__': t1, t2 = Thread(target=task), Thread(target=task) t1.start() t2.start() t1.join() t2.join() print('result:', result)
When I execute it (Python 3.6), I get a sys+user time bigger than the real time:
$ time python foo.py result: 2000000
real 0m7.088s user 0m6.597s sys 0m5.043s
That is usually what I can expect in case of tasks executed in parallel on different CPUs. But my example should not be the case, due to the GIL. What am I missing? Thank you very much, and sorry again for the OT :(
/usr/bin/time(!) gives some insight: $ /usr/bin/time python3 foo.py result: 2000000 1.18user 1.30system 0:01.52elapsed 163%CPU (0avgtext+0avgdata 8228maxresident)k 8inputs+0outputs (0major+1151minor)pagefaults 0swaps Obviously, the spinning happens outside the GIL... Cheers, Pete
participants (3)
-
Hans-Peter Jansen
-
Marco Buttu
-
Stefan Ring