I'm currently hunting a problem that I have been debugging for quite a long time. I think it is the root cause of why the PyPy translation with PyPy is still slower than with CPython. Here are some of my findings (+ questions):
The last tests that fail all have one thing in common: they have an issue with the GIL/threading. (See ) The most interesting ones are the last five.
* test_gc_locking (2x) fails on the build bot (only when using CPython), but not on my machine. This is strange because the buildbot and my VM use the same distro, same compiler version, etc. The only differences are that the buildbot has better hardware and the tests are run through testrunner. Is there another way I can reproduce this?
* test_ping_pong (an -A test) ping-pongs from one thread to another, stressing locking and the GIL switch. On s390x with the translated VM this takes really long (10 seconds, and on the buildbot it seems to exceed 30 seconds when run in parallel). However, if I run the same test with PYPYLOG=jit:- it completes in ~0.96 seconds (under gdb it is the same). If you subtract the time needed for printing, you might end up with the same speed x86 has for this test. What does the printing/gdb trigger to let the GIL switch happen that smoothly?
* I have placed memory fences at the same positions as on PPC (2x isync and lwsync). Are there any other places that need to complete all pending memory operations?
* There is one path in call_release_gil (just after the call) where rpy_fastgil was acquired (because it was 0) and the shadowstack is not the one of the current thread. Then *rpy_fastgil = 0 is set for the slowpath function. Wouldn't it be possible to steal the GIL at this point? Would that lead to a problem?
On Tue, Feb 2, 2016 at 6:11 PM, Richard Plangger email@example.com wrote:
- There is one path in call_release_gil (just after the call) where rpy_fastgil was acquired (because it was 0) and the shadowstack is not the one of the current thread. Then *rpy_fastgil = 0 is set for the slowpath function. Wouldn't it be possible to steal the GIL at this point? Would that lead to a problem?
No, it's not a problem: setting *rpy_fastgil to zero releases the GIL again, and it will be re-acquired by the called function (at reacqgil_addr). There is no issue if the GIL is re-acquired by a different thread exactly here; we will simply block in reacqgil_addr.
Note that we should in theory do a "lwsync" just before setting *rpy_fastgil to zero, like we do in call_releasegil_addr_and_move_real_arguments(). It's not done, but I *think* it doesn't hurt in this particular case. I may actually be very wrong, but I base my reasoning on these facts:
1) "*rpy_fastgil=0" can always appear to occur later, from the point of view of other processors, which is not a problem
2) in this case the "*rpy_fastgil=0" cannot appear to occur too early: it must appear after the "stdcx." instruction changed it to 1, and there is no other store between that "stdcx." and the following "*rpy_fastgil=0". So "lwsync" is not useful here.