[pypy-dev] PyParallel-style threads

20 Jun 2016

Hi all,
There was an experiment based on CPython's code, called PyParallel
<https://github.com/pyparallel/pyparallel>, that allows running threads in
parallel without STM and without modifying the source code of either Python
code or C extensions. The only limitation is that it disallows mutation of
global state in a parallel context.
I briefly mentioned it before on PyPy's freenode channel.
I'd like to discuss why the approach is useful, how it can benefit PyPy
users and how it can be implemented.
Running in parallel without mutating global state would let servers handle
each request in its own thread. It would also make it possible to log in
parallel, or to send an HTTP request (or an AMQP message), without sharing
the response with the main thread. This is useful in some cases, and since
PyParallel managed to keep the same semantics, it shouldn't, in principle,
break CPyExt.
If we keep to the following rules:

   1. No global state mutation is allowed
   2. No new keywords or code modifications required
   3. No CPyExt code is allowed (for now)

I believe that users can benefit somewhat from this implementation if it is
done correctly.
As for implementation: if we can trace the code running in the thread and
ensure that it never mutates global state and never uses CPyExt, we can
simply release the GIL while such a thread runs. That requires less
knowledge than using STM and fewer code modifications.
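
As a very rough illustration of the "ensure it's not mutating global state"
part, here is a sketch that inspects a function's bytecode with the standard
dis module (Python 3.4+ for dis.get_instructions). It is a crude static
stand-in, not the tracing I describe above: it only sees STORE_GLOBAL and
DELETE_GLOBAL in the function's own code, so it misses mutation through
method calls (such as appending to a global list), nested functions and
callees. The names writes_globals, log_request and count_request are
hypothetical.

```python
import dis

# Opcodes that rebind or delete a module-level name.
_GLOBAL_WRITE_OPS = {"STORE_GLOBAL", "DELETE_GLOBAL"}


def writes_globals(func):
    """Return True if func's own bytecode rebinds or deletes a global."""
    return any(ins.opname in _GLOBAL_WRITE_OPS
               for ins in dis.get_instructions(func))


counter = 0


def log_request(path):
    # Touches only local names.
    line = "GET " + path
    return line


def count_request(path):
    # Rebinds the module-level name 'counter'.
    global counter
    counter += 1
    return counter
```

With this, writes_globals(log_request) is False and
writes_globals(count_request) is True. A real implementation would have to
make the same decision on what actually executes, which is where I expect
the hard part to be.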
However, I think that attempting to do so will introduce the same issue with
caching traces (Armin, am I correct here?).

As for CPyExt, we could copy the same code modifications that PyParallel
made, but I suspect that it would be so slow that the benefit of running in
parallel would be completely lost in all cases except very long-running
threads.

Is what I'm suggesting even possible? How challenging will it be?

Thanks,
Omer Katz.