[pypy-dev] Thread cloning (coroutine cloning, really)

Aurélien Campéas aurelien.campeas at logilab.fr
Fri May 12 10:24:42 CEST 2006

On Thu, May 11, 2006 at 04:24:13PM +0200, Armin Rigo wrote:
> Hi all,
> On Thu, May 11, 2006 at 08:05:06AM +0200, holger krekel wrote:
> > - thread cloning approaches 
> > 
> >   As discussed earlier it is urgent to have a working 
> >   thread cloning approach.  Armin and Christian will 
> >   try to present the approach(es) and time frames for
> >   this feature in some written form. 
> A word of introduction: we are dividing the work into two directions; one
> is thread pickling, which is led by Christian.  The other is thread
> cloning, led by me.  Thread cloning is in principle a subset of what
> can be done with pickling, as we could pickle a thread, unpickle it, et
> voila: we have cloned the thread.  But pickling is harder and comes with
> a different set of problems than "just" cloning.  It is also likely that
> pickling threads in random interpreter states will be very difficult, a
> restriction that does not apply to thread cloning.  So we divided the
> work; the most important reason is that cloning is what the constraint
> solver developments require urgently.
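
The "pickle it, unpickle it, et voila" route can be illustrated on
plain Python data. This is only a minimal sketch: `clone_via_pickle`
and the `state` dictionary are made-up names, and an ordinary dict
stands in for a coroutine, which the standard pickle module cannot
serialize.

```python
import pickle

def clone_via_pickle(obj):
    """Return a deep copy of obj by serializing and deserializing it."""
    return pickle.loads(pickle.dumps(obj))

# Hypothetical stand-in for a coroutine's private state.
state = {"locals": [1, 2, 3], "position": 7}
clone = clone_via_pickle(state)
clone["locals"].append(4)   # mutating the clone leaves the original alone
```

For plain data like this the round trip copies everything reachable
from the object, so it gives no say over what should stay shared.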

Just a note here: the constraint solver itself doesn't really need
thread cloning. It's the framework that makes the modular integration
of constraint solving and logic programming possible that needs it.

Even more off-topic: I've been playing with Mercurial, one of the
million new distributed SCMs that appear these days (it's Python all
over the place, of course), and I am pleasantly surprised to discover
how closely the basic concepts of computation spaces seem to match
those of a DSCM. Some of the primitives are just the same: clone,
merge (with an SCM you have the added ability to arbitrate merge
conflicts; with comp. spaces a conflict just means only one can win),
commit (seemingly the same semantics, but different usage patterns).
This convergence is reassuring, at least regarding the generality of
the aforementioned primitives </end-of-bs>

> So here is my current point of view on cloning.  The goal is to give to
> RPython code some way to duplicate a "chain of frames", i.e. an RPython
> coroutine.  There are three levels of issues:
> 1. the interface that RPython code needs to use to do that
> 2. the selection of what GcStructures must be duplicated or shared
> 3. the automatic generation of the necessary walkers from the
>      stackless transformer
> In the following days we should focus on 3.  This will need quite some
> coding efforts by itself, maybe ~ 1 week of dedicated work from where we
> stand now.
> Then we need to experiment with various ways to select what needs to be
> duplicated or not.  The problem is that there are some RPython-level and
> app-level objects that need "obviously" to be shared, like app-level
> modules, and like interp-level singleton state objects; and others that
> need obviously to be duplicated so that the newly cloned coroutine has
> its own copy, like local lists.

App-level modules are the responsibility of their implementors: if
they contain shared global mutable state, then sharing them makes them
thread-unsafe and unsuitable for use in comp. spaces. But that's

well, I cannot help but ask: would it be possible to clone these
anyway?

Now I wonder if built-in modules will need a make-them-thread-safe
pass (this question is motivated by the interp-level singleton state
objects you mention).

> It's unclear which option we will choose, but Christian has proposed
> some good ideas.  Only experimentation will tell.  A possibly good
> solution that he proposed (but hard to implement efficiently) is to
> duplicate exactly those objects that have been allocated by the
> coroutine that we are cloning.  We could try to do that *inefficiently*,
> e.g. by adding an "allocated-by" field to every GcStruct, and see how
> the result works in practice.  Trying that shouldn't be extremely hard;
> at most 1 more week of dedicated efforts.  An advantage of this approach
> over alternatives is that it's conceptually simpler, and also that the
> required RPython-level interface is very easy: something along the lines
> of "newstate = state.fork()".
> In light of this, it seems that we are at some 2 weeks of hard work from
> a prototype, in the line of thread cloning.
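
To make the "allocated-by" idea concrete, here is a toy sketch in
plain Python. It is not RPython and not the actual design: `Obj`,
`State`, `new` and `roots` are names invented for illustration, and
only `fork()` and the allocated-by field come from the description
above. Objects allocated by the state being forked are duplicated;
everything else is shared.

```python
class Obj:
    """Toy stand-in for a GcStruct, tagged with the state that allocated it."""
    def __init__(self, owner, **fields):
        self.allocated_by = owner      # the hypothetical extra field
        self.fields = fields


class State:
    """Toy stand-in for a coroutine state: a name plus root references."""
    def __init__(self, name):
        self.name = name
        self.roots = {}

    def new(self, **fields):
        """Allocate an object on behalf of this state."""
        return Obj(self, **fields)

    def fork(self):
        """Clone this state, duplicating only the objects it allocated."""
        clone = State(self.name + "-clone")
        memo = {}                      # id(original) -> duplicate, handles cycles

        def copy(value):
            if isinstance(value, list):
                return [copy(item) for item in value]
            if not isinstance(value, Obj):
                return value           # plain immutable value
            if value.allocated_by is not self:
                return value           # allocated elsewhere: keep it shared
            if id(value) in memo:
                return memo[id(value)]
            dup = Obj(clone)
            memo[id(value)] = dup      # register before recursing into fields
            dup.fields = {k: copy(v) for k, v in value.fields.items()}
            return dup

        clone.roots = {k: copy(v) for k, v in self.roots.items()}
        return clone


main = State("main")
other = State("other")
module = other.new(counter=0)               # allocated elsewhere
local = main.new(items=[1, 2], mod=module)  # allocated by main
main.roots["local"] = local
newstate = main.fork()
```

After the fork, `newstate.roots["local"]` is a fresh object with its
own `items` list, while its `mod` field still points at the same
`module` object as the original.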

Happy to hear this :)
Thanks for all of this.
