[pypy-dev] Thread cloning (coroutine cloning, really)

Armin Rigo arigo at tunes.org
Thu May 11 16:24:13 CEST 2006


Hi all,

On Thu, May 11, 2006 at 08:05:06AM +0200, holger krekel wrote:
> - thread cloning approaches 
> 
>   As discussed earlier it is urgent to have a working 
>   thread cloning approach.  Armin and Christian will 
>   try to present the approache(s) and time frames for
>   this feature in some written form. 

A word of introduction: we are dividing the work in two directions; one
is thread pickling, which is led by Christian.  The other is thread
cloning, led by me.  Thread cloning is in principle a subset of what
can be done with pickling, as we could pickle a thread, unpickle it, et
voila: we have cloned the thread.  But pickling is harder and comes with
a different set of problems than "just" cloning.  It is also likely that
pickling threads in random interpreter states will be very difficult, a
restriction that does not apply to thread cloning.  So we divided the
work; the most important reason is that cloning is what the constraint
solver developments require urgently.
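
To illustrate the "pickle, unpickle, et voila" remark only: a minimal
Python sketch, assuming a plain dict can stand in for a coroutine's
state.  The names (state, clone_via_pickle) are hypothetical and this is
not how PyPy does it; real frame chains are much harder to pickle, which
is exactly the point made above.

```python
import pickle

def clone_via_pickle(state):
    # Serialize the state and immediately rebuild it: the result is an
    # independent copy of all the data reachable from `state`.
    return pickle.loads(pickle.dumps(state))

# Hypothetical stand-in for the state of a suspended coroutine.
state = {"locals": [1, 2, 3], "resume_point": 7}

clone = clone_via_pickle(state)
clone["locals"].append(4)   # mutating the clone leaves the original alone
```

Note that the round trip has no notion of "shared" versus "duplicated"
data: whatever gets serialized comes back as a copy.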

So here is my current point of view on cloning.  The goal is to give to
RPython code some way to duplicate a "chain of frames", i.e. an RPython
coroutine.  There are three levels of issues:

1. the interface that RPython code needs to use to do that
2. the selection of what GcStructures must be duplicated or shared
3. the automatic generation of the necessary walkers from the
     stackless transformer

In the following days we should focus on 3.  This will need quite some
coding effort by itself, maybe ~ 1 week of dedicated work from where we
stand now.

Then we need to experiment with various ways to select what needs to be
duplicated or not.  The problem is that there are some RPython-level and
app-level objects that need "obviously" to be shared, like app-level
modules, and like interp-level singleton state objects; and others that
need obviously to be duplicated so that the newly cloned coroutine has
its own copy, like local lists.

It's unclear which option we will choose, but Christian has proposed
some good ideas.  Only experimentation will tell.  A possibly good
solution that he proposed (but hard to implement efficiently) is to
duplicate exactly those objects that have been allocated by the
coroutine that we are cloning.  We could try to do that *inefficiently*,
e.g. by adding an "allocated-by" field to every GcStruct, and see how
the result works in practice.  Trying that shouldn't be extremely hard;
at most 1 more week of dedicated effort.  An advantage of this approach
over alternatives is that it's conceptually simpler, and also that the
required RPython-level interface is very easy: something along the lines
of "newstate = state.fork()".
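
A rough sketch of the "allocated-by" idea, in plain Python rather than
RPython, and under stated assumptions: Coroutine, Box, roots and the
fork() function are all hypothetical names made up for illustration, and
the real thing would be generated walkers over GcStructs.  The rule is
simply: duplicate an object if it was allocated by the coroutine being
cloned, otherwise share it.

```python
class Box:
    """Hypothetical mutable heap object carrying an allocated-by field."""
    def __init__(self, items, allocated_by=None):
        self.items = items                # ints, strings or other Boxes
        self.allocated_by = allocated_by  # owning coroutine, or None

class Coroutine:
    """Hypothetical stand-in for a coroutine state (a chain of frames)."""
    def __init__(self, name):
        self.name = name
        self.roots = {}                   # local variables of the frames

def fork(state):
    """Clone `state`, duplicating only the objects it allocated itself."""
    newstate = Coroutine(state.name + "-clone")
    memo = {}                             # id(original) -> duplicate

    def copy(obj):
        if not isinstance(obj, Box):
            return obj                    # immutable value: share it
        if obj.allocated_by is not state:
            return obj                    # allocated elsewhere: share it
        if id(obj) in memo:
            return memo[id(obj)]          # already copied (handles cycles)
        dup = Box([], allocated_by=newstate)
        memo[id(obj)] = dup
        dup.items = [copy(x) for x in obj.items]
        return dup

    newstate.roots = {k: copy(v) for k, v in state.roots.items()}
    return newstate

# A shared object (think: an app-level module) and a local list.
shared = Box(["module state"])
main = Coroutine("main")
local = Box([1, 2, shared], allocated_by=main)
main.roots["local"] = local

clone = fork(main)
clone.roots["local"].items.append(3)      # only the clone's list changes
```

After the fork, the clone has its own copy of the local list, while both
coroutines still point at the same shared object.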

In light of this, it seems that we are some 2 weeks of hard work away
from a prototype of thread cloning.

I have intentionally left out the other stackless subject, which is
thread pickling; I'll let Christian present it when he's ready to.


A bientot,

Armin.


