[pypy-dev] Thread cloning (coroutine cloning, really)

Aurélien Campéas aurelien.campeas at logilab.fr
Fri May 12 12:42:17 CEST 2006

On Fri, May 12, 2006 at 11:05:37AM +0200, Armin Rigo wrote:
Hi Armin,

> Hi Aurelien,
> On Fri, May 12, 2006 at 10:24:42AM +0200, Aurélien Campéas wrote:
> > Just a note there : the constraint solver doesn't really need thread
> > cloning. It's the framework that makes possible modular integration of
> > constraint solving and logic programming that needs it.
> I was about to ask precisely if the operation I described here makes
> sense from your point of view.  That's basically what you need, isn't
> it?

Roughly I think so. Some funny details may be discovered later, of
course.

Well, clone != fork in my mind. We really want a *copy*. Clone or copy
are good words for this. Absolutely avoid sharing stuff (unless we
know for sure that sharing is safe, but that would be an implementation
detail). (fork is confusingly related to some unixish syscalls whose
precise semantics are no longer clear to me at this point -- it's been
a long time since I've used them directly).
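
To make the copy-versus-sharing distinction concrete, here is a minimal
sketch in plain Python. The `Space` class and its `domains` attribute are
hypothetical stand-ins invented for illustration; they are not PyPy APIs,
and `copy.deepcopy` is only used here as an approximation of what a real
clone operation would have to guarantee.

```python
import copy


class Space:
    """Hypothetical stand-in for a computation space (not a PyPy API)."""

    def __init__(self, domains):
        # Maps a variable name to its set of candidate values.
        self.domains = domains


parent = Space({"x": {1, 2, 3}, "y": {1, 2}})

alias = parent                 # sharing: two names, one object
clone = copy.deepcopy(parent)  # copying: nothing is shared with parent

# Narrow a domain in the clone only; the parent must not see it.
clone.domains["x"].discard(3)
```

After this runs, `parent.domains["x"]` still holds all three values, while
any change made through `alias` would have been visible in `parent`.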

Especially, cloning a space is different from 'newspace', which is
much more akin to the fork operator. I don't have the time right now
to expand on this & will try to make things more clear next week. I'm
still unsure about the semantics of newspace...

> > Applevel modules are the responsibility of their implementors : if
> > they contain shared global mutable state, then sharing them makes them
> > thread-unsafe and unsuitable for use in comp. spaces. But that's
> > life. 
> > 
> > well, I cannot but ask : would it be possible to be able to clone
> > these anyway ?
> It's part of the experimentation that we'll need.  They would be cloned
> if we go for the approach that all objects *created* by a thread are
> cloned together with the thread.  This includes app-level objects.

I'm not sure I understand what another approach would look like...

We could restrict the programming style for code running inside
comp. spaces, but if it is possible to effectively clone everything
(for some ill-defined notion of everything) then it'll be fine. I'm not
100% sure. Here are some of the constraints :

In principle, side-effects on the parent space and the outer world
should be forbidden from within a running comp. space ('cause the jury
is still out on its outcome).

The problem is not that we want to clone app-level modules (most
Python code tends to be decomposed into many modules, I don't believe
we want to forbid that) but that we will, by doing so, allow the
computation running inside a space to do unwanted stuff, like sneakily
doing I/O for instance, or indirectly calling some fancy C code that
does God-knows-what. Something like a "restricted execution"
environment would be interesting to have.

In Mozart/Oz, for what it's worth, they 'just' filter any attempt to
mutate the upper/outside world. I suspect it comes with a price, at
least in terms of implementation complexity (I've not actually seen
the code though). They do this also because comp. spaces in Oz are
more, hmmm, expressive than what we'll have in PyPy : the 'newspace'
is not scheduled to go into PyPy (well, at this point). Its usefulness
is unclear to us.
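
As a rough illustration of what "filtering mutations of the outside
world" could mean, the sketch below hands a computation a read-only view
of the outer state plus a private local store. This is not how Mozart/Oz
implements it, and `run_in_space`, `well_behaved` and `sneaky` are
hypothetical names; only `types.MappingProxyType` is a real standard
library facility.

```python
from types import MappingProxyType

outer_world = {"counter": 0}  # state owned by the parent space


def run_in_space(computation, outer):
    """Run computation with a read-only view of outer and a private store."""
    local = {}  # hypothetical space-local state
    computation(MappingProxyType(outer), local)
    return local


def well_behaved(outer, local):
    # Reads the outside world, writes only to local state.
    local["next"] = outer["counter"] + 1


def sneaky(outer, local):
    # Attempts to mutate the outside world; the proxy rejects it.
    outer["counter"] = 99


result = run_in_space(well_behaved, outer_world)

try:
    run_in_space(sneaky, outer_world)
    blocked = False
except TypeError:
    blocked = True
```

The protection is shallow: a mutable object stored as a value inside the
outer mapping could still be modified, and nothing here stops I/O or
calls into C code, which is part of why a real filter is likely costly.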

So, yes, we want to clone app-level modules, and generally, that
everything created by a thread be cloned (first). We'll see later what
can be done to prevent insanity from happening within spaces.
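
One way to picture "everything created by a thread is cloned with it" is
the sketch below. The `Thread` class, its `inherited` and `created`
attributes and its `new` method are hypothetical; the only real mechanism
used is the `memo` argument of `copy.deepcopy`, pre-seeded so that
objects which existed before the thread are kept shared rather than
copied.

```python
import copy


class Thread:
    """Hypothetical model of a thread's reachable state (not a PyPy API)."""

    def __init__(self, inherited):
        self.inherited = inherited  # objects that existed before the thread
        self.created = []           # objects allocated by the thread itself

    def new(self, obj):
        """Record obj as created by this thread and return it."""
        self.created.append(obj)
        return obj

    def clone(self):
        # Pre-seeding the memo makes deepcopy treat inherited objects as
        # "already copied", so the clone keeps referring to the originals.
        memo = {id(obj): obj for obj in self.inherited}
        return copy.deepcopy(self, memo)


config = {"limit": 10}              # pre-existing, outside the thread
thread = Thread([config])
store = thread.new({"x": [1, 2, 3]})

twin = thread.clone()
twin.created[0]["x"].append(4)      # only the clone's copy changes
```

Here `twin.inherited[0]` is still the very same `config` object, while
`store` keeps its original three values.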

> This interpretation follows from a nice -- if vague -- high-level
> description from Christian: if you start a thread and it computes
> something up to some point, and clone it at this point, then you get a
> new thread that "looks like" it has been started from scratch and then
> ran up to the same point, repeating the same computations.  Of course,
> this is vague because we need to explain that input-output and other
> side effects that the thread might have had are not duplicated.

The Oz way is to leave IO to the 'top-level' space. What really counts
is the internal state of the threads. Whether we would like a
'logic/relational program' to perform IO is a topic for another day (I
suspect currently we expect such a program to act on local state,
where local means 'belongs 100% to the Python world', but what
does/doesn't exactly ? I don't know for sure).

Oh, one last point : in my view, cloning one thread means cloning a
whole tree of threads.
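
A small sketch of what cloning a whole tree could look like, assuming
each thread is modelled as a record with local state and a list of child
threads. `ThreadNode` and `clone_tree` are hypothetical names for
illustration. A single `memo` dict is passed through the recursion so
that an object shared by two threads of the tree is copied once and
stays shared between their clones.

```python
import copy


class ThreadNode:
    """Hypothetical thread record: local state plus child threads."""

    def __init__(self, name, state, children=()):
        self.name = name
        self.state = state
        self.children = list(children)


def clone_tree(node, memo=None):
    """Return an independent copy of node and all of its descendants."""
    if memo is None:
        memo = {}
    return ThreadNode(
        node.name,
        copy.deepcopy(node.state, memo),
        [clone_tree(child, memo) for child in node.children],
    )


shared = [1]
root = ThreadNode("root", {"s": shared},
                  [ThreadNode("child", {"s": shared})])

root_copy = clone_tree(root)
root_copy.state["s"].append(2)  # visible to the cloned child, not to root
```

The cloned child sees the appended value through its own reference, and
the original `shared` list is left untouched.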

Confusedly & hastily yours
Hope this helps a bit.
