[pypy-dev] Continuations and sandboxing

Mon Jan 10 21:18:05 CET 2011

Hi all,

On Mon, Jan 10, 2011 at 09:22, William ML Leslie
<william.leslie.ttg at gmail.com> wrote:
> On 10 January 2011 15:24, Nathanael D. Jones <nathanael.jones at gmail.com> wrote:
>> Hi folks,

>> 2) Serializable continuations. With gameplay being based on good plot and
>> story flow, continuations are critical to allow straightforward
>> implementation of 'workflows' that depend on user choice at every turn.
>> 3) Tail-call elimination.  By nature, players will accumulate a very large
>> call stack. While this isn't terribly bad at first glance, the following
>> issue combines with it to cause a very big problem:
>> When code changes underneath a continuation, we need to determine how to
>> resume flow. One option is annotating a checkpoint method in each code
>> 'file'.
[I didn't get this part. Reading what's below, I assume that for each
call frame, you remember somehow the file defining the called
function.]
>> However, if a user's call stack includes every file in the system,
>> each change will cause them to restart.
>> Tail-call elimination would help eliminate unneeded stack frames and
>> minimize re-spawning.

That optimization might be invalid. Suppose file A contains a
function f() which tail-calls g(1, 2, 3), and then file A is modified
so that f() makes a different tail call. It is not clear why you
restart: if you restart whenever already-executed code is modified,
then this optimization is invalid, because it drops the very frame
that should have triggered the restart. If you restart only when code
yet to execute is modified, then the optimization is valid, but you
could also perform more advanced optimizations to avoid restarts (for
example, you could compare the generated bytecode from the point where
the call frame was saved and another procedure was invoked, up to the
end of the outermost loop containing that point).
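
To make the question concrete, here is a minimal sketch of the two
cases. It assumes a saved continuation is just a list of frames, each
tagged with the file defining its function, and that the restart policy
is "restart if any saved frame's file was modified". The names `call`
and `needs_restart` and the file names are hypothetical, not anything
from the original proposal.

```python
# Hypothetical model: a saved continuation is a list of frames, each
# represented only by the name of the file that defines its function.

def call(stack, filename, tail=False, eliminate_tail_calls=False):
    """Return the stack after calling a function defined in `filename`."""
    if tail and eliminate_tail_calls and stack:
        # A tail call reuses the caller's slot, so the caller's file
        # no longer appears in the saved stack.
        return stack[:-1] + [filename]
    return stack + [filename]


def needs_restart(stack, modified_file):
    """Assumed policy: restart if any saved frame's file was modified."""
    return modified_file in stack


# f() in A.py tail-calls g() in B.py.
plain = call(call([], "A.py"), "B.py", tail=True)
tco = call(call([], "A.py"), "B.py", tail=True, eliminate_tail_calls=True)

restart_without_tco = needs_restart(plain, "A.py")
restart_with_tco = needs_restart(tco, "A.py")
```

Under this model, editing A.py forces a restart only when the frame of
f() is still on the stack. Whether dropping that frame is acceptable
depends on which of the two restart policies above is intended.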

It is also not clear which semantic guarantee you want to achieve by
this restart. Would you use transactions to avoid performing
side-effects again?

>> 4) Dynamic code loading. Users will be able to 'branch' their own version of
>> the world and share it with others. There may be thousands of versions of a
>> class, and they need to be able to execute in separate sandboxes at the same
>> time. Source code will be pulled from a Git repository or some kind of
>> versioning database.

> Quite like this idea.

> You do have to deal with a bunch of (fairly well known) problems,
> which any specific implementation of dynamic code loading is going to
> need to solve (or not).  PyPy doesn't currently implement any
> hot-schema-change magic, and reloading has always been error prone in
> the presence of state.  First-class mutable types make it particularly
> difficult (there is no sensible answer to what it means to reload a
> Python class).

You might want to reuse the solutions to those issues used in the Java
(and maybe .NET) world. Java allows reloading a class in a different
classloader, and that has been used inside OSGi (see
http://www.osgi.org/About/Technology).
I am not sure what OSGi does here, but Java serialization lets you
serialize an instance of version 1 of a class and deserialize it
with version 2 of that class, provided v2 takes extra care to support
that; one could use this to convert existing instances.
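
A rough Python analogue of that idea, as a sketch only: instances are
saved with an explicit version tag, and the newer class upgrades old
state while loading. This is not Java's mechanism; the `Player` class,
its fields, and the version numbers are all made up for illustration.

```python
import json

SCHEMA_VERSION = 2  # hypothetical: version 2 adds an "inventory" field


class Player:
    def __init__(self, name, inventory=None):
        self.name = name
        self.inventory = inventory if inventory is not None else []

    def dump(self):
        """Serialize the instance together with its schema version."""
        return json.dumps({
            "version": SCHEMA_VERSION,
            "name": self.name,
            "inventory": self.inventory,
        })

    @classmethod
    def load(cls, text):
        """Deserialize, upgrading version-1 state to the current layout."""
        state = json.loads(text)
        if state["version"] == 1:
            # Version-1 instances had no inventory; start with an empty one.
            state["inventory"] = []
        return cls(state["name"], state["inventory"])


# State written by the (hypothetical) version-1 class.
old_blob = json.dumps({"version": 1, "name": "alice"})
player = Player.load(old_blob)
new_blob = player.dump()
```

The point is only that the new version of the class carries the
conversion logic, so existing instances can be migrated when they are
next loaded.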

> The one issue that interests me is where you implement the persistence
> boundary - do you go with orthogonal persistence and act as if
> everything is preserved, or assume all user code is run within some
> sort of (fast and loose) transaction that can be re-entered at will,
> providing an API for persistent data access?  The second case makes
> the reloading question a bit more reasonable, because you can always
> throw away the current delta and replay the external effects, assuming
> the interface for the external events hasn't changed significantly.

The key question is: when would you start and commit such transactions?

Apart from that, your idea looks very similar to Software
Transactional Memory (STM). STM restarts explicitly-marked
transactions in a thread when some other thread modifies the affected
data (which would be a data race) and commits its transaction. In your
case, a transaction is restarted when some other thread modifies the
involved code.
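
As a sketch of that analogy, under the assumption that code reloads
bump a global version counter: a transaction buffers its writes in a
delta, and commits only if the code version it started with is still
current; otherwise the delta is thrown away and the body is re-run.
`World`, `run_transaction`, and `turn` are hypothetical names, and the
"other thread" is simulated in a single thread to keep the example
deterministic.

```python
class CodeChanged(Exception):
    """Raised when a transaction keeps being invalidated by code reloads."""


class World:
    def __init__(self):
        self.code_version = 0  # bumped on every (hypothetical) code reload
        self.state = {}        # committed persistent data

    def reload_code(self):
        self.code_version += 1


def run_transaction(world, body, max_retries=3):
    """Run body(world, delta); commit delta only if the code did not change."""
    for _ in range(max_retries):
        seen_version = world.code_version
        delta = {}  # buffered writes, invisible until commit
        body(world, delta)
        if world.code_version == seen_version:
            world.state.update(delta)  # commit
            return delta
        # Code changed underneath us: discard the delta and retry.
    raise CodeChanged("transaction restarted too many times")


attempts = []
world = World()


def turn(world, delta):
    attempts.append(world.code_version)
    if len(attempts) == 1:
        world.reload_code()  # simulates a concurrent code edit
    delta["gold"] = world.state.get("gold", 0) + 10


run_transaction(world, turn)
```

Here the first attempt is discarded because the code changed while it
ran, and the second attempt commits. This leaves the question above
open: where a real system would place the start and commit points.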

Cheers,
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/