[pypy-dev] STM

Armin Rigo arigo at tunes.org
Fri Jan 6 15:23:32 CET 2012

Hi William,

On Fri, Jan 6, 2012 at 06:52, William ML Leslie
<william.leslie.ttg at gmail.com> wrote:
> This is the way it has been described, and how most common usages will
> probably look.  But I don't think there has ever been any suggestion
> that dynamic extent is the scope at which transactions *should* be
> implemented, any more than context managers are the the
> be-all-and-end-all solution for resource management.

I agree, but my issue is precisely that (say) gcc 4.7, as far as I can
tell, *requires* the dynamic extent to be a nested block.  There is no
way to ask gcc to produce code handling the lower-level but more
general alternative.  In other words, the expected most common case is
covered, but the general case is not; and, unfortunately, we really
need the general case to turn CPython's GIL into transactions, so we
can't even start experimenting at all.

But indeed, the rest of this mail describes something slightly
different, and that other approach *can* be implemented as a nested
block at the level of C.  So we are saved, even though it seems to be
a bit by chance.  (But maybe it's not by chance after all: see below.)
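To make the contrast concrete, here is a minimal Python sketch (all
names are hypothetical, not any existing API): the nested-block form
that gcc 4.7 exposes corresponds to a context manager, while the
general form needs free-standing begin/commit calls that can sit at
two unrelated points in the control flow.

```python
from contextlib import contextmanager

log = []

# Hypothetical primitives, standing in for what an STM runtime
# would provide; here they only record events.
def begin_transaction():
    log.append("begin")

def commit_transaction():
    log.append("commit")

# Nested-block form: the transaction's dynamic extent is a
# syntactic scope, entered and left in the same place.
@contextmanager
def atomic():
    begin_transaction()
    try:
        yield
    finally:
        commit_transaction()

with atomic():
    log.append("work in a nested block")

# General form: begin and commit are separate calls, free to
# appear at unrelated points in the code.
begin_transaction()
log.append("work between unrelated points")
commit_transaction()
```

The second form subsumes the first; the problem described above is
that the compiler only exposes the first.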

> The requirement to be a generator is clever, (...)

I think the relationship to Stackless can be made clearer: the
generator approach is basically the same as found in some event
packages.  If we move to Stackless or greenlets, then "the time
between two yields" is replaced with "the time between two switches",
but the basic idea remains the same (even though greenlets again allow
arbitrary functions to switch away, whereas plain generators impose
more discipline on the programmer).
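A hedged sketch of the generator idea (the names are made up): each
segment of a task between two yields is treated as one transaction.
The scheduler below runs the segments serially; an STM runtime could
run the per-segment transactions of different tasks in parallel.

```python
def task(name, results):
    # Each segment between two yields is, conceptually, one
    # transaction: no switch can happen inside a segment.
    results.append((name, "step1"))
    yield                     # transaction boundary
    results.append((name, "step2"))
    yield                     # transaction boundary

def run_transactional(generators):
    # Round-robin scheduler: each resumption of a generator runs
    # exactly one segment, i.e. one "transaction".
    pending = list(generators)
    while pending:
        still_running = []
        for gen in pending:
            try:
                next(gen)     # run until the next yield
                still_running.append(gen)
            except StopIteration:
                pass
        pending = still_running

results = []
run_transactional([task("a", results), task("b", results)])
```

The same structure reappears with Stackless or greenlets, with
explicit switches playing the role of the yields.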

Most probably, we can generalize this example: the approach should
work with any event-like system, such as Twisted's or pygame's.  The
general idea is to have a main loop that calls pending events;
assuming that, often enough, several independent events are waiting to
be processed, they can be processed in parallel, with one transaction
each.
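Here is a hypothetical sketch of that main loop (the class and method
names are assumptions, not an existing API).  The "transactions" run
serially here; an STM runtime could run each batch in parallel and
retry the events that turn out to conflict.

```python
class EventLoop:
    # Toy main loop: collects pending events, then dispatches
    # each one as (conceptually) an independent transaction.
    def __init__(self):
        self.pending = []

    def add_event(self, func, *args):
        self.pending.append((func, args))

    def run(self):
        while self.pending:
            batch, self.pending = self.pending, []
            for func, args in batch:
                func(*args)   # conceptually: one transaction each

counters = {"a": 0, "b": 0}
loop = EventLoop()
loop.add_event(counters.__setitem__, "a", 1)
loop.add_event(counters.__setitem__, "b", 2)
loop.run()
```

The point is that the program itself never mentions threads; the
parallelism would live entirely inside the loop's dispatch step.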

This may be a good approach: it gives the power of multiple processors
to (even existing) programs that are *not* written to use multiple
threads at all, so they remain free from all the dangers of
multithreading.

> The other case was a function to call to commit the transaction (and
> start a new one?).  I would like to think that you shouldn't be able
> to commit a transaction that you don't know about (following
> capability discipline)

That other approach relies on the assumption that "a transaction" is
not really the correct point of view.  There are cases where you don't
clearly have such a transaction as the central concept.  The typical
example is CPython with a transactional GIL.  In this approach, a
transaction corresponds to whatever runs between the last time the GIL
was acquired and the next time it is released; i.e. between two points
in time that are rather unrelated to each other.  In this model it
doesn't make sense to put too much emphasis on "the transaction" by
itself.  Instead, you have a clearer pairing between the end of a
transaction (releasing the GIL) and the start of the next one
(re-acquiring it).  Also, in this model it doesn't even make sense to
think about nesting transactions.
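The mapping can be sketched like this (hypothetical primitive names):
acquiring the GIL begins a transaction and releasing it commits one,
so the program sees only a flat alternation of boundaries, with no
block structure and no nesting.

```python
events = []

# Stand-ins for the STM primitives; they only record events here.
def stm_begin():
    events.append("begin")

def stm_commit():
    events.append("commit")

# In the transactional-GIL model the GIL operations map directly
# onto transaction boundaries.
def gil_acquire():
    stm_begin()

def gil_release():
    stm_commit()

gil_acquire()                    # interpreter starts running bytecodes
events.append("bytecodes")
gil_release()                    # e.g. around a blocking system call
gil_acquire()
events.append("more bytecodes")
gil_release()
```

Note that the paired operations (a commit followed by the next begin)
sit at points chosen by the interpreter, not at the boundaries of any
syntactic block.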

But I'm not saying that this is the perfect model, or even that it
makes real sense.  It seems to let one CPython interpreter run
internally on multiple threads when the programmer requests multiple
threads, so it seems the most straightforward solution; but maybe it
is not, simply because "import thread" may not be the "correct"
long-term solution for the programmer.  There may be a relation
between this --- transactions in the interpreter but normal threads
for the Python programmer --- and the unusual requirement of
non-nested-scope-like transactions in C.

> The reason I bring this up is that even though you implement
> transaction handling with its own special llop, you'd never sensibly
> model this with a generator.  If you were limited to generators, you'd
> not be able to implement this in the llinterp, or the blackhole
> interpreter either without sufficient magic.

I'm not sure it changes anything if we take an approach that doesn't
allow nested transactions.  The difference seems to be only in whether
you have to pass the transaction around as an object, or whether the
transaction lives in some global thread-local variable.  I have to say
that I don't really see the point of nested transactions so far, but
that may only be because I've adopted too much the point of view of
"CPython+transactional GIL"; if it's not the correct one after all, I
need to learn more :-)
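The thread-local alternative can be sketched in a few lines (a toy
model, with made-up names): the current transaction is implicit
per-thread state, so no function signature needs to carry a
transaction object around.

```python
import threading

# The current transaction lives in a thread-local slot instead of
# being passed from function to function.
_state = threading.local()

def begin_transaction():
    _state.txn = {"writes": []}   # toy transaction record

def record_write(value):
    # Any function can reach the current transaction implicitly.
    _state.txn["writes"].append(value)

def commit_transaction():
    txn, _state.txn = _state.txn, None
    return txn["writes"]

begin_transaction()
record_write("x = 1")
record_write("y = 2")
committed = commit_transaction()
```

With this shape there is exactly one current transaction per thread,
which is why nesting doesn't naturally arise in the model.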

A bientôt,

