[pypy-dev] Syntax for the 'transaction' module

Tue May 1 12:03:22 CEST 2012

In a message of Tue, 01 May 2012 11:27:56 +0200, Armin Rigo writes:
>* First question: 'transaction'.  The name is kind of bogus, because
>it implies that it must be based on transactional memory.  Such a name
>doesn't make sense if, say, you are running the single-core emulator
>version.  What the module is about is to give a way to schedule tasks
>and run them in some unspecified order.

The problem is that the csc namespace is overloaded with terms that
already mean 'make-a-noun out of the verb I want to do', and the
verb list that means 'do something' is a rather full namespace as well.

Often you can get what you want by dropping the noun-ification -- so
you would replace 'transaction' with 'transact'.  This will not help the
'transactional memory' association.

So how about 'schedule'?

My experience says that if you pick a verb, and not a noun, the code
often comes out cleaner, because otherwise you tend to write -- or the people
who use your library tend to write -- stateful
things by habit even when they are not needed because the language points
them that way.  It looks to me as if your design is rather stateless by
design, so maybe you want a verb and not a noun in any case.

>* How about replacing the global functions 'transaction.add()' and
>'transaction.run()' with a class, like 'transaction.Runner', that you
>need to instantiate and on which you call the methods add() and run().
> If moreover the class has an '__exit__()' that redirects to run(),
>then you can use it like this:
>
>    with transaction.Runner() as t:
>        for block in blocks:
>            t.add(do_stuff, block)
>
>And maybe, like concurrent.futures, a map() method --- although it
>seems to me like CPython was happily relegating the built-in map() in
>the corner of "advanced users only"; then adding a map() again seems a
>bit counter-productive --- it would look like that:
>
>    with transaction.Runner() as t:
>        t.map(do_stuff, blocks)
>
>* Note that the added transactions are only run on __exit__(), not
>when you call add() or map().  This might be another argument against
>map().

Or an argument to build a different sort of 'add me lots of things' 
function??  I am not sure this makes sense.  I am going kayaking
overnight now, I will think of this more while paddling.
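
A minimal sketch of what such an 'add me lots of things' function could look
like on the proposed Runner. This is a single-threaded stand-in, not the
actual 'transaction' module: the Runner class below is written from scratch
for illustration, and the name add_many() is a hypothetical choice, picked
because (unlike map()) it does not suggest that results come back right away.

```python
import random


class Runner:
    """Single-threaded stand-in for the proposed transaction.Runner."""

    def __init__(self):
        self._pending = []

    def add(self, func, *args):
        # Only record the call; nothing runs until run() / __exit__().
        self._pending.append((func, args))

    def add_many(self, func, iterable):
        # Hypothetical 'add me lots of things' helper: one add() per item.
        for item in iterable:
            self.add(func, item)

    def run(self):
        # Run everything in an unspecified order, including tasks that
        # running tasks add along the way.
        while self._pending:
            index = random.randrange(len(self._pending))
            func, args = self._pending.pop(index)
            func(*args)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only run the queue if the with-body finished without an error.
        if exc_type is None:
            self.run()
        return False


results = []
with Runner() as t:
    t.add_many(results.append, [1, 2, 3])
    queued_before_exit = list(results)  # still empty: nothing has run yet
```

Inside the with block the list is still empty; the three appends only happen
when the block exits, in whatever order the Runner picks.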

>* The point of the examples above is that "t" can also be passed
>around in the transactions, and t.add() called on it again from there.
> Also, such a syntax nicely removes the need for any global state, and
>so it opens the door to nesting: you can write one of the above
>examples inside code that happens to run itself as a transaction from
>some unrelated outer Runner().  Right now, you cannot call
>transaction.run() from within a transaction --- and it doesn't make
>sense because we cannot tell if the transaction.add() that you just
>did were meant to schedule transactions for the outer or the future
>inner run().  That's what is better with this proposed new API.
>(Supporting it requires more work in the current implementation,
>though.)

But, for what it is worth, it pleases my aesthetic sense to no end.
Which is why I want an 'add me lots of things' that works with this. :-)
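
The two properties described in the quoted paragraph (passing "t" into a task
so it can add more work, and nesting one Runner inside a task of another) can
be sketched as follows. The Runner here is again an assumed single-threaded
stand-in, and countdown() and nested_work() are made-up example tasks.

```python
class Runner:
    """Minimal stand-in for the proposed Runner (assumed, single-threaded)."""

    def __init__(self):
        self._pending = []

    def add(self, func, *args):
        self._pending.append((func, args))

    def run(self):
        # Keeps going until the queue is empty, so tasks may add tasks.
        while self._pending:
            func, args = self._pending.pop()
            func(*args)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.run()
        return False


log = []


def countdown(t, n):
    # A task that reschedules itself on the Runner it was handed.
    log.append(("outer", n))
    if n > 0:
        t.add(countdown, t, n - 1)


def nested_work(items):
    # Runs as a task of the outer Runner, yet owns an inner Runner.
    with Runner() as inner:
        for item in items:
            inner.add(log.append, ("inner", item))


with Runner() as outer:
    outer.add(countdown, outer, 2)
    outer.add(nested_work, ["a", "b"])
```

Because each add() names the Runner it targets, there is no ambiguity about
whether a task was meant for the outer or the inner run().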

>* Another issue.  I think by now that we need special support to mean:
>"I want to end a transaction, then non-transactionally call this C
>function that will likely block for some time, and when it returns, I
>want to start the next transaction".  This seems general enough to
>support various kinds of things, like calling select() or
>epoll_wait().  The question is what kind of support we want.
>
>I played with various ideas and I'll present the combination that
>satisfies me the most, but I'm open to any other suggestion.
>
>We could in theory support calling in-line the function, i.e. just
>call select() and it will break the current transaction in two.  This
>is similar to the fact that select() in CPython releases and
>re-acquires the GIL.  But it breaks the abstraction that every add()
>gives *one* transaction.  

transaction == 'set of things to schedule, atomically'? or something 
else?  Right now it is not clear to me why 'one' is important, probably
because I do not understand something.

>It kind of goes against the general design
>of the API so far, which is that you add() things to do, but don't do
>them right now --- they will be done later.  To voice it differently,
>I dislike this solution because you can break a working program just
>by adding a debugging "print" in the middle (assuming that "print"
>would also break the current transaction in two, like it releases the
>GIL in CPython).  It would break the program because what used to be
>in the same transaction, no longer is: random things (done by other
>unrelated transactions) can suddenly have happened because you added a
>"print".

I can agree that any program that breaks when you add a print is a
bear to debug -- I used to get this quite often with Borland's C++ and
it made me pull out my hair.

>The idea I'm playing with is two running modes: "breakable" vs
>"non-breakable".  Say you have a "@breakable" decorator that you have
>to add explicitly on some of your Python functions.  The transaction
>is breakable only when all functions in the call stack are @breakable.
> As soon as one non-breakable function is in the call stack, then the
>transaction is not breakable (to err on the side of safety).  No clue
>if this would make any sense to the user, though.  In the end a call
>to select() would either break the transaction in two (if the current
>mode is "breakable"), or, like now, in non-breakable mode it would
>turn the transaction inevitable (which is bad if the C call is
>blocking, because it blocks all other transactions too, but which is
>at least correct).

This seems upside down to me.  Do you want to decorate them as
breakable?  Or assume that they are breakable and decorate them as
unbreakable?  Maybe I am missing something crucial, but I think the
other way may be easier for newly written code.  Your way may be cooler
for porting existing things, though; I need to think more.
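
A rough sketch of the opposite polarity suggested here: code is assumed
breakable unless some function on the call stack is marked otherwise. Nothing
below is part of any real module; the names unbreakable and
blocking_call_mode() are hypothetical, and the "mode" is only reported as a
string rather than actually splitting a transaction.

```python
import functools
import threading

# Per-thread count of unbreakable functions currently on the call stack.
_state = threading.local()


def _depth():
    return getattr(_state, "unbreakable_depth", 0)


def unbreakable(func):
    """Mark func as a region where the transaction must not be split."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        _state.unbreakable_depth = _depth() + 1
        try:
            return func(*args, **kwargs)
        finally:
            _state.unbreakable_depth = _depth() - 1

    return wrapper


def blocking_call_mode():
    # What a blocking call such as select() would do at this point:
    # split the transaction, or fall back to turning it inevitable.
    return "break" if _depth() == 0 else "inevitable"


def helper():
    return blocking_call_mode()


@unbreakable
def critical():
    # helper() is undecorated, but an unbreakable caller is on the stack.
    return helper()
```

Here helper() called on its own reports "break", while the same helper()
reached through critical() reports "inevitable". The trade-off is that
unmarked code keeps the hazard from the quoted text: a blocking call added
for debugging can still split a transaction.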

>
>Thanks for reading all my ranting.  Ideas welcome...
>
>
>A bientôt
>
>Armin.

Laura