
Hi all, For those of you who missed it: the proposal on Software Transactional Memory is launched ("remove the GIL but give a better way to write multi-threaded applications"). It is available at http://pypy.org/tmdonate.html . I plan to ask the PSF for some participation, following the push towards doing so at the PyCon PSF meeting. See http://mail.python.org/pipermail/python-dev/2012-March/117680.html , "Funding from the Python Software Foundation". A bientôt, Armin.

Hi Armin:

I looked over the work plan. Cool! Some questions and comments.

1. rstm library: I have looked at the rstm library and wrote the "hello world" of transactional programming: the bank account example. I also played with your STM stuff a few months ago. A little while back, I started to go through your code to see what strategies you are using and the general architecture (hence the last batch of questions). I didn't get far, but I want to resume real soon. If folks can tolerate scanned hand drawings (I am terrible with illustration software), I could post what I do, along with the references I use while looking at the code (i.e., the "Software Transactional Memory" book, 2nd edition, by Tim Harris et al.). I don't know if this is helpful?

2. User interface: I am familiar with the AME papers. The AME literature seems to taper off. Have you tried talking to the various authors, or getting your hands on an implementation? For the join papers, I talked to the Microsoft researcher involved and he was helpful. To further understand what you are doing and get a feel for what STM would look like, I wanted to write Python versions of the STAMP examples. Perhaps this would be a good way to craft an API in parallel with other work?

3. Half-baked ideas: The reason I encountered the STM literature is that I used stackless.py to prototype join patterns. One of these days, I'll finish that work and post it. After talking to the authors of "Scalable Join Patterns," I was referred to the work of John Reppy on Concurrent and Parallel ML. Polyphonic C# has something that operates like channels, and Concurrent ML channels are very much like Stackless Python channels in the sense that they are synchronous. The tie-in to STM is that both of those implementations use a combination of STM and lock-free algorithms under the hood for efficiency. What makes this attractive for, say, a future Stackless Python is that the message-passing model usually lends itself to a style where the programmer naturally knows how to partition their data amongst coroutines. STM is used under the hood to implement the channels and other associated constructs. Hopefully this would 1) lead to a smaller transactional footprint, 2) totally hide STM from the programmer, and 3) use an already existing API.

Thoughts?

Cheers,
Andrew

Hi Andrew, On Mon, Mar 26, 2012 at 17:28, Andrew Francis <andrewfr_ice@yahoo.com> wrote:
Indeed, and it was around 2007, so I expect the authors to have been involved in completely different things for quite some time now... But I could try to contact them anyway.
Yes for 2) and 3): that's precisely where I'm going. Right now I'm focusing on Twisted instead of Stackless (with help from J.P. Calderone), but the idea is the same: hide it totally from the end programmer by using the existing API. Only the one tweaking the Twisted internals needs to know about it. The same would apply to Stackless. A bientôt, Armin.

Hi Armin:

________________________________
From: Armin Rigo <arigo@tunes.org>
To: Andrew Francis <andrewfr_ice@yahoo.com>
Cc: PyPy Developer Mailing List <pypy-dev@python.org>
Sent: Wednesday, March 28, 2012 5:08 AM
Subject: Re: The Work Plan Re: [pypy-dev] STM proposal funding

Hi Andrew, On Mon, Mar 26, 2012 at 17:28, Andrew Francis <andrewfr_ice@yahoo.com> wrote:
AF> I am familiar with the AME papers. AME literature seems to taper
AF> off. Have you tried talking to the various authors? Or get your
AF> hands on an implementation.
Communication is good :-)

AF> STM is used under the hood to implement the channels and other
AF> associated constructs. Hopefully this would 1) lead to a smaller
AF> transactional footprint. 2) Totally hide STM from the programmer. 3)
AF> Use an already existing API.
Cool. I use Twisted!
I don't know if Christian is following this conversation.... My PyPy knowledge is still sketchy, but I am changing that. I do understand the Twisted reactor model (thanks to my 2008 PyCon talk), so I could follow discussions in that area. Is this discussed on IRC?

I would *love* to engage in a similar effort on the Stackless side. At first, things would be slow on my side. One of the tasks I would need to do is better examine how threads interact with the Stackless scheduler (an area I seldom look at, although I follow the discussions on the Stackless mailing list; I may have misconceptions that could readily be cleared up). The low-hanging fruit would be 1) rewriting some of the STAMP applications to use Stackless, and 2) exposing some of the lock-free mechanisms -- this would get me deeper into the FFI.

On the wild-idea front: a while back, I took a graduate research course in distributed algorithms. There I learnt about Nancy Lynch's I/O automaton model and encountered synchronisers (not that I am an expert). A modified version of the I/O automaton model is used to develop proofs for transactional algorithms. For this discussion, a scheduler and a transaction manager can each be thought of as a synchroniser. I have a gut feeling that somehow the scheduler and the transaction manager can be unified. I need to revisit the literature.

Cheers,
Andrew

Hi Andrew, hi all, On Wed, Mar 28, 2012 at 18:23, Andrew Francis <andrewfr_ice@yahoo.com> wrote:
I'm also thinking about writing a short paper collecting things I have said and thought in various blog posts. A kind of "position paper". What do others think of this idea?
This is not discussed a lot right now. But it is apparently relatively easy to adapt the epoll-based Twisted reactor to use the 'transaction' module. (Again, this module is present in the stm-gc branch; look for lib_pypy/transaction.py for the interface, and pypy/module/transaction/* for the Python implementation on top of STM as exposed by RPython.) This 'transaction' module is also meant to be used directly, for example in this kind of Python code:

    for n in range(...):
        do_something(n)

If each call to do_something() has "reasonable chances" to be independent from other calls, and if the order doesn't matter, then it can be rewritten as:

    for n in range(...):
        transaction.add(do_something, n)
    transaction.run()

In addition, each transaction can add more transactions that will be run after it. So if you want to play with lib_pypy/stackless.py to add calls to 'transaction', feel free :-) Maybe it will show that a slightly different API is required from the 'transaction' module; I don't really know so far.

A bientôt, Armin.
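For readers without the stm-gc branch checked out, here is a minimal sequential sketch of the interface described above. The real lib_pypy/transaction.py is the authoritative version; this stand-in only mimics the API (add/run, and transactions adding further transactions) and runs the calls one after the other, with no parallelism or conflict detection.

    # Sequential stand-in for the 'transaction' interface sketched above.
    import collections

    _pending = collections.deque()

    def add(func, *args, **kwds):
        "Schedule func(*args, **kwds) to run as one transaction."
        _pending.append((func, args, kwds))

    def run():
        "Run scheduled transactions until none are left (here: one by one)."
        while _pending:
            func, args, kwds = _pending.popleft()
            func(*args, **kwds)      # a transaction may call add() again

    # Usage, following the example above:
    def do_something(n):
        print("doing", n)

    for n in range(10):
        add(do_something, n)
    run()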

On 4/1/12 11:23 AM, Armin Rigo wrote:
Hi A(rmin|ndrew),

it is funny how familiar this code looks, rewriting it in terms of Stackless and tasklets:

'''
    for n in range(...):
        tasklet(do_something)(n)
    stackless.run()

In addition, each tasklet can add more tasklets that will be run after it. So if you (...)
'''

Well, sure, it is not much more than a simple match, and the tasklets are more like sequences of transactions: when they give up their computation by stackless.schedule(), that adds them back to the pool.

Anyway, the underlying ideas have similarities that make me think a lot. Thinking of Stackless, I was looking for a way to isolate tasklets in a way to let them run in parallel, as long as they are independent. In STM, independence is enforced, currently at a relatively high price. If Stackless were able to provide some better isolation by design, maybe there could be a hybrid approach, where most operations would not need to rely on STM all the time?

Just rolling ideas out -- Chris

--
Christian Tismer :^) <mailto:tismer@stackless.com>
tismerysoft GmbH : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/
14482 Potsdam : PGP key -> http://pgp.uni-mainz.de
work +49 173 24 18 776 mobile +49 173 24 18 776 fax n.a.
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

Hi Christian, On 04.04.2012, Christian Tismer <tismer@stackless.com> wrote:
it is funny how familiar this code looks, re-writing it in terms of Stackless and tasklets:
Yes, that similarity is not accidental :-) It looks like it's "just" a matter of hacking at lib_pypy/stackless.py to use ``transaction``. But as I said it's possible that some changes to the API of ``transaction`` would be needed for an easier mapping. Feel free to try it out.
Maybe. It's a language design question though, and based on future technical work, so I'll not consider it for now. The point of STM is that it allows anything, and once we have a good implementation, we can start tweaking other aspects (like the language) to improve performance. A bientôt, Armin.
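One possible shape of the "hacking at lib_pypy/stackless.py" mentioned above, sketched with generators standing in for tasklets. Everything here except the transaction.add/run interface from the previous message is hypothetical and is not the actual PyPy code: each slice of a tasklet between two schedule points is handed to the transaction scheduler as one transaction, which matches Christian's observation that tasklets behave like sequences of transactions.

    import collections

    class transaction(object):            # tiny sequential stand-in, as before
        _pending = collections.deque()

        @classmethod
        def add(cls, func, *args):
            cls._pending.append((func, args))

        @classmethod
        def run(cls):
            while cls._pending:
                func, args = cls._pending.popleft()
                func(*args)

    def spawn(tasklet_fn, *args):
        "Start a 'tasklet'; every slice of it is scheduled as a transaction."
        gen = tasklet_fn(*args)

        def step():
            try:
                next(gen)                 # run until the next yield
            except StopIteration:
                return                    # tasklet finished
            transaction.add(step)         # reschedule the next slice

        transaction.add(step)

    def worker(name, n):
        for i in range(n):
            print(name, "slice", i)
            yield                         # plays the role of stackless.schedule()

    spawn(worker, "A", 3)
    spawn(worker, "B", 2)
    transaction.run()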

Hi Christian:

________________________________
From: Christian Tismer <tismer@stackless.com>
To: Armin Rigo <arigo@tunes.org>
Cc: Andrew Francis <andrewfr_ice@yahoo.com>; "stackless@stackless.com" <stackless@stackless.com>; PyPy Developer Mailing List <pypy-dev@python.org>
Sent: Wednesday, April 4, 2012 6:47 AM
Subject: Re: [pypy-dev] The Work Plan Re: STM proposal funding

...
Anyway, the underlying ideas have similarities that make me think a lot.
Thinking of Stackless, I was looking for a way to isolate tasklets in a way to let them run in parallel, as long as they are independent.
In STM, independence is enforced, currently at a relatively high price.
Just rolling ideas out -- Chris
The idea I like the most is to use STM and lock-free algorithms for the implementation of the channels themselves. Again, the Scalable Join Patterns and Parallel ML papers are the inspiration for this approach. In contrast, I have looked at Go's channel implementation, and it has to do stuff like sorting to get the correct locking order. What I like is that this approach assumes that Stackless programmers know how to write programmes that are fairly isolated. One could experiment with this approach using the low-level rstm module, or with prototypes written in C using existing STM and lock-free libraries.

Cheers,
Andrew
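A rough sketch of that idea, with an ordinary lock standing in for the STM/lock-free machinery. The `atomic` object and the Channel class below are hypothetical and not the rstm API: the point is only that the whole sender/receiver matching step is a single atomic region, so independent channels never need a global locking order of the kind Andrew mentions Go having to compute.

    import collections
    import threading

    atomic = threading.RLock()            # stand-in for an STM atomic section

    class Channel(object):
        "Synchronous (rendezvous) channel; the matching step is one atomic block."

        def __init__(self):
            self._senders = collections.deque()    # queued (value, event) pairs
            self._receivers = collections.deque()  # queued (slot, event) pairs

        def send(self, value):
            with atomic:                  # match a waiting receiver, or queue up
                if self._receivers:
                    slot, event = self._receivers.popleft()
                    slot.append(value)
                    event.set()
                    return
                event = threading.Event()
                self._senders.append((value, event))
            event.wait()                  # block until a receiver takes the value

        def receive(self):
            with atomic:
                if self._senders:
                    value, event = self._senders.popleft()
                    event.set()           # unblock the matched sender
                    return value
                slot, event = [], threading.Event()
                self._receivers.append((slot, event))
            event.wait()
            return slot[0]

Under a real STM, the "with atomic:" region would be an optimistic transaction rather than a lock, so sends and receives on unrelated channels would not serialise against each other at all.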

Hi Armin:

________________________________
From: Armin Rigo <arigo@tunes.org>
To: Andrew Francis <andrewfr_ice@yahoo.com>
Cc: PyPy Developer Mailing List <pypy-dev@python.org>; "stackless@stackless.com" <stackless@stackless.com>
Sent: Sunday, April 1, 2012 5:23 AM
Subject: Re: The Work Plan Re: [pypy-dev] STM proposal funding
I think this is a great idea.
Yes, I can see transaction.add() being a wrapper/decorator for tasklet creation. However, I think that is the easy part. I'm trying to reason about the behaviour.

Starting with a simple Stackless programme:

1) All tasklets run on top of a single OS thread.
2) Tasklets do not yield until they are ready to commit, that is, they do not call schedule() or block on a channel.
3) Tasklets do not share state, or variables are immutable (because of #1 and #2, this isn't important).

This is a natural transaction.

A more complicated but somewhat contrived scenario:

1) Tasklets are still running over a single OS thread.
2) Tasklets yield.
3) Tasklets share state.

    def transactionA(account1, account2):
        account1.fromAccount -= 50
        if someRandomFunction():
            schedule()
        account2.toAccount += 50

    def transactionB(account1, account2):
        t = account1.fromAccount * .1
        account1.fromAccount -= t
        if someRandomFunction():
            schedule()
        account2.toAccount += t

Since the tasklets yield, this opens the door to race conditions. I need to look at how the underlying rstm module works to see how easy it would be to detect conflicts amongst tasklets.

Another scenario:

    def transferTasklet(ch):
        ....
        while someFlag:
            toAccount, fromAccount, amount = ch.receive()
            # transaction start
            fromAccount.withdraw(amount)
            toAccount.deposit(amount)
            # transaction end

Question: without specific transaction_start() and transaction_commit() calls, how does rstm know what the start and finish of transactions are?

Cheers,
Andrew
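On the question of detecting conflicts amongst interleaved tasklets, the usual criterion from the STM literature can be written down directly. This is a hedged sketch, not rstm's actual mechanism (which validates incrementally while transactions run): record each transaction's read set and write set, and report a conflict when one transaction's writes overlap the other's reads or writes.

    def conflicts(t1, t2):
        "t1 and t2 are dicts with 'reads' and 'writes' sets of locations."
        return bool(t1["writes"] & (t2["reads"] | t2["writes"]) or
                    t2["writes"] & t1["reads"])

    # In the bank-account scenario above, both tasklets write
    # account1.fromAccount, so their write sets overlap: one of them must
    # be rolled back and retried after the other commits.
    tA = {"reads": {"account1.fromAccount"},
          "writes": {"account1.fromAccount", "account2.toAccount"}}
    tB = {"reads": {"account1.fromAccount"},
          "writes": {"account1.fromAccount", "account2.toAccount"}}
    assert conflicts(tA, tB)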

Hi Andrew, On Sun, Apr 8, 2012 at 01:25, Andrew Francis <andrewfr_ice@yahoo.com> wrote:
Question: without specific transaction_start() and transaction_commit() calls, how does rstm know what the start and finish of transactions are?
Please take a different point of view: if the proper adaptation is done, the users of the "stackless" module don't need any change. They will continue to work as they are, with the same semantics as today --- with the exception of the ordering among tasklets, which will become truly random. This can be achieved by hacking at a different level. A bientôt, Armin.

Hi Armin:

________________________________
From: Armin Rigo <arigo@tunes.org>
To: Andrew Francis <andrewfr_ice@yahoo.com>
Cc: PyPy Developer Mailing List <pypy-dev@python.org>; "stackless@stackless.com" <stackless@stackless.com>
Sent: Wednesday, April 11, 2012 8:29 AM
Subject: Re: The Work Plan Re: [pypy-dev] STM proposal funding

On Sun, Apr 8, 2012 at 01:25, Andrew Francis <andrewfr_ice@yahoo.com> wrote:
AF> Question: without specific transaction_start() and transaction_commit()
AF> calls, how does rstm know what the start and finish of transactions are?
Please take a different point of view: if the proper adaptation is done, the users of the "stackless" module don't need any change.
Yes. Suggestion: perhaps in a future position paper, you can state as a design principle that the programmer does not explicitly state a commit point?

So returning to the example, a programme more in the spirit of what you are doing is:

    def transferTasklet(ch):

        def transfer(toAccount, fromAccount, amount):
            fromAccount.withdraw(amount)
            toAccount.deposit(amount)

        while someFlag:
            toAccount, fromAccount, amount = ch.receive()
            transaction.add(transfer, toAccount, fromAccount, amount)

    if __name__ == "__main__":
        ch = stackless.channel()
        task = stackless.tasklet(transferTasklet)(ch)
        transaction.add(task)
        ....
        """ let us assume that the stackless scheduler and
            transaction manager are somehow integrated """
        stackless.run()

and we assume that the underlying system "magically" takes care of stuff. However, the programmer does have to throw the transaction manager a bone (hence transaction.add()).
Noted. I think issues of ordering (serialisation) are a consequence of a correctly implemented transaction manager. If I understand your strategy, the approach is to give a Python developer a race-free programme with minimum effort. However, I think a major concern would be implementations that minimise conflicts/contention. As I stated in previous posts, I believe that in the case of Stackless, the message-passing system is a natural way to give the application programmer more control over minimising conflicts/contention (or whatever term the literature uses).
This can be achieved by hacking at a different level.
I am interested in what this hacking looks like under the hood. Again, I am assuming as a design guideline (and to focus folks' efforts) that one should provide the rstm module with just enough information to work, but make no assumptions about how it works. So one ought not to depend on whether strategies like eager or lazy evaluation are used (right now, I think your system uses lazy evaluation, since I see stuff like redo logs). Still, I am interested in questions like: would rstm function correctly if the underlying implementation tracked tasklets rather than threads?

I am looking at your implementation and the AME.c/h file in Duhton. I want to get into a position to do some hacking. So I want to think about: What do we know about the relationship between transactions, tasklets and threads? Under what conditions in a Stackless application would conflicts occur? What could be done by 1) the programmer and 2) the STM implementation to avoid this?

Some side notes: I was reading about the Haskell implementation in the "Software Transactional Memory" book, 2nd edition (chapter 4.6.2, conditional synchronization). It seems that Haskell has a user-space "thread" library that is somewhat integrated with the STM. I am going to look at this more, since this is something that Stackless could take advantage of.

Salut,
Andrew
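The redo logs mentioned above correspond to lazy versioning: writes are buffered privately and only applied at commit time. A toy sketch follows, with a plain lock standing in for the real commit-time machinery; nothing here is rstm's actual code, and the value-based validation is a deliberate simplification of the version numbers a real STM would use.

    import threading

    _commit_lock = threading.Lock()       # stand-in for commit-time validation

    class Transaction(object):
        "Lazy versioning: writes go to a private redo log, applied at commit."

        def __init__(self, heap):
            self.heap = heap              # shared dict: location -> value
            self.redo_log = {}            # buffered writes, invisible to others
            self.observed = {}            # values seen by reads, checked at commit

        def read(self, name):
            if name in self.redo_log:     # a transaction sees its own writes
                return self.redo_log[name]
            value = self.heap[name]
            self.observed.setdefault(name, value)
            return value

        def write(self, name, value):
            self.redo_log[name] = value   # buffered, not yet visible

        def commit(self):
            with _commit_lock:
                for name, seen in self.observed.items():
                    if self.heap[name] != seen:
                        return False      # someone else changed it: retry
                self.heap.update(self.redo_log)   # replay the redo log
                return True

    # Usage: the withdraw/deposit pair becomes visible atomically, or not at all.
    heap = {"from": 100, "to": 0}
    t = Transaction(heap)
    t.write("from", t.read("from") - 50)
    t.write("to", t.read("to") + 50)
    assert t.commit() and heap == {"from": 50, "to": 0}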

Hi Andrew, On Sun, Apr 15, 2012 at 18:07, Andrew Francis <andrewfr_ice@yahoo.com> wrote:
So returning to the example, a programme more in the spirit of what you are doing is:
No, you are still missing my point. The goal is to work on existing stackless programs, not to write custom programs that combine "stackless" and "transaction" explicitly. Yes, I'm saying that _any_ existing stackless-based program can be run on multiple cores (assuming common causes of conflicts are found and removed). It doesn't make much sense to go into details now, because without an actual implementation we can't really know for sure what the common causes of conflicts will be --- not to mention, I don't like to talk endlessly just to try to convince people that it's actually even possible :-) A bientôt, Armin.

Hi Armin, On 04/16/2012 11:34 AM Armin Rigo wrote:
Do you want to turn an arbitrary number of cores into a virtual uniprocessor and then use a "yield" (a la OCM) as a kind of logical critical-section marker for seamless coroutine-like transfer between one section (the space between yields) and another such section in another process? IOW, everything between yield executions is logically guaranteed serial within and atomic as a whole? And everything becomes a speculatively executed critical section, with potential rollback and retry?

For removing "common causes of conflicts", are you thinking of static analysis to separate yield-demarked thread sections into probability-of-contention categories, for different strategies of scheduling execution by actual cores? Or maybe not just static analysis, but also JIT-like tracing of contention, and dynamically rewriting with a mutex to serialize where conflict detection/rollback/retry thrashes? Does that seem possible?

How do you avoid being overprotective? E.g., if x, y, z are being written, and x, y *have* to be atomically coherent, like coordinates, but z can be any version, should the programmer write yields tightly around the coherent outputs, e.g., "... yield; x=foo(); y=bar(); yield; z=baz() ...", so as not to imply that z must also be synced with x, y? Should there not be some way of programming atomicity of noun sets as well as verb sequences? E.g., what do the new concepts do for supporting an atomic class concept that would inherit from some base class guaranteeing atomic execution of methods and thus synced access to guaranteed-coherent attributes? (Which probably exists with locks etc. already, but I am wondering what the new concepts bring to it.)

Really curious what you have up your sleeve. OTOH, I understand that you probably have experiments and thoughts under way that you'd rather work on than discuss or explain, so back to lurking for some decent interval ;-)

Regards,
Bengt Richter

Hi Bengt,

I feel like I have actually already explained over and over again what I am doing, but it's true that such communication has been going on on various channels. So, here is the documentation I've got so far: https://bitbucket.org/pypy/pypy/raw/stm-gc/lib_pypy/transaction.py

In particular, this is not OCM, because it doesn't work with explicit yields. It typically has longer transactions; the essential point is that it is not possible to call a random function that will unexpectedly break the transaction in two parts by calling "yield".

A bientôt, Armin.

participants (5)
- Andrew Francis
- Armin Rigo
- Bengt Richter
- Christian Tismer
- Maciej Fijalkowski