[pypy-dev] pypy-stm and lock-free data structures

Fri Oct 25 19:27:14 CEST 2013

Hi Armin:

On Friday, October 25, 2013 4:36 AM, Armin Rigo <arigo at tunes.org> wrote:

>Precisely my point.  You're coming with paper X describing Y and
>thinking "pypy-stm could do this", whereas I don't see any obvious
>connection between the two.  To repeat myself, pypy-stm has no GIL
>internally, but exposes to the user semantics that are the same as a
>GIL Python.

I understand your point. In regards to the Go race detector, I am interested in the collision detection aspects. For now this is secondary. The thing I am mostly interested in is avoiding fine grained locking and everything that entails. 

>To be on the clear side, you could use CPython and write C, with
>GIL-releasing code around the complicated lock-free algorithm; but the
>point of the lock-free algorithm is to be very fast, negating the
>advantage of releasing the GIL in the first place.  Is that correct?

Yes. I want a Python environment where I can test lock free algorithms without the GIL getting in the way of analysis. I believe there are tricks that essentially turn off the GIL but I would prefer it not being there. Maybe my reasoning is wrong?

>Then if you want to integrate it inside pypy-stm instead, it's more
>complicated than that: we don't want an algorithm (say a simple
>lock-free queue) without some behavior that is tailored to the stm
>world.  For example, if you implement only a queue, then: - pops will
>appear out of order if they are done in some order but the
>transactions commit in a different orders; - pops will be lost if they
>are done by a transaction that aborts; - pushes must be at least
>delayed until the transaction commits, otherwise we see bogus data
>from other threads.

Yes I can see these problems occurring if the lock-free implementation and the STM are not aware of one-another. However the underlying assumption is that both the STM/transaction model and Stackless API (i.e., send(), receive()) are available to the application programmer. And underlying Stackless and STM machinery are orthogonal (for example, Haskell STM uses its light-weight thread scheduler) . It may just so happen that the lower level pypy-stm machinery is being used to implement the Stackless API. It is way too early to say. And I don't know enough.

>So you need to think about it more deeply than just "pypy-stm is cool,
>let's use it for X".  You can't just think about pypy-stm as a
>lock-free Python, because that's only half the truth.  PyPy-stm is all
>about a special use of lock-free algorithms in order to give the user
>the illusion of the GIL.  I would say that it's not really the correct
>ground to experiment with lock-free structures in general --- it's a
>place where we can experiment with custom adaptations of some
>lock-free structures, done towards a specific goal that makes sense
>for the user of pypy-stm.

I understand your point. 
I tend to look at things from the perspective of a Stackless Python user.
I also want to be a pypy-stm user too!

I am going under the assumption that I can use STM feature in much the same way I use the PyPy stackless.py implementation: as an erector set for experimenting with new features. 

In the bigger scheme of things, I am buying an argument made in the Scalable Join Pattern paper. For this discussion, treat join pattern as synonymous with channels/message passing. The argument is that although efficient implementation of join patterns and the STM both involve a transactional aspect, STM is a more general solution, hence harder to optimise. 

Cheers,
Andrew
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pypy-dev/attachments/20131025/57ac0f70/attachment-0001.html>