[pypy-dev] 1.4.1 threading
p.giarrusso at gmail.com
Mon Dec 27 18:57:55 CET 2010
On Mon, Dec 27, 2010 at 07:00, Dima Tisnek <dimaqq at gmail.com> wrote:
> (dan) random binary trees: O(log2(n)) is 7.2 for builtins, though the
> constant factor for locks, etc, might make it worthwhile
"7.2"? What is that? Do you mean 7.2=log2(no. of builtins)?
> (paolo) non blocking hash maps: memory barriers can be quite costly too
Well, you just _cannot_ avoid them, whatever is your fancy data structure.
You can only move most of the cost (or all of it) to write operations
- which is helpful if reads are most common, a case which you discuss
in your mail.
== Why memory barriers are required (in more detail) ==
If you use persistent data structures, also called copy-on-write or
purely functional (and that's the most obvious way to use a tree in a
parallel setting), the whole synchronization problem is reduced to
exchanging a pointer (to the root of the tree) between a writer and a
Let's consider this in the Java memory model (the state of the art).
In this model the writer thread has to use an expensive StoreLoad
barrier, i.e. mfence on x86 , costing more or less as much as a
cache miss, while the reader needs just a cheap Load* barrier (for
free on x86, cheap on other processor).
If you don't use a volatile field, there is very little guarantee
about what your program means, and it is almost always buggy; the
equivalent in the new C++ memory model (which is otherwise very
similar) gives completely undefined semantics (and even crashes are
possible in practice, for instance if the reader thread sees a
not-fully-initialized object because of the race and invokes a virtual
method on it).
== Other stuff ==
> different options will have to be implemented and tested when we get there,
> speaking on which, is there a test dict load?
> does anyone have some profiling data, what portion of runtime is spent
> reading and writing attributes and globals anyways?
I'd be interested as well in such data.
> Anyhow, back to parallel interpretation, Is there an issue or page
> that tracks what's needed for parallel interpretation?
> so far mentioned: gc, dict, c api
> Btw, I think that jit is more important at the moment, but time comes
> when jit juice has been mostly squeezed out ;-)
Paolo Giarrusso - Ph.D. Student
More information about the Pypy-dev