[pypy-dev] Would the following shared memory model be possible?
fijall at gmail.com
Thu Jul 29 10:55:25 CEST 2010
On Thu, Jul 29, 2010 at 10:50 AM, William Leslie
<william.leslie.ttg at gmail.com> wrote:
> On 29 July 2010 18:02, Maciej Fijalkowski <fijall at gmail.com> wrote:
>> On Thu, Jul 29, 2010 at 9:57 AM, William Leslie
>> <william.leslie.ttg at gmail.com> wrote:
>>> I claim that there are two alternatives in the face of one thread
>>> mutating an object and the other observing:
>>> 0. You can give up consistency and do fine-grained locking, which is
>>> reasonably fast but error prone, or
>>> 1. Expect python to handle all of this for you, effectively not making
>>> a change to the memory model. You could do this with implicit
>>> per-object locks which might be reasonably fast in the absence of
>>> contention, but not when several threads are trying to use the object.
>>> Queues already are in a sense your per-object-lock,
>>> one-thread-mutating, but usually one thread has acquire semantics and
>>> one has release semantics, and that combination actually works. It's
>>> when you expect to have a full memory barrier that is the problem.
>>> Come to think of it, you might be right Kevin: as long as only one
>>> thread mutates the object, the mutating thread never /needs/ to
>>> acquire, as it knows that it has the latest revision.
>>> Have I missed something?
>>> William Leslie
>> So my question is why you think 1. is really expensive (can you find
>> evidence?). I don't see what it has to do with cache misses. Besides,
>> in Python you cannot guarantee much about mutability of objects. So
>> you don't know if an object passed in a queue is mutable or not, unless
>> you restrict yourself to some very simple types (in which case there
>> is no shared memory, since you only pass immutable objects).
> If task X expects that task Y will mutate some object it has, it needs
> to go back to the source for every read. This means that if you do use
> mutation of some shared object for communication, it needs to be
> synchronised before every access. What this means for us is that every
> read from a possibly mutable object requires an acquire, and every
> write requires a release. It's as if every reference in the program is
> implemented with a volatile pointer. Even if the object is never
> mutated, there can be a lot of unnecessary bus chatter waiting for
> MESI to tell us so.
I do agree there is an overhead. Can you provide some data on how large
this overhead is? Python is not a very simple language, and a lot of
things are complex and time-consuming, so I wonder how it compares to
locking per object.
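
One rough way to start gathering such data is a micro-benchmark of the
uncontended case. This is only a sketch, not a measurement from this thread:
the class names `Plain` and `Locked` and the helper `measure` are made up for
illustration, the per-object lock is modelled with an ordinary
`threading.Lock`, and it says nothing about contention between threads or
about cache-coherency traffic on real hardware.

```python
import threading
import timeit


class Plain:
    """Object whose attribute is read with no synchronisation."""

    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


class Locked:
    """Same object, but every read takes a hypothetical per-object lock."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def get(self):
        # Acquire before the read, release after it.
        with self._lock:
            return self.value


def measure(obj, number=200_000):
    """Return the seconds taken by `number` uncontended calls to obj.get()."""
    return timeit.timeit(obj.get, number=number)


if __name__ == "__main__":
    plain, locked = Plain(42), Locked(42)
    print(f"plain : {measure(plain):.4f}s")
    print(f"locked: {measure(locked):.4f}s")
```

The difference between the two timings gives a first estimate of what an
implicit lock on every read might cost relative to the interpreter's own
overhead; a contended variant with several reader threads would be needed to
say anything about the case William describes.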