Exploiting Dual Core's with Py_NewInterpreter's separated GIL ?
rridge at csclub.uwaterloo.ca
Tue Nov 7 13:03:03 CET 2006
"Martin v. Löwis" <martin at v.loewis.de> writes:
> Ah, but in the case where the lock# signal is used, it's known that
> the data is not in the cache of the CPU performing the lock operation;
> I believe it is also known that the data is not in the cache of any
> other CPU. So the CPU performing the LOCK INC sequence just has
> to perform two memory cycles. No cache coherency protocol runs
> in that case.
Paul Rubin wrote:
> How can any CPU know in advance that the data is not in the cache of
> some other CPU?
In the case where the LOCK# signal is asserted the area of memory
accessed is marked as being uncacheable. In an SMP system all CPUs must
have the same mapping of cached and uncached memory or things like this
break. In the case where the LOCK# signal isn't used, the MESI
protocol informs the CPU of which of its cache lines might also be in
the cache of another CPU.
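
To make the MESI point concrete, here is a rough toy model of a single
cache line tracked across several CPUs. It is only a sketch: the names
(ToyBus, line_may_be_shared) are made up for illustration, write-back of
modified data is left out, and real hardware (including variants such as
MOESI or MESIF) is more involved. The idea it shows is that a CPU can tell
from its own line state alone whether another cache might hold the line.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # only copy, dirty
    EXCLUSIVE = "E"  # only copy, clean
    SHARED = "S"     # other caches may hold a copy
    INVALID = "I"    # this cache does not hold the line


class ToyBus:
    """Hypothetical model of one cache line as seen by n_cpus caches."""

    def __init__(self, n_cpus):
        self.states = [State.INVALID] * n_cpus

    def read(self, cpu):
        """CPU reads the line; on a miss it snoops the other caches."""
        if self.states[cpu] is not State.INVALID:
            return  # cache hit, state unchanged
        holders = [i for i, s in enumerate(self.states)
                   if i != cpu and s is not State.INVALID]
        if holders:
            # Someone else has it: every holder (and the reader) ends up Shared.
            for i in holders:
                self.states[i] = State.SHARED
            self.states[cpu] = State.SHARED
        else:
            # Nobody else has it: the reader gets it Exclusive.
            self.states[cpu] = State.EXCLUSIVE

    def write(self, cpu):
        """CPU writes the line; returns True if other caches had to be told."""
        needs_bus = self.states[cpu] in (State.SHARED, State.INVALID)
        if needs_bus:
            # Invalidate every other copy before taking ownership.
            for i in range(len(self.states)):
                if i != cpu:
                    self.states[i] = State.INVALID
        # From Exclusive or Modified no other cache holds the line,
        # so the write proceeds without any bus traffic.
        self.states[cpu] = State.MODIFIED
        return needs_bus

    def line_may_be_shared(self, cpu):
        """What the CPU can conclude from its own state for a line it holds."""
        if self.states[cpu] is State.INVALID:
            raise ValueError("CPU does not hold the line")
        return self.states[cpu] is State.SHARED
```

In this model an increment of a line held Exclusive or Modified never
touches the bus, while a line held Shared forces the other copies to be
invalidated first.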
> OK, this is logical, but it already implies a cache miss, which costs
> many dozen (100?) cycles. But this case may be uncommon, since one
> hopes that cache misses are relatively rare.
The cost of the cache miss is the same whether the increment
instruction is locked or not.