2011/5/23 &quot;Martin v. Löwis&quot; <span dir="ltr">&lt;<a href="mailto:martin@v.loewis.de">martin@v.loewis.de</a>&gt;</span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">&gt; I&#39;m not a compiler/profiling expert so the main question is if such<br>

&gt; design can work, and maybe someone was thinking about something<br>

&gt; similar?<br>

<br>

</div>My expectation is that your approach would likely make the issues<br>

worse in a multi-CPU setting. If you put multiple reference counters<br>

into a contiguous block of memory, unrelated reference counters will<br>

live in the same cache line. Consequentially, changing one reference<br>

counter on one CPU will invalidate the cached reference counters of<br>

that cache line on other CPU, making your problem a) actually worse.<br>

<br>

Regards,<br>

<font color="#888888">Martin<br>

</font><div><div class="h5"></div></div></blockquote><div><br></div><div>I don&#39;t think that moving ob_refcnt to a proper memory pool will solve the problem of cache pollution anyway.</div><div><br></div><div>ob_refcnt is obviously the most heavily accessed field in PyObject, but it&#39;s not the only one. We also have ob_type, which is needed to model each object&#39;s (instance&#39;s) &quot;behavior&quot; and is heavily accessed too, so the object&#39;s cache line will be loaded anyway whenever the object is used.</div>

<div><br></div><div>Also, only a few simple objects have just ob_refcnt and ob_type. Most of them have other fields too, and accessing those means a cache line load.</div><div><br></div><div>Regards,</div><div>Cesare</div>

<div><br></div><div><div>P.S. Memory allocation granularity can sometimes help, leaving some data (ob_refcnt and/or ob_type) on one cache line and the rest of the object on the next one.</div></div>
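<div><br></div><div>To make the cache line argument concrete, here is a rough sketch using ctypes. DemoObject is a hypothetical stand-in for an object header plus two extra fields (it is not the real PyObject layout), and the 64-byte cache line is an assumption that is common but not universal. The helper names cache_line_of and lines_touched are mine, for illustration only.</div>

```python
import ctypes

CACHE_LINE = 64  # assumption: 64-byte cache lines, common but not universal


class DemoObject(ctypes.Structure):
    """Hypothetical stand-in for an object header plus payload (not the real PyObject)."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),     # reference count
        ("ob_type", ctypes.c_void_p),        # pointer to the type object
        ("payload", ctypes.c_ssize_t * 2),   # other per-object fields
    ]


def cache_line_of(address):
    """Index of the cache line that contains the given byte address."""
    return address // CACHE_LINE


def lines_touched(base):
    """Map each field name to the (first, last) cache line it occupies
    when an object is placed at byte address `base`."""
    result = {}
    for name, _ctype in DemoObject._fields_:
        field = getattr(DemoObject, name)  # field descriptor with .offset and .size
        first = cache_line_of(base + field.offset)
        last = cache_line_of(base + field.offset + field.size - 1)
        result[name] = (first, last)
    return result


if __name__ == "__main__":
    header_size = DemoObject.payload.offset  # bytes taken by ob_refcnt + ob_type
    # Case 1: object starts on a cache line boundary.
    # Case 2: object starts so that the header ends exactly at a boundary.
    for base in (0, CACHE_LINE - header_size):
        print(base, lines_touched(base))
```

<div>With the object at address 0, every field sits on the same line, so using the object loads that line whether or not ob_refcnt lives there. The second address is picked by hand so that the header ends exactly at a line boundary, which is the case from the P.S.: ob_refcnt and ob_type end up on one line and the remaining fields on the next.</div>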