Cpython optimization

Fri Oct 23 05:22:40 EDT 2009

> Olof Bjarnason wrote:
> [snip]
>> A short question after having read through most of this thread, on the
>> same subject (time-optimizing CPython):
>>
>> http://mail.python.org/pipermail/python-list/2007-September/098964.html
>>
>> We are experiencing multi-core processor kernels more and more these
>> days. But they are all still connected to the main memory, right?
>>
>> To me that means, even though some algorithm can be split up into
>> several threads that run on different cores of the processor, that any
>> algorithm will be memory-speed limited. And memory access is a quite
>> common operation for most algorithms.
>>
>> Then one could ask oneself: what is the point of multiple cores, if
>> memory bandwidth is the bottleneck? Specifically, what makes one
>> expect any speed gain from parallelizing a sequential algorithm into
>> four threads, say, when the memory shuffling is the same speed in both
>> scenarios? (Assuming memory access is much slower than ADDs, JMPs and
>> such instructions - a quite safe assumption I presume)

Modern (multi-core) processors have several levels of caches that interact
with the other cores in different ways.

You should read up on NUMA.

http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access

Stefan