CPython optimization

Olof Bjarnason olof.bjarnason at gmail.com
Fri Oct 23 09:46:26 CEST 2009


2009/10/23 Olof Bjarnason <olof.bjarnason at gmail.com>

>
>
> 2009/10/22 MRAB <python at mrabarnett.plus.com>
>
> Olof Bjarnason wrote:
>> [snip]
>>
>>  A short question after having read through most of this thread, on the
>>> same subject (time-optimizing CPython):
>>>
>>> http://mail.python.org/pipermail/python-list/2007-September/098964.html
>>>
>>> We are seeing multi-core processors more and more these
>>> days. But all the cores are still connected to the same main memory, right?
>>>
>>> To me that means, even though some algorithm can be split up into several
>>> threads that run on different cores of the processor, that any algorithm
>>> will be memory-speed limited. And memory access is a quite common operation
>>> for most algorithms.
>>>
>>> Then one could ask oneself: what is the point of multiple cores, if
>>> memory bandwidth is the bottleneck? Specifically, what makes one expect any
>>> speed gain from parallelizing a sequential algorithm into four threads, say,
>>> when the memory shuffling is the same speed in both scenarios? (Assuming
>>> memory access is much slower than ADDs, JMPs and such instructions - a quite
>>> safe assumption I presume)
>>>
>>> [ If every core had its own primary memory, the situation would be
>>> different. It would be more like the situation in a distributed/internet
>>> based system, spread over several computers. One could view each core as a
>>> separate computer actually ]
>>>
>>>  Don't forget about the on-chip cache! :-)
>>
>
> Sorry for continuing slightly OT:
>
> Yes, that makes matters even more interesting.
>
> Caches for single-CPU boards speed up memory access quite dramatically. Are
> caches for multi-core boards shared among the cores? Or does each core have a
> separate cache? I can only imagine how complicated the read/write logic of
> these tiny electronic devices must be, in any case.
>
> Of course caches make memory-access operations much faster, but I'm
> guessing register instructions are still orders of magnitude faster than
> (cached) memory access. (Or else registers would not really be needed - you
> could just view the whole primary memory as an array of registers!)
>
> So I think my first question is still interesting: What is the point of
> multiple cores, if memory is the bottleneck?
> (it helps to think of algorithms such as line-drawing or ray-tracing, which
> are easy to parallelize, yet I believe are still faster using a single core
> instead of multiple because of the read/write-to-memory bottleneck. It does
> help to bring more workers to the mine if
>

um, typo: that should read "It does NOT help to .."

only one is allowed access at a time, or more likely, several are allowed
> yet it gets so crowded that queues/waiting is inevitable)
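To make the workers-at-the-mine point concrete, here is a rough sketch of my own (all sizes are made up) comparing a CPU-bound loop with a memory-bound pass over large lists, split across processes with the stdlib multiprocessing module. On a bandwidth-limited machine the memory-bound case should scale worse as workers are added, though pickling overhead and other noise make this only a crude illustration:

```python
# Rough sketch: compare process-level scaling of a CPU-bound loop versus a
# memory-bound pass over large lists. All sizes here are made up.
import multiprocessing as mp
import time


def cpu_bound(n):
    # Tight arithmetic loop: mostly ALU/register work, little memory traffic.
    total = 0
    for i in range(n):
        total += i * i
    return total


def memory_bound(chunk):
    # Streams a large sequence through memory: bandwidth-limited work.
    return sum(chunk)


def timed_map(workers, func, jobs):
    # Run the jobs across `workers` processes and return the elapsed time.
    start = time.perf_counter()
    with mp.Pool(workers) as pool:
        pool.map(func, jobs)
    return time.perf_counter() - start


if __name__ == "__main__":
    cpu_jobs = [200000] * 4
    mem_jobs = [list(range(200000)) for _ in range(4)]
    for w in (1, 4):
        print("workers=%d  cpu-bound=%.3fs  memory-bound=%.3fs"
              % (w, timed_map(w, cpu_bound, cpu_jobs),
                 timed_map(w, memory_bound, mem_jobs)))
```

(Processes rather than threads, since CPython's GIL keeps pure-Python threads from running bytecode in parallel anyway.)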
>
>

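And for the bandwidth question itself, a crude way to put a number on it from within Python (again my own sketch; interpreter overhead makes this only a loose lower bound on what the hardware can actually do):

```python
# Crude sketch: estimate effective memory throughput by copying a big buffer.
# Interpreter overhead makes this a loose lower bound on hardware bandwidth.
import time


def estimate_bandwidth(n_bytes=64 * 1024 * 1024):
    src = bytearray(n_bytes)
    start = time.perf_counter()
    dst = bytes(src)  # one read pass plus one write pass over the buffer
    elapsed = time.perf_counter() - start
    assert len(dst) == n_bytes
    return (2 * n_bytes) / elapsed / 1e9  # GB/s, counting read + write


if __name__ == "__main__":
    print("approx. memory throughput: %.1f GB/s" % estimate_bandwidth())
```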

-- 
twitter.com/olofb
olofb.wordpress.com
olofb.wordpress.com/tag/english

