[Python-ideas] Python 3000 TIOBE -3%

Massimo Di Pierro massimo.dipierro at gmail.com
Fri Feb 10 17:05:44 CET 2012


On Feb 10, 2012, at 9:28 AM, Stefan Behnel wrote:

> Massimo Di Pierro, 10.02.2012 15:52:
>> 
>> Forking is a solution only for simple toy cases and in trivially
>> parallel cases. People use processes to parallelize web servers and task
>> queues where the tasks do not need to talk to each other (except with
>> the parent/master process). If you have 100 cores even with a small 50MB
>> program, in order to parallelize it you go from 50MB to 5GB. Memory and
>> memory access become a major bottleneck.
> 
> I think you should read up a bit on the various mechanisms for parallel
> processing.

yes I should ;-) 
(Perhaps I should take this course http://www.cdm.depaul.edu/academics/pages/courseinfo.aspx?CrseId=001533)

The fact is, in my experience, many modern applications where performance is important try to take advantage of every form of parallelization available. I worked for many years in lattice QCD and have written code that runs on various parallel machines. We used processes to parallelize across nodes, threads to parallelize within a single node, and vector instructions written in assembly to parallelize within each core. This used to be the state-of-the-art way of programming, but now I see these patterns trickling down to many consumer applications, for example games. People do not like threads because of the need for locking but, as you increase the number of cores, the bottleneck becomes memory access. If you use processes, you don't just bloat RAM usage and kill cache performance; you also need message passing for interprocess communication. Message passing requires copying data, which is expensive (remember, RAM is the bottleneck). Even worse, sometimes message passing cannot be done in RAM alone and you need disk-buffered messages for interprocess communication.
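
To make the copying cost concrete, here is a minimal sketch in Python. It does not start real processes: it assumes that interprocess messages are serialized with pickle (which is what the standard multiprocessing pipes and queues do) and simulates only that step. The function names and the 10 MB payload are made up for illustration.

```python
import pickle
import threading


def send_between_processes(obj):
    """Simulate message passing: serialize the object, then rebuild it.

    Returns the rebuilt object and the number of bytes that would
    travel between the two processes.
    """
    wire_bytes = pickle.dumps(obj)
    return pickle.loads(wire_bytes), len(wire_bytes)


def share_between_threads(obj):
    """Hand the object to a worker thread and return what the worker saw."""
    seen = []
    worker = threading.Thread(target=seen.append, args=(obj,))
    worker.start()
    worker.join()
    return seen[0]


payload = bytearray(10_000_000)  # hypothetical 10 MB buffer

received, wire_size = send_between_processes(payload)
shared = share_between_threads(payload)

print("process message is a copy:", received is not payload)
print("thread sees the same object:", shared is payload)
print("bytes moved for the message:", wire_size)
```

The receiving side of a message ends up with equal contents in a second buffer, so the memory traffic grows with the size of every message. The thread only receives a reference to the buffer that already exists.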

Some programs parallelize fine with processes alone. Those I have experience with require both processes and threads. Again, this does not mean using threading APIs. The VM should use threads to parallelize tasks. How this is exposed to the developer is a different matter.


Massimo

