[Numpy-discussion] OT: performance in C extension; OpenMP, or SSE ?

Francesc Alted faltet at pytables.org
Sun Feb 20 07:09:37 EST 2011

On Sunday 20 February 2011 00:01:59 Sturla Molden wrote:
> pthreads will give you better control than OpenMP, but are messy and
> painful to work with.
> With MPI you have separate processes, so everything is completely
> isolated. It's more difficult to program and debug than OpenMP code,
> but will usually perform better.

To be more specific, MPI will perform better if you don't need to share
the memory of your working set among all your processes.  If you do need
to share it, threads (pthreads, OpenMP) let you access every part of the
working set in memory much more easily.  In fact, all the threads of a
process can access the complete working set transparently and
efficiently.  This is precisely why they are trickier to program: you
have to explicitly avoid simultaneous writes to the same memory area.
Threads are also much cheaper to create than processes.
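
As a rough illustration of the shared-working-set point, here is a
minimal sketch using Python's standard `threading` module.  The function
name `parallel_square_sum` and its parameters are made up for this
example, and it only shows the programming model: every thread sees the
same `data`, each one writes to its own disjoint slice, and the single
shared accumulator is protected by a lock.  Under CPython's GIL, pure
Python code like this will not actually run faster in parallel; a C
extension using pthreads or OpenMP would follow the same pattern with
real concurrency.

```python
import threading


def parallel_square_sum(data, n_threads=4):
    """Square each element of `data` in place and return the sum of squares.

    All threads share `data` directly (no copying between workers).
    Hypothetical helper, written only to illustrate the sharing model.
    """
    total = 0
    lock = threading.Lock()
    # Ceiling division so that every element falls into exactly one chunk.
    chunk = (len(data) + n_threads - 1) // n_threads

    def worker(start, stop):
        nonlocal total
        partial = 0
        # Each thread touches only data[start:stop], so the in-place
        # writes of different threads never overlap.
        for i in range(start, stop):
            data[i] = data[i] * data[i]
            partial += data[i]
        # `total` is written by every thread, so the update is guarded.
        with lock:
            total += partial

    threads = []
    for t in range(n_threads):
        start = t * chunk
        stop = min(start + chunk, len(data))
        th = threading.Thread(target=worker, args=(start, stop))
        threads.append(th)
        th.start()
    for th in threads:
        th.join()
    return total


if __name__ == "__main__":
    values = list(range(10))
    print(parallel_square_sum(values))
    print(values)
```

With MPI, by contrast, each process would hold its own copy (or its own
part) of `data`, and the partial sums would have to be exchanged
explicitly through messages.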

Generally speaking, if your problem is large, CPU intensive and not very
memory bound, MPI usually leads to better results.  Otherwise threads
tend to do a better job.  The thing is that memory is increasingly
becoming a bottleneck nowadays, so threads are here to stay for a long,
long time (which is certainly unfortunate for programmers, especially
for pthreads ones :).

Francesc Alted
