
On Fri, Mar 05, 2010 at 09:53:02AM +0100, Francesc Alted wrote:
> Yeah, a 10% improvement from using multiple cores is an expected figure for memory-bound problems. This is something people must know: if their computations are memory bound (and this is much more common than one may initially think), then they should not expect significant speed-ups from their parallel codes.

Hey Francesc,

Any chance this can be different for NUMA (non-uniform memory access) architectures? AMD multicores used to be NUMA, back when I was still following these problems.

FWIW, I observe very good speedups on my problems (pretty much linear in the number of CPUs). They are data-parallel problems on fairly large data (~100 MB apiece, which doesn't fit in cache), with no synchronisation at all between the workers. The CPUs are Intel Xeons.

Gael
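
For anyone who wants to check this kind of scaling on their own machine, here is a minimal sketch of a measurement, not the code used for the numbers above. It assumes a stand-in workload: each worker fills a private ~100 MB buffer and scans it once, with no communication between workers. The names `stream_chunk` and `measure` are made up for this illustration, and the byte scan is only a placeholder for a real kernel, so the speedups it prints say nothing definitive about any particular NumPy computation.

```python
import time
from multiprocessing import Pool


def stream_chunk(n_bytes):
    """Fill a private buffer, then scan it once; return the bytes scanned.

    Hypothetical stand-in for a data-parallel kernel: the buffer is owned
    by the worker, so there is no synchronisation with other workers.
    """
    buf = bytearray(b"\x01") * n_bytes   # write every byte so the pages are real
    return n_bytes - buf.count(0)        # one pass over the data; no zeros present


def measure(n_workers, n_chunks, n_bytes):
    """Time n_chunks independent calls spread over n_workers processes."""
    start = time.perf_counter()
    with Pool(n_workers) as pool:
        results = pool.map(stream_chunk, [n_bytes] * n_chunks)
    elapsed = time.perf_counter() - start
    return elapsed, results


if __name__ == "__main__":
    n_chunks = 8
    n_bytes = 100 * 1024 * 1024          # ~100 MB apiece, larger than cache
    t1, _ = measure(1, n_chunks, n_bytes)
    for n in (2, 4):
        tn, _ = measure(n, n_chunks, n_bytes)
        print(f"{n} workers: speedup {t1 / tn:.2f}x")
```

The timings include process start-up, so very short runs will understate the speedup. To compare against a real workload, replace the body of `stream_chunk` with the actual kernel and keep the rest unchanged.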