
I did some optimization, and the results are very instructive, although not surprising:

As I wrote before, I processed stereoscopic movie recordings by making each one a memory-mapped file and processing it in several steps. This produced extra gigabytes of transient data. Running as one process took 45 seconds, and as two parallel processes ~40 seconds.

I then rewrote the application to process the recording frame by frame. The code became shorter, and the new timings are: one process, 16 seconds; two processes, 9 seconds.

What I learned:
* Design for multiprocessing from the start, not as an afterthought.
* Shared memory works, but at the expense of code elegance (much like common blocks in Fortran).
* Memory-mapped files can be used much like shared memory.

The strange thing is that I got an ignored AttributeError on every frame access to the memory-mapped file from the child process.

Nadav

-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Brian Granger
Sent: Fri 05-Mar-10 21:29
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy

Francesc,

Yeah, a 10% improvement from using multiple cores is an expected figure for memory-bound problems. This is something people must know: if their computations are memory bound (and this is much more common than one may initially think), then they should not expect significant speed-ups from their parallel code.
+1

Thanks for emphasizing this. This is definitely a big issue with multicore.

Cheers,
Brian
Thanks for sharing your experience anyway, Francesc
On Thursday 04 March 2010 18:54:09 Nadav Horesh wrote:
I cannot give a reliable answer yet, since I have some more improvements to make. The application is an analysis of a stereoscopic-movie raw-data recording (both channels are recorded in the same file). I treat the data as a huge memory-mapped file. The idea was to process each channel (left and right) on a different core. Right now the application is I/O bound, since I use classical numpy operations, so each channel (which is handled as one array) is scanned several times. The improvement over a single process is currently 10%, but I hope to achieve 10% more after trivial optimizations.
I used this application as an excuse to dive into multi-processing. I hope that the code I posted here would help someone.
Nadav.
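
A minimal sketch of the setup described in this message (one memory-mapped recording, one channel per process), combined with the frame-by-frame access reported in the later message at the top of the thread. It is not the posted module: the file layout (frames, channel, height, width), the dtype, the tiny demo sizes, and the names channel_means, run, and make_demo_file are all assumptions made for illustration.

```python
import multiprocessing as mp
import os
import tempfile

import numpy as np

# Hypothetical layout: (frame, channel, row, column), channel 0 = left, 1 = right.
N_FRAMES, HEIGHT, WIDTH = 8, 4, 6
SHAPE = (N_FRAMES, 2, HEIGHT, WIDTH)
DTYPE = np.uint16


def channel_means(args):
    """Map the file inside the worker and reduce one channel frame by frame."""
    path, channel = args
    # Only the path is sent to the worker; each process maps the file itself.
    data = np.memmap(path, dtype=DTYPE, mode="r", shape=SHAPE)
    # Touching one frame at a time avoids whole-channel temporaries.
    return [float(data[i, channel].mean()) for i in range(N_FRAMES)]


def run(path):
    """Process the left and right channels in two worker processes."""
    with mp.Pool(processes=2) as pool:
        left, right = pool.map(channel_means, [(path, 0), (path, 1)])
    return left, right


def make_demo_file():
    """Write a small synthetic recording: left frames are 1, right frames are 2."""
    fd, path = tempfile.mkstemp(suffix=".raw")
    os.close(fd)
    data = np.memmap(path, dtype=DTYPE, mode="w+", shape=SHAPE)
    data[:, 0] = 1
    data[:, 1] = 2
    data.flush()
    del data  # release the mapping before other processes open the file
    return path


if __name__ == "__main__":
    demo_path = make_demo_file()
    try:
        print(run(demo_path))
    finally:
        os.remove(demo_path)
```

Whether two processes actually help depends on whether the per-frame work is compute bound or memory/I/O bound, as discussed in the replies above.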
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Francesc Alted
Sent: Thu 04-Mar-10 15:12
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy
What kind of calculations are you doing with this module? Can you please send some examples and the speed-ups you are getting?
Thanks, Francesc
On Thursday 04 March 2010 14:06:34 Nadav Horesh wrote:
Attached is an extended module that I used for some useful work. Comments:
1. Sturla's module is better designed, but did not work with very large (although sub-GB) arrays.
2. Tested on 64-bit Linux (amd64) + python-2.6.4 + numpy-1.4.0.
Nadav.
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Nadav Horesh
Sent: Thu 04-Mar-10 11:55
To: Discussion of Numerical Python
Subject: RE: [Numpy-discussion] multiprocessing shared arrays and numpy
Maybe the attached file can help. Adapted and tested on amd64 Linux.
Nadav
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Nadav Horesh
Sent: Thu 04-Mar-10 10:54
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy
There is work by Sturla Molden: look for multiprocessing-tutorial.pdf and sharedmem-feb13-2009.zip. The tutorial includes what was dropped from the cookbook page. I am facing the same issue and am going to test it today.
Nadav
On Wed, 2010-03-03 at 15:31 +0100, Jesper Larsen wrote:
Hi people,
I was wondering about the status of using the standard library multiprocessing module with numpy. I found a cookbook example last updated one year ago which states that:
"This page was obsolete as multiprocessing's internals have changed. More information will come shortly; a link to this page will then be added back to the Cookbook."
http://www.scipy.org/Cookbook/multiprocessing
I also found the code that used to be on this page in the cookbook but it does not work any more. So my question is:
Is it possible to use numpy arrays as shared arrays in an application using multiprocessing and how do you do it?
Best regards,
Jesper
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-- Francesc Alted