
Here is a strange thing I am getting with multiprocessing and a memory-mapped array. The script below generates the following error message 30 times (once for every slice access), although I do eventually get the correct answer:

Exception AttributeError: AttributeError("'NoneType' object has no attribute 'tell'",) in <bound method memmap.__del__ of memmap(2949995000.0)> ignored

------------------------------------------------------
import numpy as N
import multiprocessing as MP

def average(cube):
    return [plane.mean() for plane in cube]

N.arange(30*100*100, dtype=N.int32).tofile(open('30x100x100_int32.dat','w'))

data = N.memmap('30x100x100_int32.dat', dtype=N.int32, shape=(30,100,100))

pool = MP.Pool(processes=1)
job = pool.apply_async(average, [data,])
print job.get()
------------------------------------------------------

I use Python 2.6.4 and numpy 1.4.0 on 64-bit Linux (amd64).

  Nadav
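The error appears to come from memmap objects crossing the process boundary: the pickled copies are reconstructed without their underlying file handle, so their __del__ method fails during garbage collection (note that the object in the message is a 0-d memmap scalar, produced by the reduction inside average()). One possible workaround, as a minimal sketch and not from the thread itself (it reuses the data file created by the script above; average_file is an illustrative name), is to re-open the memmap inside the worker and return plain floats, so that no memmap object is ever pickled:

------------------------------------------------------
import numpy as N
import multiprocessing as MP

def average_file(args):
    # Re-open the memmap inside the worker instead of pickling
    # the memmap object itself.
    filename, dtype, shape = args
    cube = N.memmap(filename, dtype=dtype, mode='r', shape=shape)
    # float() casts avoid sending memmap scalars back to the parent.
    return [float(plane.mean()) for plane in cube]

if __name__ == '__main__':
    pool = MP.Pool(processes=1)
    job = pool.apply_async(average_file,
                           [('30x100x100_int32.dat', N.int32, (30, 100, 100))])
    print job.get()
------------------------------------------------------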
-----Original Message-----
From: numpy-discussion-bounces@scipy.org on behalf of Gael Varoquaux
Sent: Thu 11-Mar-10 11:36
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] multiprocessing shared arrays and numpy

On Thu, Mar 11, 2010 at 10:04:36AM +0100, Francesc Alted wrote:
> As far as I know, memmap files (or better, the underlying OS) *use*
> all available RAM for loading data until RAM is exhausted, and then
> start to use SWAP, so the "memory pressure" is still there. But I may
> be wrong...
I believe that your above assertion is 'half' right. First, I think that it is not SWAP that the memmapped file uses, but the original disk space, so you avoid running out of SWAP. Second, if you open the same data several times without memmapping, I believe it will be duplicated in memory. When you memmap it, on the other hand, it is not duplicated, so if you are running several processing jobs on the same data, you save memory. I am very much in this situation.

Gaël
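To illustrate that last point, here is a rough sketch (reusing Nadav's data file from earlier in the thread; mean_of_planes and the chunking are illustrative, not from the discussion). Several workers each map the same read-only file and process disjoint slices:

------------------------------------------------------
import numpy as N
import multiprocessing as MP

def mean_of_planes(args):
    # Every worker maps the same file read-only; the OS backs all of
    # these mappings with the same page-cache pages, so the data is
    # not copied once per process.
    filename, start, stop = args
    cube = N.memmap(filename, dtype=N.int32, mode='r', shape=(30, 100, 100))
    # Plain floats again, to keep memmap scalars out of the pickles.
    return [float(plane.mean()) for plane in cube[start:stop]]

if __name__ == '__main__':
    pool = MP.Pool(processes=3)
    chunks = [('30x100x100_int32.dat', i, i + 10) for i in (0, 10, 20)]
    results = pool.map(mean_of_planes, chunks)
    print [m for chunk in results for m in chunk]
------------------------------------------------------

With mode='r' the mapped pages stay clean, so the kernel can drop them at any time and re-read them from the original file, which is why such data ends up backed by the file itself rather than by SWAP.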