Accessing a numpy array in a mmap fashion
Hello all,

I'm wondering if there is a way to use a numpy array that uses disk as a memory store rather than RAM. I'm looking for something like mmap but which can be used like a numpy array.

The general idea is this. I'm simulating a system which produces a large dataset over a few hours of processing time. Rather than store the numpy array in memory during processing, I'd like to write the data directly to disk but still be able to treat the array as a numpy array. Is this possible? Any ideas?

Thanks,
Brian

--
Brian Donovan
Research Assistant
Microwave Remote Sensing Lab
UMass Amherst
On 30/08/2007, Brian Donovan <donovan@mirsl.ecs.umass.edu> wrote:
You want numpy.memmap: http://mail.python.org/pipermail/python-list/2007-May/443036.html

This will do exactly what you want (though you may have problems with arrays bigger than a few gigabytes, particularly on 32-bit systems) and there may be a few rough edges. You will probably need to create the file first.

Keep in mind that if the array is actually temporary, the virtual memory system will push unused parts out to disk as memory fills up, so there's no need to use memmap explicitly. If you want the array permanently on disk, though, memmap is probably the most convenient way to do it - though if your access patterns are not local it may involve a lot of thrashing. Sequential disk writes have the advantage (?) of forcing you to write code that accesses disks in a local fashion.

Anne
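For later readers, here is a minimal sketch of the numpy.memmap approach described above. It is not taken from the thread: the file name, array shape, and the `simulate_to_disk` helper are made up for illustration, and the sine values stand in for real simulation output. It also assumes a reasonably current NumPy, where `mode="w+"` creates the backing file at the requested size, so the file does not have to be created by hand first.

```python
import os
import tempfile

import numpy as np


def simulate_to_disk(path, n_steps, n_samples):
    """Write one row per simulation step into a disk-backed array (hypothetical helper)."""
    # mode="w+" creates (or overwrites) the file and sizes it to fit the shape.
    out = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_steps, n_samples))
    x = np.linspace(0.0, 1.0, n_samples)
    for step in range(n_steps):
        # Placeholder for real simulation output. Writing whole rows in order
        # keeps the access pattern local, which avoids thrashing.
        out[step, :] = np.sin(x * (step + 1))
    out.flush()  # push any pending changes out to the file
    return out


# Hypothetical file location and sizes, chosen only for the example.
path = os.path.join(tempfile.mkdtemp(), "simulation.dat")
data = simulate_to_disk(path, n_steps=100, n_samples=1000)
del data  # drop the writable mapping

# Reopen read-only. The raw file stores no metadata, so dtype and shape
# must be supplied again and must match what was written.
readback = np.memmap(path, dtype=np.float64, mode="r", shape=(100, 1000))
print(readback.shape, readback[3, :5])
```

The mapped object supports the usual slicing and arithmetic of an ndarray, but only the pages actually touched need to be resident in memory. Because the file is a bare dump of the data, anything that reads it later has to know the dtype and shape from somewhere else.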
participants (3)
- Anne Archibald
- Brian Donovan
- Ryan May