[Numpy-discussion] Adding `offset` argument to np.lib.format.open_memmap and np.load

Robert Kern robert.kern at gmail.com
Mon Feb 28 19:15:42 EST 2011


On Thu, Feb 24, 2011 at 09:49, Jon Olav Vik <jonovik at gmail.com> wrote:

> My use case was to preallocate a big record array on disk, then start many
> processes writing to their separate, memory-mapped segments of the file. The
> end result was one big array on disk, with the correct shape and data type
> information. Using a record array makes the data structure more
> self-documenting. Using open_memmap with mode="w+" is the fastest way I've found to
> preallocate an array on disk; it does not create the huge array in memory.
> Letting multiple processes memory-map and read/write to non-overlapping
> portions without interfering with each other allows for fast, simple
> parallel I/O.

You can have each of those processes memory-map the whole file and
just operate on their own slices. Your operating system's virtual
memory manager should handle all of the details for you.
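
A minimal sketch of that approach, under stated assumptions: the record
dtype, the array size, the worker count, the temporary file path, and the
helper name `work` are all hypothetical and chosen only for illustration.
The workers are called one after another here to keep the example
self-contained; in real use each call would run in its own process.

```python
import os
import tempfile

import numpy as np
from numpy.lib.format import open_memmap

# Hypothetical record layout and sizes (assumptions, not from the thread).
dtype = np.dtype([("id", np.int64), ("value", np.float64)])
n_records = 1000
n_workers = 4
path = os.path.join(tempfile.mkdtemp(), "big.npy")

# Preallocate the array on disk without building it in memory.
arr = open_memmap(path, mode="w+", dtype=dtype, shape=(n_records,))
del arr  # drop the mapping so the header and data reach the file


def work(path, start, stop):
    """Map the whole file, but write only to records start..stop-1."""
    arr = open_memmap(path, mode="r+")  # shape and dtype come from the header
    chunk = arr[start:stop]             # a view into the mapping, not a copy
    chunk["id"] = np.arange(start, stop)
    chunk["value"] = 0.5 * np.arange(start, stop)
    arr.flush()
    del arr


# Non-overlapping slice boundaries, one slice per worker.
bounds = np.linspace(0, n_records, n_workers + 1).astype(int)
for start, stop in zip(bounds[:-1], bounds[1:]):
    work(path, int(start), int(stop))  # stand-in for a separate process

# The result is one ordinary .npy file with the right shape and dtype.
result = np.load(path, mmap_mode="r")
```

Because every worker maps the same file and touches only its own slice,
no `offset` argument is needed; only the pages a worker actually reads
or writes should end up resident in memory.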

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


