[Numpy-discussion] Adding `offset` argument to np.lib.format.open_memmap and np.load

Robert Kern robert.kern at gmail.com
Mon Feb 28 19:15:42 EST 2011

On Thu, Feb 24, 2011 at 09:49, Jon Olav Vik <jonovik at gmail.com> wrote:

> My use case was to preallocate a big record array on disk, then start many
> processes writing to their separate, memory-mapped segments of the file. The
> end result was one big array on disk, with the correct shape and data type
> information. Using a record array makes the data structure more
> self-documenting. Using open_memmap with mode="w+" is the fastest way I've
> found to preallocate an array on disk; it does not create the huge array in
> memory. Letting multiple processes memory-map and read/write to
> non-overlapping portions without interfering with each other allows for fast,
> simple parallel I/O.

You can have each of those processes memory-map the whole file and
operate only on its own slice. Your operating system's virtual memory
manager should handle the details for you: only the pages a process
actually touches get loaded into memory.
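
A minimal sketch of that approach, assuming a hypothetical two-field record
layout, a temporary file, and four workers (none of these come from the
original use case). The helper name fill_segment is made up for illustration,
and the workers are simulated with a plain loop; in practice each call would
run in its own process.

```python
import os
import tempfile

import numpy as np
from numpy.lib.format import open_memmap

# Hypothetical record layout and sizes, chosen only for illustration.
dtype = np.dtype([("id", np.int64), ("value", np.float64)])
n_records = 1000
n_workers = 4

path = os.path.join(tempfile.mkdtemp(), "big.npy")

# Preallocate the array on disk. mode="w+" writes the .npy header and
# sizes the file without building the array in memory first.
arr = open_memmap(path, mode="w+", dtype=dtype, shape=(n_records,))
arr.flush()
del arr


def fill_segment(path, start, stop, worker_id):
    """Map the whole file, but write only to records [start, stop)."""
    arr = open_memmap(path, mode="r+")  # dtype and shape come from the header
    segment = arr[start:stop]           # a view onto the mapped file
    segment["id"] = np.arange(start, stop)
    segment["value"] = worker_id
    arr.flush()
    del arr


# Non-overlapping slice boundaries, one slice per worker.
bounds = np.linspace(0, n_records, n_workers + 1).astype(int)
for worker_id in range(n_workers):
    # Stand-in for launching a separate process per slice.
    fill_segment(path, bounds[worker_id], bounds[worker_id + 1], worker_id)

# The result is an ordinary .npy file with the full shape and dtype.
result = np.load(path, mmap_mode="r")
```

Because every worker opens the same file with mode="r+", no offset
bookkeeping is needed; the slice bounds alone decide who writes where.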

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
