Writing a known-size 1D ndarray serially as it's calced
I want to calc multiple ndarrays at once and lack the memory, so I want to write in chunks (here sized to GPU batch capacity). It seems there should be an interface to write the header, then write a number of elements cyclically, then add any closing footer and close the file.

Is it as simple as lib.format.write_array_header_2_0(fp, d), then writing multiple shape (N,) arrays of float via fp.write(item.tobytes())?
On Tue, Aug 23, 2022 at 8:47 PM <bross_phobrain@sonic.net> wrote:
I want to calc multiple ndarrays at once and lack memory, so want to write in chunks (here sized to GPU batch capacity). It seems there should be an interface to write the header, then write a number of elements cyclically, then add any closing rubric and close the file.
Is it as simple as lib.format.write_array_header_2_0(fp, d) then writing multiple shape(N,) arrays of float by fp.write(item.tobytes())?
`item.tofile(fp)` is more efficient, but yes, that's the basic scheme. There is no footer after the data.

The alternative is to use `np.lib.format.open_memmap(filename, mode='w+', dtype=dtype, shape=shape)`, then assign slices sequentially to the returned memory-mapped array. A memory-mapped array is usually going to be friendlier to whatever memory limits you are running into than a nominally "in-memory" array.

-- Robert Kern
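For concreteness, a minimal sketch of that scheme (the total length, chunk size, and filename below are placeholders; each chunk stands in for one GPU batch):

import numpy as np

N = 1_000_000              # total length, known up front (placeholder)
chunk = 10_000             # sized to GPU batch capacity (placeholder)
dtype = np.dtype(np.float32)

with open("out.npy", "wb") as fp:
    header = {"descr": np.lib.format.dtype_to_descr(dtype),
              "fortran_order": False,
              "shape": (N,)}
    np.lib.format.write_array_header_2_0(fp, header)
    for start in range(0, N, chunk):
        item = np.zeros(min(chunk, N - start), dtype=dtype)  # stand-in for one computed batch
        item.tofile(fp)    # raw data straight after the header; no footer needed

arr = np.load("out.npy")   # the result reads back like any other .npy file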
Hi all,

I've made the Pip/Conda module npy-append-array for exactly this purpose, see https://github.com/xor2k/npy-append-array. It works with one-dimensional arrays, too, of course. The key challenge is to properly initialize and update the header as the array grows, which my module takes care of.

I'd like to integrate this functionality directly into NumPy, see PR https://github.com/numpy/numpy/pull/20321/, but I have been busy and have not received any feedback recently. A more direct integration into NumPy would allow skipping or simplifying the header update, e.g. by introducing a new file format version. This could turn .npy into a sort of binary CSV equivalent, where the size of the array is determined by the file size.

Best, Michael
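If I remember the npy-append-array README correctly, usage looks roughly like this (an untested sketch; the filename and chunk size are placeholders):

import numpy as np
from npy_append_array import NpyAppendArray

filename = "out.npy"   # placeholder

with NpyAppendArray(filename) as npaa:
    for _ in range(10):
        batch = np.random.rand(10_000).astype(np.float32)  # one chunk at a time
        npaa.append(batch)   # the module rewrites the header as the file grows

data = np.load(filename, mmap_mode="r")   # reads back as a normal .npy file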
On 24. Aug 2022, at 03:04, Robert Kern <robert.kern@gmail.com> wrote:

On Tue, Aug 23, 2022 at 8:47 PM <bross_phobrain@sonic.net> wrote:
I want to calc multiple ndarrays at once and lack memory, so want to write in chunks (here sized to GPU batch capacity). It seems there should be an interface to write the header, then write a number of elements cyclically, then add any closing rubric and close the file.
Is it as simple as lib.format.write_array_header_2_0(fp, d) then writing multiple shape(N,) arrays of float by fp.write(item.tobytes())?
`item.tofile(fp)` is more efficient, but yes, that's the basic scheme. There is no footer after the data.
The alternative is to use `np.lib.format.open_memmap(filename, mode='w+', dtype=dtype, shape=shape)`, then assign slices sequentially to the returned memory-mapped array. A memory-mapped array is usually going to be friendlier to whatever memory limits you are running into than a nominally "in-memory" array.
-- Robert Kern
Thanks, np.lib.format.open_memmap() works great! With prediction procs using minimal sys memory, I can get twice as many on GPU, with fewer optimization warnings.

Why even have the number of records in the header? Shouldn't record size plus system-reported/growable file size be enough?

I'd love to have a shared-mem analog for smaller-scale data; now I load data and fork to emulate that effect.

My file sizes will exceed memory, so I'm hoping to get the most out of memmap. Will this in-loop assignment to predsum work to avoid loading all to memory?

predsum = np.lib.format.open_memmap(outfile, mode='w+', shape=(ids_sq,), dtype=np.float32)
for i in range(len(IN_FILES)):
    pred = numpy.lib.format.open_memmap(IN_FILES[i])
    predsum = np.add(predsum, pred) ################# <-
    del pred
del predsum

-- Phobrain.com

On 2022-08-23 18:02, Robert Kern wrote:
On Tue, Aug 23, 2022 at 8:47 PM <bross_phobrain@sonic.net> wrote:
I want to calc multiple ndarrays at once and lack memory, so want to write in chunks (here sized to GPU batch capacity). It seems there should be an interface to write the header, then write a number of elements cyclically, then add any closing rubric and close the file.
Is it as simple as lib.format.write_array_header_2_0(fp, d) then writing multiple shape(N,) arrays of float by fp.write(item.tobytes())?
`item.tofile(fp)` is more efficient, but yes, that's the basic scheme. There is no footer after the data.
The alternative is to use `np.lib.format.open_memmap(filename, mode='w+', dtype=dtype, shape=shape)`, then assign slices sequentially to the returned memory-mapped array. A memory-mapped array is usually going to be friendlier to whatever memory limits you are running into than a nominally "in-memory" array.

-- Robert Kern
On Thu, Aug 25, 2022 at 4:27 AM Bill Ross <bross_phobrain@sonic.net> wrote:
Thanks, np.lib.format.open_memmap() works great! With prediction procs using minimal sys memory, I can get twice as many on GPU, with fewer optimization warnings.
Why even have the number of records in the header? Shouldn't record size plus system-reported/growable file size be enough?
Only in the happy case where there is no corruption. Implicitness is not a virtue in the use cases that the format was designed for. There is an additional use case where the length is unknown a priori where implicitness would help, but the format was not designed for that case (and I'm not sure I want to add that use case).
I'd love to have a shared-mem analog for smaller-scale data; now I load data and fork to emulate that effect.
There are a number of ways to do that, including using memmap on files on a memory-backed filesystem like /dev/shm/ on Linux. See this article for several more options: https://luis-sena.medium.com/sharing-big-numpy-arrays-across-python-processe...
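For example, a minimal sketch of the /dev/shm route, assuming Linux with a tmpfs mounted there (the path and shape are placeholders):

import numpy as np

# Writer process: create the array on the memory-backed filesystem.
shared = np.lib.format.open_memmap("/dev/shm/preds.npy", mode="w+",
                                   dtype=np.float32, shape=(1_000_000,))
shared[:] = 0.0
shared.flush()

# Reader processes: map the same file read-only.
# The OS shares the pages between processes rather than copying them.
view = np.lib.format.open_memmap("/dev/shm/preds.npy", mode="r")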
My file sizes will exceed memory, so I'm hoping to get the most out of memmap. Will this in-loop assignment to predsum work to avoid loading all to memory?
predsum = np.lib.format.open_memmap(outfile, mode='w+', shape=(ids_sq,), dtype=np.float32)
for i in range(len(IN_FILES)):
    pred = numpy.lib.format.open_memmap(IN_FILES[i])
    predsum = np.add(predsum, pred) ################# <-
This will replace the `predsum` array with a new in-memory array the first time through this loop. Use `out=predsum` to make sure that the output goes into the memory-mapped array:

np.add(predsum, pred, out=predsum)

Or the usual augmented assignment:

predsum += pred
    del pred
del predsum
The precise memory behavior will depend on your OS's virtual memory configuration. But in general, `np.add()` will go through the arrays in order, causing the virtual memory system to page in memory pages as they are accessed for reading or writing, and page out the old ones to make room for the new pages. Linux, in my experience, isn't always the best at managing that backlog of old pages, especially if you have multiple processes doing similar kinds of things (in the past, I have seen *each* of those processes trying to use *all* of the main memory for their backlog of old pages), but there are configuration tweaks that you can make.

-- Robert Kern
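Putting that correction together with the earlier loop, a sketch of the fixed version (outfile, IN_FILES, and ids_sq are the names from the original post; the values assigned to them here are placeholders):

import numpy as np

outfile = "predsum.npy"                   # placeholder
IN_FILES = ["pred_0.npy", "pred_1.npy"]   # placeholder list of input .npy files
ids_sq = 1_000_000                        # placeholder length

predsum = np.lib.format.open_memmap(outfile, mode='w+',
                                    shape=(ids_sq,), dtype=np.float32)
for in_file in IN_FILES:
    pred = np.lib.format.open_memmap(in_file, mode='r')  # read-only mapping
    predsum += pred    # in place; equivalent to np.add(predsum, pred, out=predsum)
    del pred           # drop the mapping so its pages can be reclaimed
predsum.flush()        # write everything back to outfile
del predsum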
Participants (4):

- Bill Ross
- bross_phobrain@sonic.net
- Michael Siebert
- Robert Kern