Re: [Numpy-discussion] Only integer scalar arrays can be converted to a scalar index

I was hoping that numpy doing this in a vectorised way would only load the surrounding traces into memory for each X and Y as it needs to, rather than the whole cube. I'm using hdf5 for the storage. My example was just a short example without using hdf5.
On 15 Sep 2017 1:16 am, "Elliot Hallmark" <Permafacture@gmail.com> wrote:
Won't any solution not using hdf5 or some other chunked on disk storage method load the whole cube into memory?
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

Nope. Numpy only works on in-memory arrays. You can determine your own chunking strategy using hdf5, or something like dask can figure that strategy out for you. With numpy you might worry about not accidentally making duplicates or intermediate arrays, but that's the extent of memory optimization you can do in numpy itself.
Elliot
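As a rough sketch of what a hand-rolled chunking strategy could look like: the function below reads one slab of traces at a time, plus a one-trace margin, and averages each 3x3 neighbourhood within that slab. The (x, y, sample) layout, the function name `blockwise_window_mean`, and the block and halo sizes are all assumptions for illustration. A plain NumPy array stands in for the on-disk dataset here; any object that supports NumPy-style slicing (an hdf5 dataset, for example) should be usable in its place. For simplicity the output is kept in memory.

```python
import numpy as np


def blockwise_window_mean(data, block=2, halo=1):
    """Average each trace with its neighbours, reading one x-slab at a time.

    `data` must support NumPy-style slicing and expose `.shape` as
    (x, y, sample). Positions without a full neighbourhood stay NaN.
    """
    nx, ny, ns = data.shape
    out = np.full((nx, ny, ns), np.nan)
    for x0 in range(halo, nx - halo, block):
        x1 = min(x0 + block, nx - halo)
        # Read only the rows this block needs, plus the halo on each side.
        chunk = np.asarray(data[x0 - halo:x1 + halo])
        for x in range(x0, x1):
            local = x - (x0 - halo)  # position of x inside the chunk
            for y in range(halo, ny - halo):
                window = chunk[local - halo:local + halo + 1,
                               y - halo:y + halo + 1]
                out[x, y] = window.mean(axis=(0, 1))
    return out


# Small stand-in cube; in practice `data` would be the on-disk dataset.
cube = np.arange(6 * 5 * 3, dtype=float).reshape(6, 5, 3)
result = blockwise_window_mean(cube)
```

Only `block + 2 * halo` rows of traces are read per iteration, so peak input memory is set by the block size rather than by the size of the cube.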

On Fri, Sep 15, 2017 at 2:37 PM, Elliot Hallmark <Permafacture@gmail.com> wrote:
Nope. Numpy only works on in memory arrays. You can determine your own chunking strategy using hdf5, or something like dask can figure that strategy out for you. With numpy you might worry about not accidentally making duplicates or intermediate arrays, but that's the extent of memory optimization you can do in numpy itself.
NumPy does have its own memory map variant on ndarray: https://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html
-- Robert McLeod, Ph.D. robbmcleod@gmail.com robbmcleod@protonmail.com
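A minimal sketch of the `numpy.memmap` route, assuming the cube is stored as a raw binary file of known dtype and shape (the file name, shape, and dtype below are made up for illustration). Slicing the memory-mapped array touches only the pages backing that slice, so the whole cube need not be read up front.

```python
import os
import tempfile

import numpy as np

shape = (4, 5, 6)  # hypothetical (x, y, sample) layout

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "cube.dat")

    # Write a small raw binary file to stand in for the real cube.
    writer = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
    writer[:] = np.arange(np.prod(shape), dtype=np.float32).reshape(shape)
    writer.flush()
    del writer

    # Re-open read-only; data is paged in lazily as slices are accessed.
    cube = np.memmap(path, dtype=np.float32, mode="r", shape=shape)

    # Average the 3x3 block of traces around (x=2, y=2).
    block_mean = np.array(cube[1:4, 1:4].mean(axis=(0, 1)))
    del cube  # release the mapping before the directory is removed
```

Note that `numpy.memmap` works on raw binary files, not on hdf5 containers, so this applies only if the data can be stored that way.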

No thoughts on optimizing memory, but that indexing error probably comes from np.mean producing float results. An astype call should make that work.
-CHB

+1 on the astype(int) call. +1 also on using dask. scikit-image has a couple of functions that might be useful:
- skimage.util.apply_parallel: applies a function to an input array in chunks, with user-selectable chunk size and margins. This is powered by dask.
- skimage.util.view_as_windows: uses stride tricks to produce a sliding window view over an n-dimensional array.

On Sat, Sep 16, 2017 at 7:16 AM, Chris Barker - NOAA Federal <chris.barker@noaa.gov> wrote:
No thoughts on optimizing memory, but that indexing error probably comes from np.mean producing float results. An astype call should make that work.
Why? It's not being used as an index. It's being assigned into a float array. Rather, it's the slicing inside of `trace_block()` when it's being given arrays as inputs for `x` and `y`. numpy simply doesn't support that because in general the result wouldn't have a uniform shape.
-- Robert Kern
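A small sketch of the point about slicing. The original `trace_block()` is not shown in this thread, so the cube layout (x, y, sample), the 3x3 neighbourhood, and every name below are assumptions for illustration. Using arrays as slice bounds raises the TypeError from the subject line, whereas building integer index arrays by broadcasting gives every window the same shape and works.

```python
import numpy as np

cube = np.arange(5 * 5 * 4, dtype=float).reshape(5, 5, 4)
x = np.array([1, 2, 3])
y = np.array([1, 2, 3])

# Arrays as slice bounds: NumPy cannot turn them into scalar indices.
slice_error = None
try:
    cube[x - 1:x + 2, y - 1:y + 2]
except TypeError as exc:
    slice_error = exc

# Alternative: fixed-size windows via broadcast integer index arrays.
offsets = np.arange(-1, 2)
xi = x[:, None, None] + offsets[None, :, None]  # shape (3, 3, 1)
yi = y[:, None, None] + offsets[None, None, :]  # shape (3, 1, 3)
blocks = cube[xi, yi]                           # shape (3, 3, 3, 4)
means = blocks.mean(axis=(1, 2))                # one mean trace per (x, y) pair
```

Fancy indexing copies the selected data, so `blocks` costs memory proportional to the number of positions times the window size.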

@Robert, good point, always good to try out code before speculating on a thread. ;) Here’s working code to do the averaging. It’s not block-wise, though; you’ll have to add that on top with dask/util.apply_parallel. Note also that because of the C-order of numpy arrays, it’s much more efficient to think of axis 0 as the “vertical” axis, rather than axis 2. See http://scikit-image.org/docs/dev/user_guide/numpy_images.html#notes-on-array... for more info.

    import numpy as np
    from skimage import util

    vol = np.linspace(1, 125, 125, dtype=np.int32).reshape(5, 5, 5)
    window_shape = (1, 3, 3)
    windows = util.view_as_windows(vol, window_shape)
    print(windows.shape)  # (5, 3, 3, 1, 3, 3)
    averaged = np.mean(windows, axis=(3, 4, 5))

HTH! Juan.
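For readers without scikit-image, the same windowed average can be sketched with NumPy alone. This swaps in `numpy.lib.stride_tricks.sliding_window_view`, which is not what the message above uses and requires NumPy 1.20 or later. The variable names mirror the example above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

vol = np.linspace(1, 125, 125, dtype=np.int32).reshape(5, 5, 5)

# Read-only strided view: one (1, 3, 3) window per valid position, no copy.
windows = sliding_window_view(vol, (1, 3, 3))

# Average over the three window axes.
averaged = windows.mean(axis=(3, 4, 5))

# Spot check against an explicit slice of the original volume.
check = vol[2, 0:3, 0:3].mean()
```

As with `view_as_windows`, the view itself is cheap, but any operation that materialises it (a copy or a reshape, for instance) can use many times the memory of `vol`.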
participants (6)
- Chris Barker - NOAA Federal
- Elliot Hallmark
- Juan Nunez-Iglesias
- Michael Bostock
- Robert Kern
- Robert McLeod