3D array -> 2D array issue...
Hi all,

Let's say C1 is a 3D array, and q0 and k are 2D arrays.

dim C1 = nx*ny*nz
dim q0 = nx*ny = dim k

I have to do the following:

q0[0, 0] = C1[0, 0, k[0, 0]]
q0[1, 1] = C1[1, 1, k[1, 1]]
...
q0[i, j] = C1[i, j, k[i, j]]
...

I tried

q0 = C1[:, :, k]

but this obviously does not work.

How could I do this a la NumPy?

TIA

Cheers,

-- Fred
On Mon, Feb 7, 2011 at 7:21 AM, Fred <fredmfp@gmail.com> wrote:
Hi all,
Let's say C1 is a 3D array, and q0 and k are 2D array.
dim C1 = nx*ny*nz
dim q0 = nx*ny = dim k
I have to do the following:
q0[0, 0] = C1[0, 0, k[0, 0]]
q0[1, 1] = C1[1, 1, k[1, 1]]
...
q0[i, j] = C1[i, j, k[i, j]]
...
I tried
q0 = C1[:, :, k]
but this obviously does not work.
You need to build or broadcast the indices, something like this (written, not tried out):

n0, n1 = k.shape
ind0 = np.arange(n0)[:,None]
ind1 = np.arange(n1)
q0 = C1[ind0,ind1, k[ind0,ind1]]

or better

q0 = C1[ind0,ind1, k]

Josef
How could I do this ala NumPy?
TIA
Cheers,
-- Fred
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
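For reference, Josef's broadcast-index suggestion for q0[i, j] = C1[i, j, k[i, j]] can be checked against the explicit loop on a small array. This is only a minimal sketch: the array sizes, the random k, and the names expected and q0_alt are made up for illustration. np.take_along_axis is shown as an alternative, but it only exists in NumPy 1.15 and later, so it was not an option when this thread was written.

```python
import numpy as np

nx, ny, nz = 3, 4, 5                       # illustrative sizes (assumption)
C1 = np.arange(nx * ny * nz).reshape(nx, ny, nz)
k = np.random.default_rng(0).integers(0, nz, size=(nx, ny))

# Index arrays that broadcast against k, which has shape (nx, ny)
ind0 = np.arange(nx)[:, None]              # shape (nx, 1)
ind1 = np.arange(ny)                       # shape (ny,)
q0 = C1[ind0, ind1, k]                     # q0[i, j] == C1[i, j, k[i, j]]

# Explicit loop, used only as a reference result
expected = np.empty((nx, ny), dtype=C1.dtype)
for i in range(nx):
    for j in range(ny):
        expected[i, j] = C1[i, j, k[i, j]]

# Alternative for NumPy >= 1.15: gather along the last axis
q0_alt = np.take_along_axis(C1, k[:, :, None], axis=2)[:, :, 0]
```

C1[:, :, k] does not give this result because the slices keep the first two axes whole and the two axes of k are appended, so the result has shape (nx, ny, nx, ny).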
Hi there,

I have a big array (44 GB) I want to decimate.

But this array has a lot of NaN (only 1/3 has values, in fact, so 2/3 is NaN).

If I "basically" decimate it (a la NumPy, i.e. data[::nx, ::ny, ::nz], for instance), the decimated array will also have a lot of NaN.

What I would like to have in one cell of the decimated array is the nearest (for instance) value in the big array. This is what I call a "condensated array".

How could I do that, a la NumPy?

TIA.

Cheers,

-- Fred
On Fri, Feb 25, 2011 at 10:36:42AM +0100, Fred wrote:
I have a big array (44 GB) I want to decimate.
But this array has a lot of NaN (only 1/3 has value, in fact, so 2/3 of NaN).
If I "basically" decimate it (a la NumPy, ie data[::nx, ::ny, ::nz], for instance), the decimated array will also have a lot of NaN.
What I would like to have in one cell of the decimated array is the nearest (for instance) value in the big array. This is what I call a "condensated array".
What exactly do you mean by 'decimating'? To me it seems that you are looking for matrix factorization or matrix completion techniques, which are currently trendy topics in machine learning. They are, however, a bit challenging, and I fear that you will have to read the papers and do some implementation yourself, unless you have a clear application in mind that allows simple tricks to solve it.

G
Le 25/02/2011 10:42, Gael Varoquaux a écrit :
What exactly do you mean by 'decimating'? To me it seems that you are looking for matrix factorization or matrix completion techniques, which are trendy topics in machine learning currently.

By decimating, I mean this:

input array data.shape = (nx, ny, nz) -> data[::ax, ::ay, ::az], i.e. output array data[::ax, ::ay, ::az].shape = (nx/ax, ny/ay, nz/az).

-- Fred
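The strided decimation Fred describes, and the reason it keeps NaN, can be seen on a toy array. This is a minimal sketch with made-up sizes and a made-up NaN pattern (every even plane along the first axis is missing); none of it comes from Fred's data. Note that for sizes that are not multiples of the step, each output dimension is the ceiling of n/a rather than exactly n/a.

```python
import numpy as np

nx, ny, nz = 7, 6, 5                       # illustrative sizes (assumption)
ax, ay, az = 2, 3, 2                       # decimation steps (assumption)

data = np.arange(nx * ny * nz, dtype=float).reshape(nx, ny, nz)
data[::2] = np.nan                         # hypothetical missing planes

dec = data[::ax, ::ay, ::az]               # plain strided decimation (a view)

# Each output dimension is ceil(n / a)
expected_shape = tuple(-(-n // a) for n, a in zip((nx, ny, nz), (ax, ay, az)))

# Here every sampled cell falls on a missing plane, so the decimated
# array is entirely NaN even though the odd planes hold valid data.
all_nan = bool(np.isnan(dec).all())
```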
On Fri, Feb 25, 2011 at 10:52:09AM +0100, Fred wrote:
What exactly do you mean by 'decimating'? To me it seems that you are looking for matrix factorization or matrix completion techniques, which are trendy topics in machine learning currently.

By decimating, I mean this:
input array data.shape = (nx, ny, nz) -> data[::ax, ::ay, ::az], ie output array data[::ax, ::ay, ::az].shape = (nx/ax, ny/ay, nz/az).
OK, this can be seen as an interpolation on a grid with a nearest-neighbor interpolator. What I am unsure about is whether you want to interpolate your NaN, or whether they just mean missing data.

I would do this by representing the matrix as a sparse matrix in COO format; this would give you a list of row and col positions for your data points. Then I would use a nearest-neighbor search structure (such as scipy's KDTree, or scikit-learn's BallTree for even better performance, http://scikit-learn.sourceforge.net/modules/neighbors.html) to find, for each grid point, which data point is closest, and fill in your grid.

I suspect that your problem is that you can't fit the whole matrix in memory. If your data points are reasonably homogeneously distributed in the matrix, I would simply process the problem using sub-matrices, making sure that I train the nearest-neighbor search on a sub-matrix that is larger than the sampling grid by a margin of more than the inter-point distance.

HTH,

Gael
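A rough sketch of the nearest-neighbor idea, using scipy.spatial.KDTree on the coordinates of the non-NaN cells. The function name condense_nearest and its steps argument are hypothetical, and the sketch assumes the whole array fits in memory and contains at least one valid value. It does not implement the sub-matrix tiling with a safety margin that a 44 GB array would need.

```python
import numpy as np
from scipy.spatial import KDTree

def condense_nearest(data, steps):
    """Sample `data` on a strided grid, taking the nearest non-NaN value.

    Hypothetical helper: `steps` holds one decimation step per axis.
    """
    mask = ~np.isnan(data)
    coords = np.argwhere(mask)             # (n_valid, ndim), C order
    values = data[mask]                    # same C order as `coords`
    tree = KDTree(coords)

    # Coordinates of the decimated grid points, one row per grid cell
    axes = [np.arange(0, n, s) for n, s in zip(data.shape, steps)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)

    # Index of the closest valid data point for every grid point
    _, nearest = tree.query(grid.reshape(-1, data.ndim))
    return values[nearest].reshape(grid.shape[:-1])

# Tiny 2D example with two valid cells
demo = np.full((4, 4), np.nan)
demo[1, 1] = 5.0
demo[3, 2] = 7.0
condensed = condense_nearest(demo, (2, 2))
```

Distances here are measured in index units, which treats all axes as equally spaced; a grid with different physical spacing per axis would need scaled coordinates.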
2011/2/25 Gael Varoquaux <gael.varoquaux@normalesup.org>:
On Fri, Feb 25, 2011 at 10:36:42AM +0100, Fred wrote:
I have a big array (44 GB) I want to decimate.
But this array has a lot of NaN (only 1/3 has value, in fact, so 2/3 of NaN).
If I "basically" decimate it (a la NumPy, ie data[::nx, ::ny, ::nz], for instance), the decimated array will also have a lot of NaN.
What I would like to have in one cell of the decimated array is the nearest (for instance) value in the big array. This is what I call a "condensated array".
What exactly do you mean by 'decimating'? To me it seems that you are looking for matrix factorization or matrix completion techniques, which are trendy topics in machine learning currently.
They are, however, a bit challenging, and I fear that you will have to read the papers and do some implementation yourself, unless you have a clear application in mind that allows simple tricks to solve it.
Indeed, in the following paper by G. Martinsson there is also a section on matrix summarization:

http://arxiv.org/abs/0909.4061
http://www.stanford.edu/group/mmds/slides2010/Martinsson.pdf

The scikit-learn randomized SVD implementation comes from this paper. It's pretty useful in practice.

-- Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Gaël, Olivier,

I finally got it working. I don't compute the nearest value but the mean. Works like a charm ;-)

Thanks anyway.

Cheers,

-- Fred
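Fred did not post his code, so the following is only one possible way to compute a NaN-aware mean per decimation cell. The function name block_nanmean is hypothetical. The sketch trims trailing elements so that each axis is a multiple of its step, and it works on an in-memory array; a 44 GB input would have to be processed in chunks, for instance from a memmap.

```python
import numpy as np

def block_nanmean(data, ax, ay, az):
    """Mean of the non-NaN values in each (ax, ay, az) block.

    Hypothetical helper; blocks containing only NaN yield NaN.
    """
    nx, ny, nz = data.shape
    # Trim so that every axis splits evenly into blocks
    tx, ty, tz = nx - nx % ax, ny - ny % ay, nz - nz % az
    blocks = data[:tx, :ty, :tz].reshape(
        tx // ax, ax, ty // ay, ay, tz // az, az
    )

    valid = ~np.isnan(blocks)
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3, 5))
    count = valid.sum(axis=(1, 3, 5))

    # Divide only where a block has at least one valid value
    out = np.full(total.shape, np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out

# Tiny example: one block with two valid values, the rest empty
demo = np.full((4, 4, 4), np.nan)
demo[0, 0, 0] = 2.0
demo[1, 1, 1] = 4.0
means = block_nanmean(demo, 2, 2, 2)
```

Summing and counting by hand avoids the warning that np.nanmean emits for all-NaN slices.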
Hi all,

I get some issue using gradient on an array created from memmap:

PC-Fred[pts/10]:~/{11}/> ipython -p numpy
Python 2.6.7 (r267:88850, Jul 10 2011, 08:11:54)
Type "copyright", "credits" or "license" for more information.

IPython 0.10.2 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

IPython profile: numpy

PC-Fred[12:39:18]:~/{1}/> a=array([[1.,2.], [3., 4.]], dtype='f')
PC-Fred[12:40:24]:~/{2}/> a.tofile('a.sep')
PC-Fred[12:40:32]:~/{3}/> del a
PC-Fred[12:40:45]:~/{4}/> a = memmap('a.sep', mode='r', dtype='f', shape=(2,2))
PC-Fred[12:40:49]:~/{5}/> x, y = gradient(a, 1, 1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)

/mnt/common/home/fred/<ipython console> in <module>()

/usr/lib/pymodules/python2.6/numpy/lib/function_base.py in gradient(f, *varargs)
    842         for axis in range(N):
    843             # select out appropriate parts for this dimension
--> 844             out = np.zeros_like(f).astype(otype)
    845             slice1[axis] = slice(1, -1)
    846             slice2[axis] = slice(2, None)

/usr/lib/pymodules/python2.6/numpy/core/memmap.py in __array_finalize__(self, obj)
    255         if hasattr(obj, '_mmap'):
    256             self._mmap = obj._mmap
--> 257             self.filename = obj.filename
    258             self.offset = obj.offset
    259             self.mode = obj.mode

AttributeError: 'memmap' object has no attribute 'filename'

Sounds like a bug or not?

Any clue?

TIA.

Cheers,

PS : NumPy 1.5.1 on wheezy debian box

-- Fred
On Mon, Sep 5, 2011 at 12:43 PM, Fred <fredmfp@gmail.com> wrote:
Hi all,
I get some issue using gradient on an array created from memmap:
PC-Fred[pts/10]:~/{11}/> ipython -p numpy
Python 2.6.7 (r267:88850, Jul 10 2011, 08:11:54)
Type "copyright", "credits" or "license" for more information.

IPython 0.10.2 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

IPython profile: numpy

PC-Fred[12:39:18]:~/{1}/> a=array([[1.,2.], [3., 4.]], dtype='f')
PC-Fred[12:40:24]:~/{2}/> a.tofile('a.sep')
PC-Fred[12:40:32]:~/{3}/> del a
PC-Fred[12:40:45]:~/{4}/> a = memmap('a.sep', mode='r', dtype='f', shape=(2,2))
PC-Fred[12:40:49]:~/{5}/> x, y = gradient(a, 1, 1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)

/mnt/common/home/fred/<ipython console> in <module>()

/usr/lib/pymodules/python2.6/numpy/lib/function_base.py in gradient(f, *varargs)
    842         for axis in range(N):
    843             # select out appropriate parts for this dimension
--> 844             out = np.zeros_like(f).astype(otype)
    845             slice1[axis] = slice(1, -1)
    846             slice2[axis] = slice(2, None)

/usr/lib/pymodules/python2.6/numpy/core/memmap.py in __array_finalize__(self, obj)
    255         if hasattr(obj, '_mmap'):
    256             self._mmap = obj._mmap
--> 257             self.filename = obj.filename
    258             self.offset = obj.offset
    259             self.mode = obj.mode
AttributeError: 'memmap' object has no attribute 'filename'
Sounds like a bug or not?
Any clue?
That rings a bell. I can reproduce this with 1.5.1, and it's fixed in master.
Ralf
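For anyone stuck on an affected release, one possible workaround is to hand gradient a plain ndarray view of the memmap, so that the memmap subclass machinery in the traceback is not involved when zeros_like runs. This is a sketch under assumptions: it has not been tried on NumPy 1.5.1, and the temporary directory is only there to make the example self-contained. The file name and array values follow Fred's session.

```python
import os
import tempfile

import numpy as np

# Recreate the file from the session in a scratch directory (assumption)
path = os.path.join(tempfile.mkdtemp(), "a.sep")
np.array([[1., 2.], [3., 4.]], dtype="f").tofile(path)

a = np.memmap(path, mode="r", dtype="f", shape=(2, 2))

# np.asarray drops the memmap subclass and returns a base-class
# ndarray that still points at the mapped data (no copy is made).
plain = np.asarray(a)

x, y = np.gradient(plain, 1, 1)            # unit spacing on both axes
```

On a NumPy build that contains the fix Ralf mentions, calling gradient(a, 1, 1) on the memmap directly should work without this step.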
participants (5)
- Fred
- Gael Varoquaux
- josef.pktd@gmail.com
- Olivier Grisel
- Ralf Gommers