"upsample" or scale an array
Thanks Warren, this is great, and even handles giant arrays just fine if you've got enough RAM.

I also just found this StackOverflow post with another solution:

    a.repeat(2, axis=0).repeat(2, axis=1)

http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array

np.kron lets you do more, but for my simple use case the repeat() method is faster and more RAM efficient with large arrays.

    In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8')

    In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
    10 loops, best of 3: 182 ms per loop

    In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8'))
    1 loops, best of 3: 513 ms per loop

Or for a 43200x4800 array:

    In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8')

    In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
    1 loops, best of 3: 6.92 s per loop

    In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8'))
    1 loops, best of 3: 27.8 s per loop

In this case repeat() peaked at about 1 GB of RAM usage while np.kron hit about 1.7 GB.

Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, and should have come to the Numpy list a lot sooner!

-Robin

On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote:
On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote:
I need to take an array - derived from raster GIS data - and upsample or scale it. That is, I need to repeat each value in each dimension so that, for example, a 2x2 array becomes a 4x4 array as follows:
    [[1, 2],
     [3, 4]]

becomes

    [[1,1,2,2],
     [1,1,2,2],
     [3,3,4,4],
     [3,3,4,4]]
It seems like some combination of np.resize or np.repeat and reshape + rollaxis would do the trick, but I'm at a loss.
Many thanks!
-Robin
Just a day or so ago, Josef Perktold showed one way of accomplishing this using numpy.kron:
    In [14]: a = arange(12).reshape(3,4)

    In [15]: a
    Out[15]:
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])

    In [16]: kron(a, ones((2,2)))
    Out[16]:
    array([[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
           [  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
           [  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
           [  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
           [  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.],
           [  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.]])
Warren
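A side note on the np.kron approach: the output in the session above is floating point because np.ones defaults to float64. The following is only a small sketch (the variable names up_float and up_int are made up for illustration) showing that passing a block with the input's dtype should keep the result an integer array:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# np.ones defaults to float64, so the Kronecker product is promoted to float.
up_float = np.kron(a, np.ones((2, 2)))

# A block of ones with the same dtype as the input keeps the integer dtype.
up_int = np.kron(a, np.ones((2, 2), dtype=a.dtype))

print(up_float.dtype, up_int.dtype)
print(up_int.shape)  # each axis is doubled: (6, 8)
```

This matters for raster data stored as uint8, where an accidental float64 result takes eight times the memory.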
You can also use numpy.tile

-=- Olivier
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
That does repeat the elements, but doesn't get them into the desired order.

    In [4]: print a
    [[1 2]
     [3 4]]

    In [7]: np.tile(a, 4)
    Out[7]:
    array([[1, 2, 1, 2, 1, 2, 1, 2],
           [3, 4, 3, 4, 3, 4, 3, 4]])

    In [8]: np.tile(a, 4).reshape(4,4)
    Out[8]:
    array([[1, 2, 1, 2],
           [1, 2, 1, 2],
           [3, 4, 3, 4],
           [3, 4, 3, 4]])

It's close, but I want to repeat the elements along the two axes, effectively stretching it by the lower right corner:

    array([[1, 1, 2, 2],
           [1, 1, 2, 2],
           [3, 3, 4, 4],
           [3, 3, 4, 4]])

It would take some more reshaping/axis rolling to get there, but it seems doable.

Anyone know what combination of manipulations would work with the result of np.tile?

-Robin

On Dec 3, 2011, at 11:05 AM, Olivier Delalleau wrote:
You can also use numpy.tile
-=- Olivier
Ah sorry, I hadn't read carefully enough what you were trying to achieve. I think the double repeat solution looks like your best option then.

-=- Olivier
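For reference, the double repeat discussed in this thread can be wrapped in a small helper. This is a minimal sketch, not code from any of the posters: the name upsample_repeat and the factor parameter are hypothetical, and it assumes the two axes to be stretched are the last two.

```python
import numpy as np

def upsample_repeat(a, factor):
    """Repeat every element `factor` times along the last two axes.

    Hypothetical helper; assumes `a` has at least two dimensions.
    """
    return a.repeat(factor, axis=-2).repeat(factor, axis=-1)

a = np.array([[1, 2], [3, 4]], dtype=np.uint8)
up = upsample_repeat(a, 2)
print(up)
```

Because repeat() only copies values, the result keeps the input dtype (uint8 here).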
On 03.12.2011, at 6:47PM, Olivier Delalleau wrote:
Ah sorry, I hadn't read carefully enough what you were trying to achieve. I think the double repeat solution looks like your best option then.
Considering that it is a lot shorter than fixing the tile() result, you are probably right (I've only now looked closer at the repeat() solution ;-). I'd still be interested in the performance - since I think none of the reshape or rollaxis operations actually move any data in memory (for numpy > 1.6), it might still be faster.

Cheers,
Derek
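One way to check which steps of the tile/reshape/rollaxis chain move data is np.shares_memory. The sketch below is an illustration under the assumption of a recent NumPy (np.shares_memory is not available in the versions discussed in this thread); the intermediate names are made up. As far as I can tell, the first reshape and the rollaxis return views, while np.tile and the final reshape have to copy, because the rolled axes are no longer laid out contiguously.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

tiled = np.tile(a, 4)                # new array, shape (2, 8)
cube = tiled.reshape(2, 2, -1)       # reshape of a contiguous array
rolled = np.rollaxis(cube, 2, 1)     # axis reordering only changes strides
out = rolled.reshape(4, 4)           # strides cannot express this layout

print(np.shares_memory(a, tiled))    # does tile reuse the input buffer?
print(np.shares_memory(tiled, cube))
print(np.shares_memory(cube, rolled))
print(np.shares_memory(rolled, out))
```

So the chain involves two copies (tile and the last reshape), which is the same number of copies as two chained repeat() calls.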
On 03.12.2011, at 6:22PM, Robin Kraft wrote:
Anyone know what combination of manipulations would work with the result of np.tile?
Rolling was the keyword:

    print(np.rollaxis(np.tile(a, 4).reshape(2,2,-1), 2, 1).reshape(4,4))
    [[1 1 2 2]
     [1 1 2 2]
     [3 3 4 4]
     [3 3 4 4]]

I leave the generalisation and timing up to you, but it seems for

    a = np.arange(M**2).reshape(M,-1)

    np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1)

should do the trick.

Cheers,
Derek
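The generalisation can be checked against the double repeat for a few sizes. This is a sketch only: the function name upsample_tile_roll is hypothetical, and M is taken from the first dimension of the input rather than passed in.

```python
import numpy as np

def upsample_tile_roll(a, n):
    """Upsample a 2-D array by factor `n` using tile + rollaxis + reshape."""
    m = a.shape[0]
    # Tile n**2 copies side by side, then split each row into n chunks.
    tiled = np.tile(a, n**2).reshape(m, n, -1)
    # Swap the last two axes so each value sits next to its own copies.
    return np.rollaxis(tiled, 2, 1).reshape(m * n, -1)

# Compare with the double repeat for several square sizes and factors.
for m in (2, 3, 5):
    for n in (2, 3):
        a = np.arange(m**2).reshape(m, -1)
        expected = a.repeat(n, axis=0).repeat(n, axis=1)
        assert np.array_equal(upsample_tile_roll(a, n), expected)

print(upsample_tile_roll(np.array([[1, 2], [3, 4]]), 2))
```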
Ha! I knew it had to be possible! Thanks Derek. So for M = 1200 and N = 2 (now on my laptop):

    In [70]: M = 1200

    In [69]: N = 2

    In [71]: a = np.random.randint(0, 255, (M**2)).reshape(M,-1)

    In [76]: timeit np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1)
    10 loops, best of 3: 99.1 ms per loop

    In [78]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
    10 loops, best of 3: 85.6 ms per loop

    In [79]: timeit np.kron(a, np.ones((2,2), 'uint8'))
    1 loops, best of 3: 521 ms per loop

It turns out np.kron and repeat are pretty straightforward for multi-dimensional data too - scaling or stretching a stacked array representing pixel data over time, for example. Nothing changes for np.kron - it handles the additional dimensionality by itself. With repeat you just tell it to operate on the last two dimensions.

So to sum up:

1) np.kron is cool for the simplicity of the code and simple scaling to N dimensions. It's also handy if you want to scale the array elements themselves too.

2) repeat() along the last N axes is a bit more intuitive (i.e. less magical) to me and has a better performance profile.

3) Derek's reshape/rolling solution is almost as fast but it gives me a headache trying to visualize what it's actually doing. I don't want to think about adding another dimension ...

Thanks for the help folks.

Here's scaling of a hypothetical time series (i.e. 3 axes), where each sub-array represents a month.

    In [26]: print a
    [[[1 2]
      [3 4]]

     [[1 2]
      [3 4]]

     [[1 2]
      [3 4]]]

    In [27]: np.kron(a, np.ones((2,2), dtype='uint8'))
    Out[27]:
    array([[[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]],

           [[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]],

           [[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]]])

    In [64]: a.repeat(2, axis=1).repeat(2, axis=2)
    Out[64]:
    array([[[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]],

           [[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]],

           [[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]]])
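The stacked (3-axis) case from this thread can be reproduced as a short script. This is a sketch with made-up variable names (month, stack, by_kron, by_repeat); it assumes the time axis comes first and only the last two axes should be stretched.

```python
import numpy as np

month = np.array([[1, 2], [3, 4]], dtype=np.uint8)
stack = np.array([month, month, month])  # shape (3, 2, 2): three "months"

# np.kron treats the 2x2 block as if it had a leading axis of length 1,
# so only the last two axes grow.
by_kron = np.kron(stack, np.ones((2, 2), dtype=stack.dtype))

# With repeat, the axes to stretch are named explicitly.
by_repeat = stack.repeat(2, axis=1).repeat(2, axis=2)

print(by_kron.shape)
print(np.array_equal(by_kron, by_repeat))
```

Both results have shape (3, 4, 4), and the time axis is left untouched.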
participants (3)
- Derek Homeier
- Olivier Delalleau
- Robin Kraft