iterate over multiple arrays

Hi everybody,
I'm wondering what is the (best) way to apply the same function to multiple arrays.
For example, in the following code:
from numpy import *
def f(arr):
    return arr*2
a = array( [1,1,1] )
b = array( [2,2,2] )
c = array( [3,3,3] )
d = array( [4,4,4] )
a = f(a)
b = f(b)
c = f(c)
d = f(d)
I would like to replace:
a = f(a)
b = f(b)
c = f(c)
d = f(d)
with something like this, but which really modifies a, b, c and d:
for x in [a,b,c,d]:
    x = f(x)
So I need something like a pointer to the arrays.
Thanks for your help !
David

If you can make f work in-place, then you can just call map(f, [a, b, c, d]):
def f(arr):
    arr *= 2
Otherwise, you can:
- Work with a list instead (a_b_c_d = map(f, a_b_c_d), with a_b_c_d = [a, b, c, d], but this won't update the local definitions of a, b, c, d).
- Use locals():
for x in ('a', 'b', 'c', 'd'):
    locals()[x] = f(eval(x))
-=- Olivier
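A minimal sketch of the in-place suggestion above, assuming f can be rewritten to mutate its argument. The name double_inplace is hypothetical and not from the thread. A plain for loop is used rather than map() because on Python 3 map() is lazy and would not call the function until its result is consumed.

```python
import numpy as np

def double_inplace(arr):
    """Hypothetical in-place variant of f: doubles arr within its own buffer."""
    arr *= 2  # augmented assignment on an ndarray mutates the existing array

a = np.array([1, 1, 1])
b = np.array([2, 2, 2])
c = np.array([3, 3, 3])
d = np.array([4, 4, 4])

# The loop variable x refers to the same array objects as a, b, c, d,
# so mutating x is visible through the original names.
for x in (a, b, c, d):
    double_inplace(x)

print(a, b, c, d)
```

Because nothing is rebound here, the names a, b, c and d keep pointing at the same objects; only their contents change.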
NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

On Mon, Sep 12, 2011 at 01:52, David Froger david.froger@gmail.com wrote:
I would like to replace:
a = f(a)
b = f(b)
c = f(c)
d = f(d)
This is usually the best thing to do for a few variables and simple function calls. If you have many more variables, you should be keeping them in a list or dict instead of individual named variables. If you have a complicated expression, wrap it in a function.
You could also do something like this:
a,b,c,d = map(f, [a,b,c,d])
but it's harder to understand what is going on than just using four separate lines, and no easier to maintain.
Don't use eval() or locals().
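To illustrate why the original loop has no effect and why the one-line unpacking does, here is a small sketch that reuses f from the question with two arrays only. The variable a_after_loop is a hypothetical helper added for the demonstration.

```python
import numpy as np

def f(arr):
    return arr * 2

a = np.array([1, 1, 1])
b = np.array([2, 2, 2])

# Assigning to the loop variable only rebinds the name x;
# the names a and b still refer to the original arrays.
for x in [a, b]:
    x = f(x)
a_after_loop = a.copy()  # snapshot: still [1, 1, 1]

# Tuple unpacking rebinds a and b themselves to the new arrays.
a, b = (f(x) for x in (a, b))

print(a_after_loop, a, b)
```

The unpacking form works on both Python 2 and Python 3, since unpacking consumes the iterable either way.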

Thank you Olivier and Robert for your replies!
Some remarks about the dictionary solution:
from numpy import *
def f(arr):
    return arr + 100.
arrs = {}
arrs['a'] = array( [1,1,1] )
arrs['b'] = array( [2,2,2] )
arrs['c'] = array( [3,3,3] )
arrs['d'] = array( [4,4,4] )
for key, value in arrs.items():
    arrs[key] = f(value)
1. About the memory
Memory is first allocated with the array functions:
arrs['a'] = array( [1,1,1] )
arrs['b'] = array( [2,2,2] )
arrs['c'] = array( [3,3,3] )
arrs['d'] = array( [4,4,4] )
Are there other memory allocations with this assignment:
arrs[key] = f(value)
or is the already allocated memory used to store the result of f(value)?
In other words, if I have N arrays of the same shape, each of them costing nbytes of memory, does it use N*nbytes of memory, or 2*N*nbytes?
I think this is well documented on the web and I can find it...
2. About individual arrays
The problem is that now, if one wants to use an individual array, one has to write:
arrs['a']
instead of just:
a
So I'm sometimes tempted to use locals() instead of arrs...

On Tue, Sep 13, 2011 at 01:53, David Froger david.froger@gmail.com wrote:
- about the memory
Memory is first allocated with the array functions.
Are there other memory allocations with this assignment: arrs[key] = f(value), or is the already allocated memory used to store the result of f(value)?
In other words, if I have N arrays of the same shape, each of them costing nbytes of memory, does it use N*nbytes of memory, or 2*N*nbytes?
Temporarily, yes, extra memory is allocated, for all of the variations mentioned. When the expression "f(value)" is evaluated, both the result array and the input array will exist simultaneously in memory. Once the assignment happens, the original input array will be destroyed (provided nothing else still references it), freeing up its memory. There is no difference memory-wise between assigning into a dictionary and assigning to a variable name.
Sometimes, you can write your f() such that you just need to do
f(value)
and have the value object modified in-place. In that case, there is no need to reassign the result to a variable or dictionary key.
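The difference between reassigning and modifying in place can be observed directly. This is only a sketch: f_inplace, old_a, old_b and the two flags are hypothetical names introduced here, and np.shares_memory is used to check whether two arrays overlap in memory.

```python
import numpy as np

def f(arr):
    return arr + 100.0  # builds and returns a new array

def f_inplace(arr):
    arr += 100.0        # hypothetical in-place variant: reuses arr's buffer

arrs = {"a": np.ones(3), "b": np.full(3, 2.0)}

# Reassignment: each key ends up bound to a freshly allocated array.
old_a = arrs["a"]
for key, value in arrs.items():
    arrs[key] = f(value)
reassigned_shares = np.shares_memory(old_a, arrs["a"])

# In-place: the dictionary keeps holding the very same array objects.
old_b = arrs["b"]
for value in arrs.values():
    f_inplace(value)
inplace_same = arrs["b"] is old_b

print(reassigned_shares, inplace_same)
```

In the reassignment loop only one input and one result coexist at a time, so the temporary overhead is about one extra array, not a second copy of all N arrays.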
- about individual array
The problem is that now, if one wants to use an individual array, one has to write arrs['a'] instead of just a. So I'm sometimes tempted to use locals() instead of arrs...
Seriously, don't. It makes your code worse, not better. It's also unreliable. The locals() dictionary is meant to be read-only (and even then for debugger tooling and the like, not regular code), and this is sometimes enforced. If you want to use variable names instead of dictionaries, use them, but write out each assignment statement.
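A short sketch of the unreliability mentioned above, assuming CPython. The function name try_locals is hypothetical. Inside a function, writing into the dictionary returned by locals() does not change the actual local variable.

```python
def try_locals():
    a = 1
    # Writes into a dictionary describing the locals,
    # not into the local variable itself.
    locals()["a"] = 2
    return a

result = try_locals()
print(result)  # the local variable a was never changed
```

At module level, locals() is the same dictionary as globals(), so the same write does take effect there. Code that relies on it therefore behaves differently depending on where it runs.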

I agree with Robert, don't use locals(). I should have added a disclaimer "this is very hackish and probably not a good idea", sorry ;) (interesting read: http://stackoverflow.com/questions/1450275/modifying-locals-in-python)
From what you said I think what you really want is f to work in-place. This also has the advantage of minimizing memory allocations. If you can't directly modify f, you can also do:
def f_inplace(x):
    x[:] = f(x)
then just call map(f_inplace, [a, b, c, d]). However note that there will be some memory temporarily allocated to store the result of f(x) (so it is not as optimal as ensuring f directly works in-place).
In addition to the dictionary workaround mentioned by Robert, if it is not practical to keep all your variables of interest in a single dictionary, you can instead declare your variables as one-element lists, or use a class with a single field:
1. a = [numpy.array(...)]
   a[0] = f(a[0])
2. class ArrayHolder(object):
       def __init__(self, arr):
           self.arr = arr
   a = ArrayHolder(numpy.array(...))
   a.arr = f(a.arr)
But of course it is not as convenient to write a[0] or a.arr instead of just a.
-=- Olivier
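The wrapper and the holder class described above could be exercised roughly as follows. This is a sketch with made-up float data; the list name holders is hypothetical. Float arrays are used deliberately: with integer arrays, x[:] = f(x) would silently cast the float result of this f back to integers.

```python
import numpy as np

def f(arr):
    return arr + 100.0

def f_inplace(x):
    # f(x) is still computed into a temporary array,
    # which is then copied into x's existing buffer.
    x[:] = f(x)

a = np.array([1.0, 1.0, 1.0])
b = np.array([2.0, 2.0, 2.0])
for x in (a, b):
    f_inplace(x)

class ArrayHolder:
    """Single-field container so the array can be rebound through it."""
    def __init__(self, arr):
        self.arr = arr

holders = [ArrayHolder(np.array([3.0, 3.0, 3.0])),
           ArrayHolder(np.array([4.0, 4.0, 4.0]))]
for h in holders:
    h.arr = f(h.arr)  # rebinding the attribute is visible to every user of h

print(a, b, holders[0].arr, holders[1].arr)
```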

Thanks everybody for the different solutions proposed, I really appreciate it. What about this solution? So simple that I didn't think of it...
import numpy as np
from numpy import *
def f(arr):
    return arr*2
a = array( [1,1,1] )
b = array( [2,2,2] )
c = array( [3,3,3] )
d = array( [4,4,4] )
for x in (a,b,c,d):
    x[:] = x[:]*2 #instead of: x = x*2
print(a)
print(b)
print(c)
print(d)

It'll work, it is equivalent to the suggestion I made in my previous post with the f_inplace wrapper function (and it has the same drawback that numpy will allocate temporary memory, which wouldn't be the case if f was working in-place directly, by implementing it as "arr *= 2").
Note that you don't need to write x[:] * 2, you can write x * 2 directly.
-=- Olivier

On Sat, Oct 1, 2011 at 11:34 AM, Olivier Delalleau shish@keba.be wrote:
Note that you don't need to write x[:] * 2, you can write x * 2 directly.
Or even x *= 2
<snip>
Chuck
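For comparison, here are the three spellings discussed in this part of the thread applied one after another. This is a sketch only; the explicit np.multiply call with out= is an addition not mentioned by the posters, shown as another way to write into an existing array.

```python
import numpy as np

a = np.array([1, 1, 1])
b = np.array([2, 2, 2])

# 1. Slice assignment: x * 2 creates a temporary, then it is copied into x.
for x in (a, b):
    x[:] = x * 2

# 2. Augmented assignment: the ufunc writes straight into x.
for x in (a, b):
    x *= 2

# 3. Explicit ufunc call with out=, equivalent to the line above.
for x in (a, b):
    np.multiply(x, 2, out=x)

print(a, b)
```

All three leave a and b bound to the same array objects. The first one pays for a temporary result array on each pass.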

On 09/13/2011 01:53 AM, David Froger wrote:
The problem is that now, if one wants to use an individual array, one has to write arrs['a'] instead of just a. So I'm sometimes tempted to use locals() instead of arrs...
Perhaps not quite what you want but you do have the option of a structured or record array instead of a dict if the individual arrays are the same dimensions. See for example: http://www.scipy.org/RecordArrays
>>> import numpy as np
>>> img = np.array([(1,2,3,4), (1,2,3,4), (1,2,3,4), (1,2,3,4)],
...                [('a',int),('b',int),('c',int), ('d',int)])
>>> img.dtype.names
('a', 'b', 'c', 'd')
>>> img['a']
array([1, 1, 1, 1])
>>> rimg = img.view(np.recarray) #take a view
>>> rimg.a
array([1, 1, 1, 1])
>>> rimg.b
array([2, 2, 2, 2])
Other links:
http://docs.scipy.org/doc/numpy/user/basics.rec.html
http://www.scipy.org/Cookbook/Recarray
Bruce
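Tying the structured-array idea back to the original question, the field names can drive the loop. This is a sketch under the assumption that all arrays share one length and that f is the doubling function from the first post.

```python
import numpy as np

def f(arr):
    return arr * 2

img = np.array(
    [(1, 2, 3, 4), (1, 2, 3, 4), (1, 2, 3, 4), (1, 2, 3, 4)],
    dtype=[("a", int), ("b", int), ("c", int), ("d", int)],
)

# Assigning to img[name] writes into the structured array itself,
# so every field is updated without four separate statements.
for name in img.dtype.names:
    img[name] = f(img[name])

# The recarray view shares the same data and offers attribute access.
rimg = img.view(np.recarray)
print(rimg.a, rimg.d)
```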

On 12.09.2011 08:52, David Froger wrote:
Hi everybody,
I'm wondering what is the (best) way to apply the same function to multiple arrays.
I tried to experiment a bit with this. Here is an excerpt from an IPython session:
Create some arrays:
In [1]: import numpy as np
In [2]: a = np.zeros(4)
In [3]: b = a+1
In [4]: c = a+2
In [5]: d = a+3
Create an array with dtype=object to store the four arrays a-d:
In [6]: e = np.zeros(4, dtype=object)
In [7]: e[:] = a,b,c,d
In [8]: e
Out[8]:
array([[ 0. 0. 0. 0.], [ 1. 1. 1. 1.], [ 2. 2. 2. 2.],
       [ 3. 3. 3. 3.]], dtype=object)
Modify array e inplace:
In [9]: e += 1
In [10]: e
Out[10]:
array([[ 1. 1. 1. 1.], [ 2. 2. 2. 2.], [ 3. 3. 3. 3.],
       [ 4. 4. 4. 4.]], dtype=object)
This did not modify arrays a-d though:
In [11]: a
Out[11]: array([ 0., 0., 0., 0.])
This is so even though e[0] is the very same object as a right after the assignment, i.e. prior to the in-place add:
In [12]: e[:] = a,b,c,d
In [13]: e[0] is a
Out[13]: True
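We can apply a ufunc to the arrays in e, getting array([f(a), f(b), f(c), f(d)]), as with the addition above. A reduction such as np.sum(e) instead combines the four arrays, here giving a + b + c + d: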
In [14]: np.sum(e)
Out[14]: array([ 6., 6., 6., 6.])
Observe that if e was a 2D array, np.sum(e) would have returned a scalar, like so:
In [18]: g = np.array((a,b,c,d))
In [19]: g
Out[19]:
array([[ 0., 0., 0., 0.],
       [ 1., 1., 1., 1.],
       [ 2., 2., 2., 2.],
       [ 3., 3., 3., 3.]])
In [20]: np.sum(g)
Out[20]: 24.0
Which means: we can create an array of arrays, and have numpy broadcast a ufunc to multiple arrays.
The other obvious way is, as mentioned by others, to keep the arrays in a normal Python container (e.g. list) and use a for loop or functional programming (map, apply, reduce).
Sturla
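The object-array experiment above can be reproduced as a script. This is a sketch: the container is filled element by element, because on recent NumPy versions the bulk assignment e[:] = a,b,c,d is interpreted as a 4x4 block and fails to broadcast into the length-4 object array. The names same_before and same_after are hypothetical.

```python
import numpy as np

a = np.zeros(4)
b = a + 1
c = a + 2
d = a + 3

# 1-D object array whose elements are the four arrays themselves.
e = np.empty(4, dtype=object)
for i, arr in enumerate((a, b, c, d)):
    e[i] = arr

same_before = e[0] is a   # e[0] is the very same object as a

# For object arrays, += evaluates element + 1 for each element and
# stores the resulting new array back into e; a itself is untouched.
e += 1
same_after = e[0] is a

print(same_before, same_after, a, e[0])
```

So an object array is a convenient way to apply a ufunc across several arrays, but it does not give the pointer-like behaviour asked for in the first post. For that, the contained arrays have to be mutated in place.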
participants (6)
- Bruce Southey
- Charles R Harris
- David Froger
- Olivier Delalleau
- Robert Kern
- Sturla Molden