accessing a set of columns from a recarray
I have a few things I am trying to understand. I have a record array and a list of columns for which I would like to get the row means. My current solution is to iterate through my list of column names and make a new "normal" array. I would like to do something like np.mean(A['x','y','z']). It would be great to append the results as an additional column to the array, but I think I have seen how to do this.

Next, to make it a little more interesting, I have a column that has values 0-14. I would like to get the mean of a row conditional on the value of this column. Ideally I get a new array with columns 0-14, with missing values where needed.

I hope this is clear; please ask for clarification if not.

*Vincent Davis 720-301-3003 * vincent@vincentdavis.net my blog <http://vincentdavis.net> | LinkedIn <http://www.linkedin.com/in/vincentdavis>
Too many distractions; let me try to write that a little better. I have a record array and a list of columns for which I would like to get the row means. My current solution is to iterate through the list of column names and make a new "normal" array, then calculate the row means. I would like to do something like np.mean(A['x','y','z']), where x, y, z are the titles of the columns.

Vincent Davis
If you have a rec array

    Y = np.rec.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)],
                     dtype=[('var1', '<f8'), ('var2', '<f8'), ('var3', '<f8')])

you can access the columns like

    Y[['var1','var2','var3']]

Note the list within [].

If you want a "normal" array, I like this way that Pierre recently pointed out. 3 is the number of columns, and it fills in the number of rows.

    Y[['var1','var2','var3']].view((float,3))

Note the tuple for the view, which works if they're all floats. Taking a view might not work if the var# fields have different types, like ints and floats.

If you want the mean of the rows (mean over the columns, axis = 1)

    Y[['var1','var2','var3']].view((float,3)).mean(1)

Some shortcuts:

    Y[list(Y.dtype.names)].view((float,len(Y.dtype))).mean(1)

Also, for now, the columns will be given back to you in the order they're in in the array, no matter which way you ask for them. A patch has been submitted for returning them in the order you ask, which I hope gets picked up...

Skipper
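For readers on a newer NumPy, here is a rough sketch of the same row-mean idea, plus appending the result as an extra column as the original question asked. It assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available; as far as I know, on those versions a multi-field index returns a view that keeps the full record layout, so the .view((float, n)) trick may fail when only a subset of columns is selected. The names A, cols, plain, row_means and Y2 are made up for the illustration and do not come from the thread.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

Y = np.rec.array(
    [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)],
    dtype=[('var1', '<f8'), ('var2', '<f8'), ('var3', '<f8')],
)

# Work on a plain structured ndarray that shares memory with the recarray.
A = np.asarray(Y)

cols = ['var1', 'var3']  # hypothetical subset of column names

# Turn the selected fields into an ordinary 2-D array (rows x columns).
plain = rfn.structured_to_unstructured(A[cols])

# Mean across the selected columns, one value per row.
row_means = plain.mean(axis=1)

# Build a new structured array with the means appended as a field;
# usemask=False asks for a regular (non-masked) array.
Y2 = rfn.append_fields(A, 'row_mean', row_means, usemask=False)
```

Because structured_to_unstructured casts the selected fields to a common dtype, this should also cover the mixed int/float case mentioned above, at the cost of a copy.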
Thanks, that's what I needed.

Vincent Davis

_______________________________________________
SciPy-User mailing list
SciPy-User@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user
I am a little puzzled by this:

    >>> data2[list(typeallnames)][0][0]
    171.0
    >>> data2[list(typeallnames)][0][0] = 0
    >>> data2[list(typeallnames)][0][0]
    171.0

How do I change the value?

Vincent Davis
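A possible explanation for the data2 assignment above, offered as an assumption since the thread itself does not confirm it: on the NumPy versions of that time, indexing with a list of field names returned a copy, so the assignment changed a temporary and left data2 untouched. Selecting a single field by name returns a view, so assigning through it reaches the original array. The array contents and field names below are invented stand-ins for data2, not the poster's data.

```python
import numpy as np

# Hypothetical stand-in for data2 from the message above.
data2 = np.rec.array(
    [(171.0, 2.0), (3.0, 4.0)],
    dtype=[('a', '<f8'), ('b', '<f8')],
)
names = list(data2.dtype.names)

# Pick one field first (a view on the original data), then the row, and assign.
data2[names[0]][0] = 0.0

# Record arrays also expose fields as attributes, which gives a view as well.
data2.b[0] = -1.0
```

Indexing the field first and the row second should behave the same way across NumPy versions, which is why it is used here instead of the multi-field form.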
I posted the question on ask.scipy.org and just posted your response there. If you are using ask.scipy, feel free to post this reply and I will vote it the correct answer, if you care :)
http://ask.scipy.org/en/topic/16-accessing-a-list-of-columns-in-a-recarray#r...

Vincent Davis
participants (2)
- Skipper Seabold
- Vincent Davis