On Wed, Apr 13, 2011 at 9:50 AM, Jonathan Rocher <jrocher@enthought.com> wrote:
Hi,
I assume you have this data in a txt file, correct? You can load up all of it in a numpy array using

    import numpy as np
    data = np.loadtxt("climat_file.txt", skiprows=1)
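Then you can compute the mean you want by taking it on a subset of the rows of the data array. Rows are indexed by position, not by year, so select them through the year column (column 0). For example, if you want to compute the mean of your data in Jan for 1950-1970 (say including 1970)

    years = data[:, 0]
    ref = (years >= 1950) & (years <= 1970)
    mean1950_1970 = data[ref, 1].mean()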
Then the standard deviation you want could be computed using

    my_std = np.sqrt(np.mean((data[:, 1] - mean1950_1970)**2))
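Putting the steps together, here is a minimal sketch of the whole calculation. It assumes the first column holds the year and the second holds January; the in-memory sample and its values are made up for illustration and stand in for the real file.

```python
import io
import numpy as np

# Hypothetical stand-in for the data file; these values are invented.
sample = io.StringIO(
    "Year Jan Feb\n"
    "1949 1000 1001\n"
    "1950 1010 1012\n"
    "1960 1020 1008\n"
    "1970 1030 1004\n"
    "1971 1040 1000\n"
)
data = np.loadtxt(sample, skiprows=1)  # skip the header row

years = data[:, 0]
jan = data[:, 1]

# Boolean mask for the reference period, both end years included.
in_ref = (years >= 1950) & (years <= 1970)
ref_mean = jan[in_ref].mean()

# Root-mean-square deviation of the WHOLE column about the reference mean.
ref_std = np.sqrt(np.mean((jan - ref_mean) ** 2))

# Standardized January values relative to the reference period.
jan_standardized = (jan - ref_mean) / ref_std
```

Selecting rows with a mask on the year column avoids having to know which row position a given year falls on.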
Hope this helps, Jonathan
On Tue, Apr 12, 2011 at 1:48 PM, Climate Research <climateforu@gmail.com> wrote:
Hi, I am completely new to Python and numpy. I am using Python to do statistical calculations on climate data.
I have a data set in the following format:
    Year  Jan   Feb   Mar  Apr  ...  Dec
    1900  1000  1001  ...
    1901  1011  1012  ...
    1902  1009  1007  ...
    ...
    2010  1008  1002  ...
I want to standardize each of these values using the standard deviation of its monthly data column. I have already found the standard deviation for each column, but now I need to find the standard deviation about a prescribed mean value. That is, when I am finding the standard deviation for the January data column, the mean should be calculated only from the January data for, say, 1950-1970. With this mean I want to calculate the SD for the entire column. Any help will be appreciated.
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
-- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher@enthought.com 1-512-536-1057 http://www.enthought.com
To standardize the data over each column you'll want to do:

    (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

Note the broadcasting behavior of the (matrix - vector) operation; see the NumPy documentation for more details. The ddof=1 is there to give you the sample standard deviation (the square root of the unbiased variance estimate).

<shameless plug>
If you're looking for data structures to carry around your metadata (dates and month labels), look to pandas (my project: http://pandas.sourceforge.net/) or larry (http://larry.sourceforge.net/).
</shameless plug>

- Wes
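For what it's worth, here is a small sketch of that column-wise standardization, using the three rows quoted in the question with only the Jan and Feb values that were given. It assumes column 0 is the year and drops it first, since otherwise the year column would be standardized along with the months.

```python
import numpy as np

# Column 0 is the year; columns 1 and 2 are the Jan and Feb readings.
data = np.array([
    [1900.0, 1000.0, 1001.0],
    [1901.0, 1011.0, 1012.0],
    [1902.0, 1009.0, 1007.0],
])

months = data[:, 1:]                  # drop the year column
col_mean = months.mean(axis=0)        # one mean per month, shape (2,)
col_std = months.std(axis=0, ddof=1)  # sample std per month, shape (2,)

# A (3, 2) array minus a (2,) vector: the vector is broadcast across every row.
standardized = (months - col_mean) / col_std
```

After this, each column of `standardized` has mean 0 and sample standard deviation 1.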