Well, I see two solutions:

1- keep your code as it is, with a Python list (you can stack numpy arrays if they have the same dimensions):

for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    TSFCWithOutNan=[]
    for a in TSFC:
        indexnonNaN=N.isfinite(a)
        SliceofTotoWithoutNan=a[indexnonNaN]
        print SliceofTotoWithoutNan
        TSFCWithOutNan.append(SliceofTotoWithoutNan)
    for i in xrange(0,len(TSFCWithOutNan)-1,1):
        slice_counter +=1
        #print slice_counter
        try:
            running_sum=N.add(running_sum, TSFCWithOutNan[i])
        except NameError:
            print "Initiating the running total of my variable..."
            running_sum=N.array(TSFCWithOutNan[i])
...

or 2- everything in the same loop:

slice_counter=0
for a in TSFC:
    indexnonNaN=N.isfinite(a)
    SliceofTotoWithoutNan=a[indexnonNaN]
    slice_counter +=1
    #print slice_counter
    try:
        running_sum=N.add(running_sum, SliceofTotoWithoutNan)
    except NameError:
        print "Initiating the running total of my variable..."
        running_sum=N.array(SliceofTotoWithoutNan)

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg

See if it works. It is just a quick guess.

Xavier

for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'):
    #print dir
    for ncfile in glob.glob(dir + '*.nc'):
        netCDF_list.append(ncfile)

slice_counter=0
print netCDF_list

for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    for a in TSFC:
        indexnonNaN=N.isfinite(a)
        SliceofTotoWithoutNan=a[indexnonNaN]
        print SliceofTotoWithoutNan
        TSFC=SliceofTotoWithoutNan

    for i in xrange(0,len(TSFC)-1,1):
        slice_counter +=1
        #print slice_counter
        try:
            running_sum=N.add(running_sum, TSFC[i])
        except NameError:
            print "Initiating the running total of my variable..."
            running_sum=N.array(TSFC[i])

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg
On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy <xabart@gmail.com> wrote:
Hi, I don't know if it is the best choice, but this is what I do in my code:
for each slice:
    indexnonNaN = np.isfinite(SliceOfToto)
    SliceOfTotoWithoutNan = SliceOfToto[indexnonNaN]

and then I perform all the operations I want on the last array.
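For readers following along, here is a minimal runnable sketch of that filtering step in Python 3. The array contents and the snake_case names are made up for illustration. One thing it shows is that boolean indexing returns a flattened 1-D array, so the filtered slice no longer has the shape of the input, and slices with different numbers of missing values will have different lengths.

```python
import numpy as np

# Hypothetical 2-D slice with two non-finite values (illustrative data only).
slice_of_toto = np.array([[1.0, np.nan, 3.0],
                          [4.0, 5.0, np.inf]])

# Boolean mask: True where the value is neither NaN nor +/-inf.
index_non_nan = np.isfinite(slice_of_toto)

# Boolean indexing keeps only the finite values, as a flattened 1-D array.
slice_without_nan = slice_of_toto[index_non_nan]

print(slice_without_nan.shape)   # no longer the 2-D shape of the input
print(slice_without_nan.mean())  # mean over the finite values only
```

This is convenient for a single summary number per slice, but probably not for an element-wise running sum across slices, since the filtered arrays may not line up.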
I hope this answers your question.
Xavier
2011/12/6 questions anon <questions.anon@gmail.com>
Maybe I am asking the wrong question or could go about this another way. I have thousands of numpy arrays to flick through; could I just identify which arrays have NaNs and, for now, ignore the entire array? Is there a simple way to do this? Any feedback will be greatly appreciated.
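One possible way to do this, sketched in Python 3 with made-up data: test each array with `np.isnan(...).any()` and keep only the arrays that contain no NaN. The helper name `has_nan` and the sample arrays are hypothetical.

```python
import numpy as np

def has_nan(arr):
    """Return True if any element of arr is NaN."""
    return bool(np.isnan(arr).any())

# Hypothetical stand-ins for the many arrays mentioned above.
arrays = [
    np.array([1.0, 2.0, 3.0]),
    np.array([4.0, np.nan, 6.0]),
    np.array([7.0, 8.0, 9.0]),
]

# Keep only the arrays without any NaN; the others are ignored entirely.
clean = [a for a in arrays if not has_nan(a)]
print(len(clean), "of", len(arrays), "arrays kept")
```

Dropping whole arrays is simple, but it also throws away the valid values in those arrays, so the resulting mean is based on fewer samples.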
On Thu, Dec 1, 2011 at 12:16 PM, questions anon <questions.anon@gmail.com> wrote:
I am trying to calculate the mean across many netcdf files. I cannot use numpy.mean because there are too many files to concatenate and I end up with a memory error. I have put together the code below to do what I need, but I have a few nan values in some of my arrays. Is there a way to ignore these somewhere in my code? I seem to face this problem often, so I would love a command that ignores blanks in my array before I continue on to the next processing step. Any feedback is greatly appreciated.
netCDF_list=[]
for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder + '*/02/')+ glob.glob(MainFolder + '*/12/'):
    for ncfile in glob.glob(dir + '*.nc'):
        netCDF_list.append(ncfile)

slice_counter=0
print netCDF_list

for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    for i in xrange(0,len(TSFC)-1,1):
        slice_counter +=1
        #print slice_counter
        try:
            running_sum=N.add(running_sum, TSFC[i])
        except NameError:
            print "Initiating the running total of my variable..."
            running_sum=N.array(TSFC[i])

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg
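As a hedged illustration of one way to ignore NaNs in a running mean like the one above without loading everything into memory: keep a per-cell count of valid values next to the running sum, and divide by that count instead of a single slice counter. This is a Python 3 sketch with small in-memory arrays standing in for the netCDF slices; the function name `running_nanmean` and the sample data are assumptions, not part of the original code.

```python
import numpy as np

def running_nanmean(slices):
    """Mean over an iterable of equally shaped arrays, ignoring
    non-finite values (NaN, inf) cell by cell."""
    running_sum = None
    valid_count = None
    for s in slices:
        s = np.asarray(s, dtype=float)
        valid = np.isfinite(s)
        if running_sum is None:
            # Initialise the accumulators from the first slice's shape.
            running_sum = np.zeros(s.shape, dtype=float)
            valid_count = np.zeros(s.shape, dtype=int)
        # Add only the valid values; invalid cells contribute 0.
        running_sum += np.where(valid, s, 0.0)
        valid_count += valid
    if running_sum is None:
        raise ValueError("no slices given")
    # Cells that never had a valid value come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(valid_count > 0, running_sum / valid_count, np.nan)

# Hypothetical slices standing in for the T_SFC time steps.
slices = [
    np.array([[1.0, np.nan], [3.0, 4.0]]),
    np.array([[3.0, 2.0], [np.nan, 4.0]]),
]
avg = running_nanmean(slices)
print(avg)
```

Only two accumulators of the slice shape are held in memory at any time. If the slices are masked arrays, as with `MA.masked_values` above, converting each one with `.filled(np.nan)` before passing it in should let the same function skip the masked cells, assuming the data are floating point.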
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
--
"When the government violates the rights of the people, insurrection is, for the people and for each portion of the people, the most sacred of rights and the most indispensable of duties."
Declaration of the Rights of Man and of the Citizen, article 35, 1793