28 Jun
2011
9 a.m.
Thanks very much!! You are right. It's because of the extra semicolon in the
header row. I have no problems anymore. I thank you for your time.

Cheers,
Chao

2011/6/28 Derek Homeier <derek@astro.physik.uni-goettingen.de>

> Hi Chao,
>
> by mistake did not reply to the list last time...
>
> On 27.06.2011, at 10:30PM, Chao YUE wrote:
> > Hi Derek!
> >
> > I tried with the latest version of the python(x,y) package with numpy
> > version 1.6.0. I gave the data to you with reduced columns (10 columns)
> > and rows.
> >
> > b=np.genfromtxt('99Burn2003all_new.csv',delimiter=';',names=True,usecols=tuple(range(10)),dtype=['S10'] + [ float for n in range(9)])
> > works.
> > If you change usecols=tuple(range(10)) to usecols=range(10), it still
> > works.
> >
> > b=np.genfromtxt('99Burn2003all_new.csv',delimiter=';',names=True,dtype=None)
> > works.
> >
> > but
> > b=np.genfromtxt('99Burn2003all_new.csv',delimiter=';',names=True,dtype=['S10'] + [ float for n in range(9)])
> > didn't work.
> >
> > I use Python(x,y)-2.6.6.1 with numpy version 1.6.0, on a Windows
> > 32-bit system.
> >
> > Please don't spend too much time on this if it's not a potential problem.
> >
> OK, dtype=None works on 1.6.0, that's the important bit.
> From your example file it seems the dtype list does not work without
> specifying usecols, because your header contains an excess semicolon in the
> field "Air temperature (High; HMP45C)", thus genfromtxt expects more data
> columns than actually exist. If you replace the semicolon you should be set
> (or, if I may suggest, write another header line with catchier field names
> so you don't have to work with array fields like "b['Water vapor density by
> LiCor 7500']" ;-).
> Otherwise both options work for me with python2.6+numpy-1.5.1 as well as
> 1.6.0/1.6.1rc1.
>
> I am curious though why your python interpreter gave this error message:
>
> > ValueError                                Traceback (most recent call last)
> >
> > D:\data\LaThuile_ancillary\Jim_Randerson_data\<ipython console> in <module>()
> >
> > C:\Python26\lib\site-packages\numpy\lib\npyio.pyc in genfromtxt(fname, dtype, comments, delimiter, skiprows, skip_header, skip_footer, converters, missing, missing_values, filling_values, usecols, names, excludelist, deletechars, replace_space, autostrip, case_sensitive, defaultfmt, unpack, usemask, loose, invalid_raise)
> >    1449             # Raise an exception ?
> >    1450             if invalid_raise:
> > -> 1451                 raise ValueError(errmsg)
> >    1452             # Issue a warning ?
> >    1453             else:
> >
> > ValueError
>
> since ipython2.6 on my Mac reported this:
>
> ...
>    1450             if invalid_raise:
> -> 1451                 raise ValueError(errmsg)
>    1452             # Issue a warning ?
>    1453             else:
>
> ValueError: Some errors were detected !
>     Line #3 (got 10 columns instead of 11)
>     Line #4 (got 10 columns instead of 11)
> etc....
>
> which of course provided the right lead to the problem - was the actual
> errmsg really missing, or did you cut the message too soon?
>
> > The final thing is, when I try to do this (I want to try the
> > missing_values in numpy 1.6.0), it gives an error:
> >
> > In [33]: import StringIO as StringIO
> >
> > In [34]: data = "1, 2, 3\n4, 5, 6"
> >
> > In [35]: np.genfromtxt(StringIO(data), delimiter=",",dtype="int,int,int",missing_values=2)
> > ---------------------------------------------------------------------------
> > TypeError                                 Traceback (most recent call last)
> >
> > D:\data\LaThuile_ancillary\Jim_Randerson_data\<ipython console> in <module>()
> >
> > TypeError: 'module' object is not callable
> >
> You want to use "from StringIO import StringIO" (or write
> "StringIO.StringIO(data)").
> But again, this will not work the way you expect it to with int/float
> numbers set as missing_values, and reading to regular arrays.
> I've tested this on 1.6.1 and the current development branch as well, and the
> missing_values are only considered for masked arrays. This is not likely to
> change soon, and may actually be intentional, so to process those numbers on
> read-in, your best option would be to define a custom set of
> "converters=conv" as shown in my last mail.
>
> Cheers,
> Derek
>
> > 2011/6/27 Derek Homeier <derek@astro.physik.uni-goettingen.de>
> > Hi Chao,
> >
> > this seems to have become quite a number of different issues!
> > But let's make sure I understand what's going on...
> >
> > > Thanks very much for your quick reply. I make a short summary of what
> > > I've tried. Actually the ['S10'] + [ float for n in range(48) ] only works
> > > when you explicitly specify the columns to be read, and genfromtxt cannot
> > > automatically determine the type if you don't specify the type....
> > >
> > > In [164]: b=np.genfromtxt('99Burn2003all.csv',delimiter=';',names=True,usecols=tuple(range(49)),dtype=['S10'] + [ float for n in range(48)])
> > ...
> > > But if I use the following, it gives error:
> > >
> > > In [171]: b=np.genfromtxt('99Burn2003all.csv',delimiter=';',names=True,dtype=['S10'] + [ float for n in range(48)])
> > > ---------------------------------------------------------------------------
> > > ValueError                                Traceback (most recent call last)
> > >
> > And the above (without the usecols) did work if you explicitly typed
> > dtype=('S10', float, float....)? That by itself would be quite weird,
> > because the two should be completely equivalent.
> > What happens if you cast the generated list to a tuple -
> > dtype=tuple(['S10'] + [ float for n in range(48)])?
> > If you are using a recent numpy version (1.6.0 or 1.6.1rc1), could you
> > please file a bug report with complete machine info etc.? But I suspect this
> > might be an older version, you should also be able to simply use
> > 'usecols=range(49)' (without the tuple()).
> > Either way, I cannot reproduce this behaviour with the current numpy
> > version.
> >
> > > If I don't specify the dtype, it will not recognize the type of the
> > > first column (it displays as nan):
> > >
> > > In [172]: b=np.genfromtxt('99Burn2003all.csv',delimiter=';',names=True,usecols=(0,1,2))
> > >
> > > In [173]: b
> > > Out[173]:
> > > array([(nan, -999.0, -1.028), (nan, -999.0, -0.40899999999999997),
> > >        (nan, -999.0, 0.16700000000000001), ..., (nan, -999.0, -999.0),
> > >        (nan, -999.0, -999.0), (nan, -999.0, -999.0)],
> > >       dtype=[('TIMESTAMP', '<f8'), ('CO2_flux', '<f8'), ('Net_radiation', '<f8')])
> > >
> > You _do_ have to specify 'dtype=None', since the default is
> > 'dtype=float', as I have remarked in my previous mail. If this does not
> > work, it could be a matter of the numpy version again - there were a number
> > of type conversion issues fixed between 1.5.1 and 1.6.0.
> >
> > > Then the final question: the '-999.0' in the data is actually a
> > > missing value, but I cannot display it as 'nan' by specifying
> > > missing_values as '-999.0'. Whether I set missing_values to -999.0 or
> > > use a dictionary, neither works...
> > ...
> > >
> > > Even this doesn't work (suppose 2 is our missing_value),
> > > In [184]: data = "1, 2, 3\n4, 5, 6"
> > >
> > > In [185]: np.genfromtxt(StringIO(data), delimiter=",",dtype="int,int,int",missing_values=2)
> > > Out[185]:
> > > array([(1, 2, 3), (4, 5, 6)],
> > >       dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4')])
> >
> > OK, same behaviour here - I found the only tests involving 'valid
> > numbers' as missing_values use masked arrays; for regular ndarrays they seem
> > to be ignored. I don't know if this is by design - the question is, what do
> > you need to do with the data if you know ' -999' always means a missing
> > value? You could certainly manipulate them after reading in...
> > If you have to convert them already on reading in, and using np.mafromtxt
> > is not an option, your best bet may be to define a custom converter like
> > (note you have to include any blanks, if present)
> >
> > conv = dict(((n, lambda s: s==' -999' and np.nan or float(s)) for n in range(1,49)))
> >
> > Cheers,
> > Derek
> >
> > <99Burn2003all_new.csv>

--
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 77 30; Fax:01.69.08.77.16
***********************************************************************************
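
For readers of the archive, here is a minimal sketch of the two points the thread settles on: a ';' inside a header field makes genfromtxt expect one more column than the data rows have, and a numeric sentinel such as -999 can be handled after reading. It assumes Python 3 and a NumPy version that accepts the `encoding` argument (the thread itself used Python 2.6 with NumPy 1.5/1.6). The three-column sample, the short field names and the variable names are made up for illustration; they are not the poster's file.

```python
import io

import numpy as np

# Hypothetical three-column sample (not the poster's file). The second header
# field contains a ';', so the header splits into 4 names for 3 data columns.
raw = (
    "TIMESTAMP;Air temperature (High; HMP45C);CO2_flux\n"
    "2003-01-01;1.5;-999\n"
    "2003-01-02;2.5;0.4\n"
)

lines = raw.splitlines()
header_count = len(lines[0].split(";"))  # fields the header appears to declare
row_count = len(lines[1].split(";"))     # fields a data row actually has

# Reading with names=True fails because the column counts disagree.
try:
    np.genfromtxt(io.StringIO(raw), delimiter=";", names=True,
                  dtype=None, encoding=None)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# One way out: skip the original header and supply short names by hand.
short_names = ["timestamp", "air_temp_high", "co2_flux"]  # illustrative names
table = np.genfromtxt(io.StringIO(raw), delimiter=";", skip_header=1,
                      names=short_names, dtype=None, encoding=None)

# The -999 sentinel survives a plain read, so replace or mask it afterwards.
SENTINEL = -999.0
flux = table["co2_flux"]
flux_nan = np.where(flux == SENTINEL, np.nan, flux)  # NaN in a regular array
flux_masked = np.ma.masked_equal(flux, SENTINEL)     # or a masked array
```

Comparing `header_count` with `row_count` before calling genfromtxt is a cheap way to spot a stray delimiter in the header. Whether NaN or a masked array is the better choice depends on what the later analysis expects.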