Problem with importing CSV into datetime64
Hi,

I am trying out the latest version of numpy 2.0 dev:

    In [44]: np.__version__
    Out[44]: '2.0.0.dev-aded70c'

I am trying to import CSV data that looks like this:

    date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
    2007-01-01 00:30,481.9,,,,,481.9,15,SW,1040,Fine
    2007-01-01 01:00,471.9,,,,,471.9,15,SW,1040,Fine
    2007-01-01 01:30,455.9,,,,,455.9,,,,
    etc.

by using the following code:

    convertdict = {0: lambda s: np.datetime64(s, 'm'),
                   1: lambda s: float(s or 0),
                   2: lambda s: float(s or 0),
                   3: lambda s: float(s or 0),
                   4: lambda s: float(s or 0),
                   5: lambda s: float(s or 0),
                   6: lambda s: float(s or 0),
                   7: lambda s: float(s or 0),
                   8: str, 9: str, 10: str}
    dt = [('date', np.datetime64), ('system', float), ('pumping', float),
          ('rgt', float), ('agt', float), ('sps', float),
          ('eskom_import', float), ('temperature', float), ('wind', str),
          ('pressure', float), ('weather', str)]
    a = np.recfromcsv(fp, dtype=dt, converters=convertdict,
                      usecols=range(0, 11), names=True)

The dtype it generates for a.date is 'object':

    array([2007-01-01T00:30+0200, 2007-01-01T01:00+0200,
           2007-01-01T01:30+0200, ..., 2007-12-31T23:00+0200,
           2007-12-31T23:30+0200, 2008-01-01T00:00+0200], dtype=object)

But I need it to be datetime64, like in this example (but including hrs and minutes):

    array(['2011-07-11', '2011-07-12', '2011-07-13', '2011-07-14',
           '2011-07-15', '2011-07-16', '2011-07-17'], dtype='datetime64[D]')

It seems that the CSV import creates an embedded object datatype for 'date' rather than a datetime64 data type. Any ideas on how to fix this?

Grové
On Wed, Sep 28, 2011 at 9:15 AM, Grové
Not sure how big your file is, but you might take a look at the loadtable branch on my numpy fork: https://github.com/chrisjordansquire/numpy. It has a function, loadtable, with some docs and tests. It currently only loads dates, but you could likely modify it to handle datetimes as well without too much trouble.

(Well, it should be pretty simple once you grok how the function works. Unfortunately it's somewhat large and complicated, so it might not be what you want if you just want to load your data quickly and be done with it.)

-Chris JS
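For a file small enough to read in one pass, one way to get a real datetime64 column without touching recfromcsv or loadtable is to parse the rows with the standard csv module and build the structured array directly. The following is only a sketch under stated assumptions, not the loadtable approach from the fork: the function name load_power_csv, the 'U16' string width, and the grouping of columns into float and string fields are all made up for illustration. Empty fields become 0.0, following the converters in the original post, and timestamps are stored exactly as written, with no time zone handling.

```python
import csv
import io

import numpy as np

# Sample rows copied from the original post.
SAMPLE = """date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
2007-01-01 00:30,481.9,,,,,481.9,15,SW,1040,Fine
2007-01-01 01:00,471.9,,,,,471.9,15,SW,1040,Fine
2007-01-01 01:30,455.9,,,,,455.9,,,,
"""

# Assumption: these two columns hold text; every other non-date column is a float.
STR_COLS = ("wind", "weather")


def field_dtype(name):
    """Pick a dtype per column; minute resolution is set explicitly for dates."""
    if name == "date":
        return "datetime64[m]"
    if name in STR_COLS:
        return "U16"  # hypothetical fixed width, widen if needed
    return "f8"


def load_power_csv(fp):
    """Read the CSV from an open text file into a structured array (hypothetical helper)."""
    reader = csv.DictReader(fp)
    names = reader.fieldnames
    dtype = [(name, field_dtype(name)) for name in names]
    records = []
    for row in reader:
        rec = []
        for name in names:
            value = row[name]
            if name == "date":
                # Use the ISO 'T' separator so the string parses unambiguously.
                rec.append(np.datetime64(value.replace(" ", "T"), "m"))
            elif name in STR_COLS:
                rec.append(value or "")
            else:
                # Empty fields become 0.0, as in the original converters.
                rec.append(float(value or 0))
        records.append(tuple(rec))
    return np.array(records, dtype=dtype)


a = load_power_csv(io.StringIO(SAMPLE))
print(a["date"].dtype)
```

Because the unit is fixed in the dtype up front, the date field never passes through an object array. For a real file, pass an open text-mode file handle instead of the io.StringIO wrapper.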
Participants (2): Christopher Jordan-Squire, Grové