[Numpy-discussion] Problem with importing csv into datetime64

Christopher Jordan-Squire cjordan1 at uw.edu
Thu Sep 29 02:51:33 EDT 2011


On Wed, Sep 28, 2011 at 9:15 AM, Grové <grove.steyn at gmail.com> wrote:
> Hi,
>
> I am trying out the latest version of numpy 2.0 dev:
>
> np.__version__
> Out[44]: '2.0.0.dev-aded70c'
>
> I am trying to import CSV data that looks like this:
>
> date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
> 2007-01-01 00:30,481.9,,,,,481.9,15,SW,1040,Fine
> 2007-01-01 01:00,471.9,,,,,471.9,15,SW,1040,Fine
> 2007-01-01 01:30,455.9,,,,,455.9,,,,
> etc.
>
> by using the following code:
>
> convertdict = {0: lambda s: np.datetime64(s, 'm'),
>                1: lambda s: float(s or 0), 2: lambda s: float(s or 0),
>                3: lambda s: float(s or 0), 4: lambda s: float(s or 0),
>                5: lambda s: float(s or 0), 6: lambda s: float(s or 0),
>                7: lambda s: float(s or 0), 8: str, 9: str, 10: str}
> dt = [('date', np.datetime64), ('system', float), ('pumping', float),
>       ('rgt', float), ('agt', float), ('sps', float),
>       ('eskom_import', float), ('temperature', float), ('wind', str),
>       ('pressure', float), ('weather', str)]
> a = np.recfromcsv(fp, dtype=dt, converters=convertdict,
>                   usecols=range(0, 11), names=True)
>
> The dtype it generates for a.date is 'object':
>
> array([2007-01-01T00:30+0200, 2007-01-01T01:00+0200, 2007-01-01T01:30+0200,
>       ..., 2007-12-31T23:00+0200, 2007-12-31T23:30+0200,
>       2008-01-01T00:00+0200], dtype=object)
>
> But I need it to be datetime64, like in this example (but including hrs and
> minutes):
>
> array(['2011-07-11', '2011-07-12', '2011-07-13', '2011-07-14',
>       '2011-07-15', '2011-07-16', '2011-07-17'], dtype='datetime64[D]')
>
> It seems that the CSV import creates an embedded object data type for 'date'
> rather than a datetime64 data type.  Any ideas on how to fix this?
>
> Grové
>

Not sure how big your file is, but you might take a look at the
loadtable branch on my numpy fork:
https://github.com/chrisjordansquire/numpy.

It has a function, loadtable, with some docs and tests. It currently
only loads dates, but you could likely modify it to handle datetimes
as well without too much trouble. (Well, it should be pretty simple
once you grok how the function works. Unfortunately it's
somewhat large and complicated, so it might not be what you want if
you just want to load your data quickly and be done with it.)
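
If the file is small, a quick route that avoids both recfromcsv and
loadtable might look like the sketch below. It is only an illustration,
not what loadtable does: it reads the rows with Python's csv module and
builds a structured array whose 'date' field is explicitly
datetime64[m]. The helper name load_rows, the string widths (U8, U16),
and treating 'pressure' as a float are assumptions made for the example.
Empty numeric cells become 0, as in the converters in the question.

```python
import csv
import io

import numpy as np

# Three sample rows copied from the question. A real file would be opened
# with open(path, newline='') instead of wrapped in io.StringIO.
SAMPLE = """\
date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
2007-01-01 00:30,481.9,,,,,481.9,15,SW,1040,Fine
2007-01-01 01:00,471.9,,,,,471.9,15,SW,1040,Fine
2007-01-01 01:30,455.9,,,,,455.9,,,,
"""

STR_COLS = ('wind', 'weather')

# Field widths U8 and U16 are assumptions; widen them if your strings are longer.
ROW_DTYPE = np.dtype([
    ('date', 'datetime64[m]'),
    ('system', 'f8'), ('pumping', 'f8'), ('rgt', 'f8'), ('agt', 'f8'),
    ('sps', 'f8'), ('eskom_import', 'f8'), ('temperature', 'f8'),
    ('wind', 'U8'), ('pressure', 'f8'), ('weather', 'U16'),
])


def load_rows(fileobj):
    """Hypothetical helper: parse CSV text into a structured array."""
    rows = []
    for rec in csv.DictReader(fileobj):
        row = []
        for name in ROW_DTYPE.names:
            cell = rec[name]
            if name == 'date':
                # Use the ISO 'T' separator and minute resolution.
                row.append(np.datetime64(cell.replace(' ', 'T'), 'm'))
            elif name in STR_COLS:
                row.append(cell)
            else:
                # Empty cells become 0, matching float(s or 0) in the question.
                row.append(float(cell or 0))
        rows.append(tuple(row))
    return np.array(rows, dtype=ROW_DTYPE)


a = load_rows(io.StringIO(SAMPLE))
print(a['date'].dtype)
print(a['date'])
```

Since the dtype is given field by field, a['date'] comes back as
datetime64[m] rather than object. For a large file the per-row Python
loop will be slow, so this is really only for the "load it and be done"
case.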

-Chris JS
