[Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy

Francesc Alted faltet at pytables.org
Mon Jul 14 07:58:10 EDT 2008


On Saturday 12 July 2008, Matt Knox wrote:
> Christopher Barker <Chris.Barker <at> noaa.gov> writes:
> >> I'm also imagining some extra utility functions/methods that would
> >> be nice:
> >>
> >> aDateTimeArray.hours(dtype=float)
> >>
> >> to convert to hours (and days, and seconds, etc). And maybe some
> >> that would create a DateTimeArray from various time units.
>
> The DateArray class in the timeseries scikits can do part of what you
> want. Observe...
>
> >>> import scikits.timeseries as ts
> >>> a = ts.date_array(start_date=ts.now('hourly'), length=15)
> >>> a
>
> DateArray([12-Jul-2008 11:00, 12-Jul-2008 12:00, 12-Jul-2008 13:00,
>        12-Jul-2008 14:00, 12-Jul-2008 15:00, 12-Jul-2008 16:00,
>        12-Jul-2008 17:00, 12-Jul-2008 18:00, 12-Jul-2008 19:00,
>        12-Jul-2008 20:00, 12-Jul-2008 21:00, 12-Jul-2008 22:00,
>        12-Jul-2008 23:00, 13-Jul-2008 00:00, 13-Jul-2008 01:00],
>           freq='H')

Mmh, I very much like your notion of 'frequency' as meta-information of 
your DateArray class.  I was in fact thinking of something similar for 
the (more general) date/time types in NumPy, but based on the notion 
of 'resolution' instead of 'frequency'.  I'll expand on this in 
our next proposal.
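
To make the 'resolution' idea concrete, here is a minimal sketch.  It 
assumes the ``datetime64``/``timedelta64`` types as they exist in 
present-day NumPy, which postdate this message, so it only illustrates 
the concept and is not the API being proposed here:

```python
import numpy as np

# The same instant stored at two different resolutions (units).
t_hour = np.datetime64('2008-07-12T11:00', 'h')
t_sec = t_hour.astype('datetime64[s]')

# The resolution lives in the dtype, not in the individual values:
# adding hour-resolution offsets yields an hour-resolution array.
offsets = np.arange(15).astype('timedelta64[h]')
hours = t_hour + offsets

print(hours.dtype)   # the unit is carried by the dtype
print(hours[-1])     # last timestamp of the 15-hour range
```

Here the unit plays a role similar to ``freq='H'`` in the DateArray 
above, except that it describes the precision of the stored values 
rather than the spacing between them.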

>
> >>> a.year
>
> array([2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
> 2008, 2008, 2008, 2008, 2008])
>
> >>> a.hour
>
> array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,  0,  1])
>
> >>> a.day
>
> array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13])
>

Well, while I see the merits of the '.year', '.hour' and similar 
properties, I'm not sure whether they would be useful for a general 
date/time type.  I'd prefer what Chris suggested before, i.e. 
something like:

a.hours(dtype=float)

to convert to hours (and days, and seconds, etc).
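
As a rough sketch of what such a conversion could look like, the 
snippet below uses a free function instead of a method.  The name 
``hours`` and its ``dtype`` argument are hypothetical, taken from the 
suggestion above, and the ``timedelta64`` arithmetic is that of 
present-day NumPy, not of the types under discussion:

```python
import numpy as np

def hours(deltas, dtype=float):
    """Hypothetical helper: express elapsed times as a number of hours."""
    # Dividing by a one-hour timedelta gives a plain float array.
    return (deltas / np.timedelta64(1, 'h')).astype(dtype)

start = np.datetime64('2008-07-12T11:00')
# Timestamps 0, 30, 90 and 150 minutes after the start.
stamps = start + np.array([0, 30, 90, 150]).astype('timedelta64[m]')

elapsed = hours(stamps - start)          # fractional hours as floats
whole = hours(stamps - start, dtype=int) # truncated to whole hours
```

The same pattern would cover days, seconds and so on by changing the 
unit of the divisor.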

> I would encourage you to take a look at the wiki
> (http://scipy.org/scipy/scikits/wiki/TimeSeries) as you may find some
> surprises in there that prove useful.

I've had a look at it, and it is clear that you guys have put a lot of 
thought into it.  We will be sure to keep your implementation in mind.

> >> I often have to read/write data files that have time in various
> >> units like that -- it would be nice to use array operations to
> >> work with them.
>
> If peak performance is not a concern, parsing of most date formats
> can be done automatically using the built in parser in the timeseries
> module (borrowed from mx.DateTime). Observe...
>
> >>> dlist = ['14-jan-2001 14:34:33', '16-jan-2001 10:09:11']
> >>> a = ts.date_array(dlist, freq='secondly')
> >>> a
>
> DateArray([14-Jan-2001 14:34:33, 16-Jan-2001 10:09:11],
>           freq='S')

That's great.  However, we only planned to import/export dates from the 
``datetime`` module for the time being, mainly for efficiency but 
also for simplicity.  Would many people be interested in seeing this kind 
of string date parsing integrated in the native NumPy types?
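
For reference, a round trip through the ``datetime`` module could look 
roughly like the sketch below.  It relies on the ``datetime64`` dtype 
of present-day NumPy, so take it as an illustration of the idea and 
not as the interface planned here:

```python
import datetime as dt
import numpy as np

# The two dates from the parsing example, built by hand instead of
# being parsed from strings.
py_dates = [
    dt.datetime(2001, 1, 14, 14, 34, 33),
    dt.datetime(2001, 1, 16, 10, 9, 11),
]

# Import: datetime objects -> native array with one-second resolution.
arr = np.array(py_dates, dtype='datetime64[s]')

# Export: native array -> list of datetime.datetime objects.
back = arr.tolist()

# Array arithmetic works on the native type.
gap = arr[1] - arr[0]
```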

Thanks,

-- 
Francesc Alted


