[Numpy-discussion] NumPy date/time types and the resolution concept

Francesc Alted faltet at pytables.org
Mon Jul 14 14:17:18 EDT 2008


On Monday 14 July 2008, Pierre GM wrote:
> On Monday 14 July 2008 12:50:21 Francesc Alted wrote:
> > > A very useful point that Matt Knox had coded is the possibility
> > > to specify starting points for switching from one resolution to
> > > another. For example, you can have a series with a 'ANN_MAR'
> > > frequency, that corresponds to 1 point a year, the year starting
> > > in April. When switching back to a monthly resolution, the points
> > > from January to March of the first year will be masked.
> >
> > Ok.  Ann was also suggesting that the origin of time would be
> > configurable, but then, you are talking about *masking* values. 
> > Mmm, I don't think we should try to incorporate masking
> > capabilities in the NumPy date/time types.
>
> Francesc,
> In scikits.timeseries, we have 2 different objects:
> * DateArray, which is basically a ndarray of integers with a given
> 'frequency' attribute.
> * TimeSeries, which is basically the combination of a MaskedArray
> (the data part) and a DateArray (which keeps track of the date
> corresponding to each data point). TimeSeries objects have the
> resolution/origin of the companion DateArray, and when they're
> converted from one resolution to another, some masking may occur.
>
> My understanding is that you intend to define an object similar to
> DateArray. You want to define a new dtype (datetime64 or other), we
> used yet another class instead, Date. A dtype would be easier to
> manipulate, but as neither Matt nor I were particularly experienced
> with that at the time, we followed the simpler approach of a class...

Well, that is precisely what we are after: a new dtype.  Once it is 
integrated in NumPy, I suppose that your DateArray would be similar 
to a NumPy array with a ``datetime64`` dtype (bar the conceptual 
differences between the 'frequency' behind your DateArray and 
the 'resolution' behind the datetime64 dtype).

>
> > [N]timeunit
> >
> > where ``timeunit`` can take the values in:
> >
> > ['y', 'm', 'd', 'h', 'm', 's', 'ms', 'us', 'ns', 'fs']
> >
> > so, for example, '14d' means a resolution of 14 days, or '10ms'
> > means a resolution of one hundredth of a second.  Sounds good to me. 
> > What do other people think?
>
> Sounds pretty cool and intuitive to use. However, writing the
> conversion rules from one to another will be a lot of fun. Take
> weekly, for example: that's a period of 7 days, but when does it
> start? On a Monday? Then, 12/31/2007 was the start of the first
> week of 2008... OK, we can leave that problem for the moment...

It would start wherever the origin says it should start.  It is 
important to note that our proposal will not force a '7d' (seven 
days) 'tick' to start on a Monday, or a '1m' (one month) 'tick' to 
start on the 1st day of a calendar month; it starts wherever the user 
decides to set the origin.
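
To make the origin idea concrete, here is a minimal sketch in plain 
Python (standard library only, not the proposed NumPy implementation). 
The helper names ``parse_resolution`` and ``tick_index`` are 
hypothetical, and the sketch assumes fixed-length units only: 'y' 
and 'm' are left out because calendar years and months vary in length.

```python
import re
from datetime import datetime, timedelta

# Fixed-length units only (an assumption of this sketch).
_UNIT = {
    'd': timedelta(days=1),
    'h': timedelta(hours=1),
    's': timedelta(seconds=1),
    'ms': timedelta(milliseconds=1),
    'us': timedelta(microseconds=1),
}


def parse_resolution(spec):
    """Turn a '[N]timeunit' string such as '14d' or '10ms' into a timedelta."""
    match = re.fullmatch(r'(\d*)(d|h|s|ms|us)', spec)
    if match is None:
        raise ValueError('unsupported resolution: %r' % (spec,))
    count = int(match.group(1) or 1)  # a missing N means 1
    if count < 1:
        raise ValueError('resolution multiplier must be positive')
    return count * _UNIT[match.group(2)]


def tick_index(when, origin, spec):
    """Number of whole ticks of size `spec` elapsed from `origin` to `when`."""
    return (when - origin) // parse_resolution(spec)


# The same instant lands in a different '7d' tick depending on the origin.
monday_origin = datetime(2007, 12, 31)   # a Monday
tuesday_origin = datetime(2008, 1, 1)    # a Tuesday
when = datetime(2008, 1, 7)
print(tick_index(when, monday_origin, '7d'))
print(tick_index(when, tuesday_origin, '7d'))
```

With the Monday origin, 7 January 2008 opens the second tick (index 1); 
with the Tuesday origin it is still inside the first tick (index 0). 
Nothing in the '7d' resolution itself refers to the calendar week.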

Cheers,

-- 
Francesc Alted


