[Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy

Pierre GM pgmdevlist at gmail.com
Fri Jul 25 16:47:02 EDT 2008


Francesc, 

Could you clarify a couple of points?

[datetime64]
If I understand correctly, your datetime64 would count time units from the POSIX 
epoch (1970/01/01 00:00:00), right? So

+7d would be 1970/01/08 (7 days after the epoch)
-7W would be 1969/11/13 (7*7 days before the epoch)
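
To make the arithmetic concrete, here is a minimal sketch of that reading using 
only Python's standard datetime module, not the proposed dtype. The helper name 
`from_epoch` and the small resolution table are made up for illustration.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # POSIX epoch

# Days spanned by one unit of each resolution code (hypothetical subset).
DAYS_PER_UNIT = {"d": 1, "W": 7}

def from_epoch(offset, resolution):
    """Return the calendar date `offset` units of `resolution` from the epoch."""
    return EPOCH + timedelta(days=offset * DAYS_PER_UNIT[resolution])

print(from_epoch(7, "d"))    # 1970-01-08
print(from_epoch(-7, "W"))   # 1969-11-13
print(from_epoch(0, "d"))    # 1970-01-01, i.e. offset 0 is the epoch itself
```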

With this approach, a series [1,2,3,7] at a resolution 'd' would correspond to 
1970/01/02, 1970/01/03, 1970/01/04 and 1970/01/08, right?

I'm all for that, **AS LONG AS we have a business day resolution** 'b', so 
that
+7b would be 1970/01/12 (1970/01/01 being a Thursday).
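
A rough sketch of what I mean by a business-day offset, again with the standard 
library only. The function `add_business_days` is hypothetical; it treats 
Monday to Friday as business days, ignores holidays, and counts the epoch as 
offset 0.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # POSIX epoch, a Thursday

def add_business_days(start, n):
    """Step n business days (Mon-Fri) forward from start; holidays are ignored."""
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current

print(add_business_days(EPOCH, 7))  # 1970-01-12
print(add_business_days(EPOCH, 6))  # 1970-01-09
```

If the epoch itself is counted as the first business day instead, the seventh 
one is 1970/01/09.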


[timedelta64]
I like your idea of a timedelta64 being relative, but in that case, why not 
have the same resolutions as datetime64?

[scikits.timeseries]
We can currently perform the following operations in scikits.timeseries:
>>> import scikits.timeseries as ts
>>> series = ts.date_array(['1970-01', '1970-02', '1970-09'], freq='M')
>>> series
DateArray([Jan-1970, Feb-1970, Sep-1970],
          freq='M')
>>> series.asfreq('A')
DateArray([1970, 1970, 1970],
          freq='A-DEC')
>>> series.asfreq('A-MAR')
DateArray([1970, 1970, 1971],
          freq='A-MAR')
"A-MAR" means that year YY ends on 03/31 and that year (YY+1) starts on 04/01.

I use that a lot in my work, when I need to average daily data by water years 
(a water year usually starts on 04/01 and ends the following 03/31).
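
For reference, the grouping rule behind 'A-MAR' can be sketched in plain Python 
as below. This is neither the scikits.timeseries implementation nor the 
proposed dtype; the names `annual_label` and `mean_by_year` are made up, and 
the sample values are invented for illustration.

```python
from collections import defaultdict
from datetime import date

def annual_label(day, end_month=3):
    """Label of the annual period containing `day` when years end in `end_month`."""
    return day.year + 1 if day.month > end_month else day.year

def mean_by_year(samples, end_month=3):
    """Average (date, value) pairs within each annual period."""
    groups = defaultdict(list)
    for day, value in samples:
        groups[annual_label(day, end_month)].append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

months = [date(1970, 1, 1), date(1970, 2, 1), date(1970, 9, 1)]
print([annual_label(d) for d in months])                # [1970, 1970, 1971]
print([annual_label(d, end_month=12) for d in months])  # [1970, 1970, 1970]

# Made-up daily values, for illustration only.
samples = [
    (date(1970, 1, 15), 2.0),
    (date(1970, 3, 31), 4.0),
    (date(1970, 4, 1), 10.0),
    (date(1970, 9, 1), 20.0),
]
print(mean_by_year(samples))  # {1970: 3.0, 1971: 15.0}
```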

How would I do that with datetime64 and timedelta64?


Apart from that, I'd of course be quite happy to help as much as I can.
P.


############################################

On Friday 25 July 2008 07:09:33 Francesc Alted wrote:
> Hi,
>
> Well, as there were no replies to our second proposal for the date/time
> dtype, I assume that everybody agrees with it ;-)  At any rate, we would
> like to proceed with the implementation phase very soon now.
>
> However, it happens that Enthought is sponsoring this job and they
> clearly stated that the implementation should cover the needs of as
> many users as possible.  So, in particular, we would like one of the
> heaviest users of date/time objects, i.e. the TimeSeries authors, to be
> comfortable with the new date/time dtypes, and especially to be able to
> benefit from them.
>
> To this end, we are proposing to decouple the date/time use cases
> into two different groups:
>
> 1. A pure ``datetime`` dtype (absolute or relative) that would be useful
> for timestamping purposes in general (i.e. recording dates that need
> not be evenly spaced in time).
>
> 2. A class based on the ``frequency`` concept that would be useful for
> measurements that are done on a regular basis or in business
> applications.
>
> With this, we prevent the dtype implementation at the core of
> NumPy from being cluttered with the relatively complex needs of the
> ``frequency`` concept users, factoring it out to an external class
> (``Date`` to follow the TimeSeries naming convention).  More
> importantly, this decoupling will also avoid mixing those two
> concepts which, although both are about time measurements, have
> quite different meanings.
>
> Another important advantage of this distinction is that the ``datetime``
> timestamp requires less meta-information to worry about (basically,
> the 'resolution' property), while a ``frequency`` à la TimeSeries will
> need additional meta-information, like the 'start' and 'end' of
> periods, as well as a more complex way to code frequencies (there
> exist many more time periods to be coded, as can be seen in [1]_).
> This can be very important for allowing NumPy data based on the
> ``datetime`` dtype to be quickly saved and retrieved on databases like
> ZODB (object database) or PyTables (HDF5-based database).
>
> Our ultimate goal is that the ``Date`` and ``DateArray`` classes in
> TimeSeries would be rewritten in terms of the new date/time dtype so as
> to take advantage of its features but also to get rid of duplicated
> code.  I honestly think that this can be a big advantage for TimeSeries
> (at the cost of taking some time to do the migration).
>
> Does that approach make sense for people?
>
> .. [1] http://scipy.org/scipy/scikits/wiki/TimeSeries#Frequencies




