
On Thursday 17 July 2008, Matt Knox wrote:
Maybe you are right, but by providing many resolutions we are trying to cope with the needs of people who use them a lot. In particular, we would like the authors of the timeseries scikit to find in these new dtypes a fair replacement for their Date class (our proposal will not be as full-featured, but...).
I think a basic date/time dtype for numpy would be a nice addition for general usage.
Now as for the timeseries module using this dtype for most of the date-fu that goes on... that would be a bit more challenging. Unless all of the frequencies/resolutions currently supported in the timeseries scikit are supported with the new dtype, it is unlikely we would be able to replace our implementation. In particular, business day frequency (Monday - Friday) is of central importance for working with financial time series (which was my motivation for the original prototype of the module). But using plain integers for the DateArray class actually seems to work pretty well and I'm not sure a whole lot would be gained by using a date dtype.
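To make the "plain integers" idea concrete, here is a minimal sketch of how a business-day frequency (Monday - Friday) can be stored as integer ordinals. This is not the timeseries scikit's actual implementation; the function names and the choice of 1970-01-01 as origin are assumptions made only for illustration.

```python
from datetime import date, timedelta

# Assumed origin for the ordinals (hypothetical choice); 1970-01-01 was a Thursday.
EPOCH = date(1970, 1, 1)


def to_business_ordinal(d):
    """Return the number of business days between EPOCH and the weekday `d`."""
    if d.weekday() >= 5:
        raise ValueError("%s falls on a weekend" % d)
    days = (d - EPOCH).days
    # Shift so that weeks are counted from a Monday, then keep 5 days per week.
    weeks, rem = divmod(days + EPOCH.weekday(), 7)
    return weeks * 5 + rem - EPOCH.weekday()


def from_business_ordinal(n):
    """Inverse of to_business_ordinal: map an integer back to a calendar date."""
    weeks, rem = divmod(n + EPOCH.weekday(), 5)
    return EPOCH + timedelta(days=weeks * 7 + rem - EPOCH.weekday())


friday = date(2008, 7, 18)
monday = date(2008, 7, 21)
# Consecutive business days differ by exactly one, even across a weekend.
step = to_business_ordinal(monday) - to_business_ordinal(friday)
```

With this encoding, adding `n` to an ordinal moves `n` business days forward, which is the property that makes plain integers convenient for financial series.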
Yeah, the business week. We've pondered including this, but we are not sure how such a thing differs from a calendar week as a time unit. I can certainly see its merits in the TimeSeries module, but I'm afraid that it would be nonsense in the context of a general date/time dtype. Now that I think about it, maybe we should reconsider our initial intention of adding a quarter too, because ISO 8601 does not offer a way to print it nicely. We could also opt for extending the ISO 8601 representation to allow the following sort of string representation:

In [35]: array([70, 72, 19], 'datetime64[Q]')
Out[35]: array([1988Q2, 1988Q4, 1975Q3], dtype="datetime64[Q]")

but I don't know if this would unnecessarily complicate things (apart from representing a departure from standards :-/).
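A rough sketch of what such a `YYYYQn` string representation could look like on top of integer quarter offsets. The helper names are hypothetical, and the origin (offset 0 meaning the first quarter of 1970) is an assumption; the message does not say which origin the proposal would use.

```python
def format_quarter(offset, epoch_year=1970):
    """Render an integer quarter offset as 'YYYYQn' (assumes 0 -> Q1 of epoch_year)."""
    years, quarter = divmod(offset, 4)  # floor division also handles negative offsets
    return "%dQ%d" % (epoch_year + years, quarter + 1)


def parse_quarter(text, epoch_year=1970):
    """Inverse of format_quarter: turn 'YYYYQn' back into an integer offset."""
    year_part, quarter_part = text.split("Q")
    quarter = int(quarter_part)
    if not 1 <= quarter <= 4:
        raise ValueError("quarter must be between 1 and 4: %r" % text)
    return (int(year_part) - epoch_year) * 4 + (quarter - 1)


labels = [format_quarter(n) for n in (0, 5, -1)]
```

The format round-trips, but it is not part of ISO 8601, so any parser that accepts it would be going beyond the standard.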
That being said, if someone creates a fork of the timeseries module using a new date dtype at its core and it works amazingly well, then I'd probably get on board. I just think that may be difficult to do with a general purpose date dtype suitable for inclusion in the numpy core.
Yeah, I understand your reasons. In fact, it is a pity that your requirements diverge in some key points from our proposal for the general dtypes. I have had a look at how you have integrated recarrays in your TimeSeries module, and I'm sure that by choosing a date/time dtype you would be able to reduce the complexity (and especially improve the efficiency) of your code quite a bit.

Cheers,

--
Francesc Alted