[Numpy-discussion] The date/time dtype and the casting issue

Francesc Alted faltet at pytables.org
Tue Jul 29 09:12:52 EDT 2008


Hi,

While drafting the date/time proposals and discussing them on this
list, we have changed our minds a couple of times about how casting
should work between the different date/time types and the different
time units (previously called resolutions).  So I'd like to lay out
the issue in detail here and make yet another proposal, in order to
gather feedback from the community before consolidating it in the
final date/time proposal.

Casting proposal for date/time types
====================================

The operations among the proposed date/time types can be divided into
three groups:

* Absolute time versus relative time

* Absolute time versus absolute time

* Relative time versus relative time

Now, here are our considerations for each case:

Absolute time versus relative time
----------------------------------

We think that in this case the absolute time should take priority in
determining the time unit of the result.  That is what people will
want most of the time.  For example, this would allow:

>>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'], 
dtype='datetime64[D]')
>>> series2 = series + numpy.timedelta(2, 'Y')  # Add 2 relative years
>>> series2
array(['1972-01-01', '1972-02-01', '1972-09-01'],
dtype='datetime64[D]')  # the 'D'ay time unit has been chosen
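
To make the intended semantics concrete, here is a minimal sketch in
plain Python (standard library only, not the proposed NumPy API).  The
helper name ``add_relative_years`` is hypothetical, and the sketch
assumes that adding relative years means shifting the calendar year
while keeping the day resolution of the absolute operand:

```python
from datetime import date

def add_relative_years(days, years):
    """Shift each day-resolution date by a whole number of calendar years."""
    # Assumption: Feb 29 inputs are out of scope for this sketch;
    # date.replace() would raise ValueError for a non-leap target year.
    return [d.replace(year=d.year + years) for d in days]

series = [date(1970, 1, 1), date(1970, 2, 1), date(1970, 9, 1)]
series2 = add_relative_years(series, 2)  # add 2 relative years
print([d.isoformat() for d in series2])
# ['1972-01-01', '1972-02-01', '1972-09-01']
```

The result is still expressed in days, which is the point of letting
the absolute operand decide the unit.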

Absolute time versus absolute time
----------------------------------

When operating on two absolute times with different time units
(basically, only subtraction will be allowed), we propose raising an
exception.  This is because the ranges and timespans of the different
time units can be very different, and it is not at all clear which
time unit the user would prefer.  For example, this should be allowed:

>>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[Y]")
array([1, 1, 1], dtype="timedelta64[Y]")

But this should not:

>>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[ns]")
raise numpy.IncompatibleUnitError  # what unit to choose?

Relative time versus relative time
----------------------------------

This case would be the same as the previous one (absolute vs
absolute).  Our proposal is to forbid the operation if the time units
of the operands differ.  For example, this should be allowed:

>>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Y]")
array([4, 4, 4], dtype="timedelta64[Y]")

But this should not:

>>> numpy.ones(3, dtype="t8[Y]") + numpy.zeros(3, dtype="t8[fs]")
raise numpy.IncompatibleUnitError  # what unit to choose?
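
The "same unit or raise" rule can be sketched with a toy class in
plain Python.  Everything here is illustrative: ``RelTime`` is a
hypothetical stand-in for a relative time scalar, and
``IncompatibleUnitError`` only borrows the name used in the examples
above:

```python
class IncompatibleUnitError(TypeError):
    """Hypothetical error for operands whose time units differ."""

class RelTime:
    """Toy relative time: an integer count plus a unit string."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        if not isinstance(other, RelTime):
            return NotImplemented
        # Refuse to guess a common unit: the caller must convert first.
        if self.unit != other.unit:
            raise IncompatibleUnitError(
                f"cannot add [{self.unit}] and [{other.unit}]")
        return RelTime(self.value + other.value, self.unit)

total = RelTime(1, "Y") + RelTime(3, "Y")
print(total.value, total.unit)  # 4 Y

try:
    RelTime(1, "Y") + RelTime(0, "fs")
except IncompatibleUnitError as exc:
    print("rejected:", exc)
```

The same check would apply to the subtraction of two absolute times.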

Introducing a time casting function
-----------------------------------

Since forbidding operations between absolute/absolute and
relative/relative types with different units can be unacceptable in
many situations, we propose an explicit casting mechanism so that the
user can specify the desired time unit of the result.  For this, a new
NumPy function called, say, ``numpy.change_unit()`` (the name is only
for the purposes of this discussion and can be changed) will be
provided.  The signature of the function will be:

change_unit(time_object, new_unit, reference)

where 'time_object' is the time object whose unit is to be
changed, 'new_unit' is the desired new time unit, and 'reference' is an
absolute date used to convert relative times when the original unit
spans an uncertain number of the smaller units (relative years or
months cannot be expressed in days without knowing where they start).
For example, that would allow:

>>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' )
array([365, 730], dtype="datetime64[d]")

or:

>>> ref = numpy.datetime64('1971', 'T[Y]')
>>> numpy.change_unit( numpy.array([1,2], 't[Y]'), 't[d]',  ref )
array([365, 731], dtype="timedelta64[d]")
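
Why a reference date matters can be checked with the standard library.
The sketch below is not the proposed ``change_unit()``; the helper
``years_to_days`` is hypothetical, and it assumes that a relative time
of N years converts to the cumulative number of days between January
1st of the reference year and January 1st N years later:

```python
from datetime import date

def years_to_days(years, reference_year=1970):
    """Convert whole-year offsets to day counts from a reference year."""
    start = date(reference_year, 1, 1)
    # Leap years inside the span change the answer, hence the reference.
    return [(date(reference_year + n, 1, 1) - start).days for n in years]

print(years_to_days([1, 2]))        # [365, 730]  (1970 and 1971 are not leap)
print(years_to_days([1, 2], 1971))  # [365, 731]  (1972 is a leap year)
```

The same two-year offset is 730 or 731 days depending on the starting
point, which is the uncertainty the 'reference' parameter resolves.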

Note: we decided against using the ``.astype()`` method because the
additional 'reference' parameter would look strange in other typical
uses of ``.astype()``.

Opinions?

-- 
Francesc Alted


