[Numpy-discussion] The date/time dtype and the casting issue

Ivan Vilata i Balaguer ivan at selidor.net
Wed Jul 30 04:06:58 EDT 2008


Pierre GM (on 2008-07-29 at 15:47:52 -0400) wrote::

> On Tuesday 29 July 2008 15:14:13 Ivan Vilata i Balaguer wrote:
> > Pierre GM (on 2008-07-29 at 12:38:19 -0400) wrote::
> > > > Relative time versus relative time
> > > > ----------------------------------
> > > >
> > > > This case would be the same as the previous one (absolute vs
> > > > absolute).  Our proposal is to forbid this operation if the time units
> > > > of the operands are different.
> > >
> > > Mmh, less sure on this one. Can't we use a hierarchy of time units, and
> > > force the result to the lowest one?
> > >
> > > For example:
> > > >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]")
> > > array([15,15,15], dtype="t8['M']")
> > >
> > > I agree that adding ns to years makes no sense, but ns to s? min to
> > > hr or days?  In short: systematically raising an exception looks a
> > > bit too drastic. There are some simple unambiguous cases that should be
> > > allowed (Y+M, Y+Q, M+Q, H+D...)
> >
> > Do you mean using the most precise unit for operations on different
> > but "near enough" units?  I see the point, but what makes me hesitate
> > is that it gives the user the false impression that the most precise
> > unit is *always* expected.  I'd rather spare the user as many surprises
> > as possible, by simplifying rules in favour of explicitness (but that
> > may be debated).
> 
> Let me rephrase:
> Adding different relative time units should be allowed when there's no 
> ambiguity on the output:
> For example, a relative year timedelta is always 12 month timedeltas, or 4 
> quarter timedeltas. In that case, I should be able to do:
> 
> >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]")
> array([15,15,15], dtype="t8['M']")
> >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Q]")
> array([7,7,7], dtype="t8['Q']")
> 
> Similarly:
> * an hour is always 3600s, so I could add relative s/ms/us/ns timedeltas to 
> hour timedeltas, and get the result in s/ms/us/ns.
> * A day is always 24h, so I could add relative hour and day timedeltas and
> get an hour timedelta
> * A week is always 7d, so W+D -> D 
> 
> However:
> * We can't tell beforehand how many days are in any month, so adding relative
> days and months would raise an exception.
> * Same thing with weeks and months/quarters/years
> 
> There'll be only a limited number of time units, therefore a limited number of 
> potential combinations between time units. It'd be just a matter of listing 
> which ones are allowed and which ones will raise an exception.
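
The listing described above can be sketched in plain Python.  This is
only an illustration under stated assumptions: the names ``UNITS`` and
``add_deltas`` are hypothetical (they belong neither to NumPy nor to the
proposal), and the split into two unit families (calendar units Y/Q/M
and fixed-length units W/D/h/m/s/ms/us/ns) is just one reading of the
examples in the quoted text.

```python
# Hypothetical sketch: each unit has a family and a size expressed in
# the finest unit of that family (months, or nanoseconds).
UNITS = {
    "Y": ("calendar", 12),
    "Q": ("calendar", 3),
    "M": ("calendar", 1),
    "W": ("fixed", 7 * 24 * 3600 * 10**9),
    "D": ("fixed", 24 * 3600 * 10**9),
    "h": ("fixed", 3600 * 10**9),
    "m": ("fixed", 60 * 10**9),
    "s": ("fixed", 10**9),
    "ms": ("fixed", 10**6),
    "us": ("fixed", 10**3),
    "ns": ("fixed", 1),
}


def add_deltas(value_a, unit_a, value_b, unit_b):
    """Add two relative times; return (value, unit) in the finer unit."""
    family_a, size_a = UNITS[unit_a]
    family_b, size_b = UNITS[unit_b]
    if family_a != family_b:
        # E.g. days + months: the ratio depends on which month is meant.
        raise TypeError(
            f"cannot combine {unit_a!r} and {unit_b!r}: no fixed ratio"
        )
    # The result keeps the precision of the finer (smaller) unit.
    unit, size = (unit_a, size_a) if size_a <= size_b else (unit_b, size_b)
    total = value_a * size_a + value_b * size_b
    return total // size, unit


print(add_deltas(1, "Y", 3, "M"))  # (15, 'M')
print(add_deltas(1, "Y", 3, "Q"))  # (7, 'Q')
print(add_deltas(1, "W", 2, "D"))  # (9, 'D')
```

Note that this sketch also accepts distant pairs inside one family, such
as W+ns; whether those should be allowed is left open here.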

That's "keep the precision" over "keep the range".  At first Francesc
and I opted for "keep the range" because that's what NumPy does, e.g.
when combining an int64 with a uint64.  Then, since we weren't sure
which choice would be best for the majority of users, we decided to let
(or force) the user to be explicit.  However, the use of time units and
integer values is precisely intended to "keep the precision", and
overflow won't be so frequent given the correct time unit and the span
of uint64, so you may be right in the end. :)
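
For a rough sense of how much range each unit leaves, here is a
back-of-the-envelope calculation.  It assumes values are stored as
64-bit integers (signed or not, the total span is 2**64 ticks) and uses
the average Gregorian year; the helper name ``span_in_years`` is
hypothetical and only meant for this sketch.

```python
# Length of one tick of each unit, in seconds.
SECONDS_PER_UNIT = {
    "ns": 1e-9,
    "us": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
    "D": 86400.0,
}

# Average Gregorian year (365.2425 days), an assumption of this sketch.
SECONDS_PER_YEAR = 365.2425 * 86400


def span_in_years(unit, bits=64):
    """Total time span, in years, covered by a `bits`-wide tick counter."""
    return 2**bits * SECONDS_PER_UNIT[unit] / SECONDS_PER_YEAR


for unit in SECONDS_PER_UNIT:
    print(f"{unit:>2}: about {span_in_years(unit):.3g} years")
```

With nanoseconds the total span is a bit under 585 years, so overflow is
only a practical concern for the finest units.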

> > > > Note: we refused to use the ``.astype()`` method because of the
> > > > additional 'time_reference' parameter that will sound strange for other
> > > > typical uses of ``.astype()``.
> > >
> > > A method would be really, really helpful, though...
> > > [...]
> >
> > Yeah, but what doesn't seem to fit for me is that the method would only
> > make sense for time values.
> 
> Well, what about a .tounit(new_unit, reference=None)?
> By default, the reference would be None and default to the POSIX epoch.
> We could also go for .totunit (for to time unit)

Yes, that'd be the signature for a method.  The ``reference`` argument
shouldn't be allowed for ``datetime64`` values (absolute times, no
ambiguities) but it should be mandatory for ``timedelta64`` ones.
Sorry, but I can't see the use of having a default reference, unless one
wanted to work with Epoch-based deltas, which looks like an extremely
particular case.  Could you please show me a use case for having a
reference defaulting to the POSIX epoch?
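
To illustrate why the reference matters for relative times, here is a
minimal sketch using only the standard library.  The helpers
``add_months`` and ``months_to_days`` are hypothetical names made up for
this example; they are not the proposed method, just a way to show that
the same delta in months maps to a different number of days depending on
where it starts.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return `start` shifted by `months` calendar months (day clamped)."""
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(start.day, last_day))


def months_to_days(months, reference):
    """Convert a relative span in months to days, counted from `reference`."""
    return (add_months(reference, months) - reference).days


print(months_to_days(1, date(2008, 2, 1)))  # 29 (leap year)
print(months_to_days(1, date(2007, 2, 1)))  # 28
print(months_to_days(1, date(1970, 1, 1)))  # 31 (POSIX epoch as reference)
```

A default reference at the epoch would silently pick the last answer,
which is only right for deltas that really start in January 1970.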

Cheers,

::

  Ivan Vilata i Balaguer   @ Intellectual Monopoly hinders Innovation! @
  http://www.selidor.net/  @     http://www.nosoftwarepatents.com/     @