[Numpy-discussion] Default unit for datetime/timedelta
Mark Wiebe
mwwiebe at gmail.com
Wed Jun 8 20:22:07 EDT 2011
On Wed, Jun 8, 2011 at 6:31 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>
> On Jun 9, 2011, at 1:10 AM, Mark Wiebe wrote:
> >
> > > >>> np.timedelta64(10, 's') + 10
> > > numpy.timedelta64(20,'s')
> >
> > Here, the unit is defined: 's'
> >
> > For the first operand, the inconsistency is with the second. Here's the
> > reasoning I didn't spell out:
> > We're adding a timedelta + int, so let's convert 10 into a timedelta. No
> > units specified, so it's 10 microseconds, so we add 10 seconds and
> > 10 microseconds, not 10 seconds and 10 seconds.
>
> Ah OK. I think that your approach of taking the defined unit (here, s) as
> unit of the undefined term (here, 10) is by far the best.
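
To make the two readings of `timedelta + int` concrete, here is a minimal sketch that spells out the unit of the bare `10` explicitly instead of relying on any implicit conversion. The variable names are illustrative only, and the sketch assumes a NumPy release where `np.timedelta64(value, unit)` is available.

```python
import numpy as np

ten_seconds = np.timedelta64(10, 's')

# Reading 1: the bare 10 inherits the unit of the other operand (seconds).
same_unit = ten_seconds + np.timedelta64(10, 's')

# Reading 2: the bare 10 is taken in the default unit, microseconds.
# NumPy promotes the sum to the finer of the two units.
default_us = ten_seconds + np.timedelta64(10, 'us')

print(same_unit, same_unit.dtype)
print(default_us, default_us.dtype)
```

The first sum is 20 seconds; the second is 10 seconds plus 10 microseconds, which is the surprising result the reasoning above describes.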
>
> > > OK, here it is not. But the result makes sense... Up to a certain point.
> > > If you try to guess the unit from a date given as a string, what happens
> > > in case of ambiguities? Or do you restrict an input string to be strictly
> > > ISO8601 to remove those?
> >
> > Yeah, I'm restricting the string to be (almost) strictly ISO8601. For
> > supporting other formats, I think creating a 'fancy_date_parser' function
> > or something like that would be better than having all those date string
> > format ambiguities in the core type.
>
> Quite OK. But this 'fancy_date_parser' will likely crash at some point if
> the unit cannot be guessed properly. But you're right, that's not the issue
> here.
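
As a rough illustration of why such a parser has to fail on some inputs, here is a hypothetical `fancy_date_parser` built on the standard library. Nothing like it is proposed in this thread beyond the name; the format list and the error behaviour are assumptions made for the sketch.

```python
from datetime import datetime

# Assumed candidate formats; a real parser would need a much longer list.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y")


def fancy_date_parser(text, formats=CANDIDATE_FORMATS):
    """Hypothetical helper: parse text, refusing to guess when formats disagree."""
    results = set()
    for fmt in formats:
        try:
            results.add(datetime.strptime(text, fmt))
        except ValueError:
            continue  # this format does not match the input
    if not results:
        raise ValueError(f"unrecognized date string: {text!r}")
    if len(results) > 1:
        raise ValueError(f"ambiguous date string: {text!r}")
    return results.pop()


print(fancy_date_parser("2011-03-12"))  # ISO-style input matches one format
print(fancy_date_parser("13/04/2011"))  # only day-first is valid (no month 13)
try:
    fancy_date_parser("03/04/2011")     # 3 April or 4 March?
except ValueError as exc:
    print(exc)
```

Keeping this kind of guessing in a separate function leaves the core type with a single, unambiguous string format.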
>
>
> >
> > > I'd like to make 'M8' and 'm8' be datetime data types with generic time
> > > units instead of microseconds as they are currently. This would also
> > > allow the possibility of extending the behavior of detecting the unit
> > > from the input string as:
> > >
> > > >>> np.datetime64('2011-03-12T13')
> > > numpy.datetime64('2011-03-12T13-0600','h')
> > >
> > > to also work with arrays, which currently work like this:
> > >
> > > >>> np.array(['2011-03-12T13', '2012'], dtype='M8')
> > > array(['2011-03-12T13:00:00.000000-0600',
> > >        '2011-12-31T18:00:00.000000-0600'], dtype='datetime64[us]')
> >
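
For comparison, a small sketch of unit detection with a generic `'M8'` dtype. This reflects how released NumPy versions behave as far as I know, not necessarily the development build quoted above, and the printed representation (in particular the local-time offset) differs between versions.

```python
import numpy as np

# Scalars: the unit is inferred from how much of the ISO 8601 string is given.
hour_stamp = np.datetime64('2011-03-12T13')
year_stamp = np.datetime64('2012')
print(np.datetime_data(hour_stamp.dtype))
print(np.datetime_data(year_stamp.dtype))

# Arrays with a generic 'M8' dtype: one unit is chosen for the whole array,
# fine enough to represent every input string.
mixed = np.array(['2011-03-12T13', '2012'], dtype='M8')
print(mixed.dtype)
```

`np.datetime_data` returns a `(unit, count)` tuple, which is a convenient way to inspect the detected unit without depending on the printed form.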
> > Why is the second one not '2012-01-01T00:00:00-0600' ?
> >
> > This is because dates are stored at midnight UTC, and when converted to
> > local time for the default time-based printing, that changes slightly.
> > ISO8601 specifies to interpret an input in local time if no "Z" or
> > timezone offset is given, so that's why the first one matches. I haven't
> > been able to think of a way around it other than putting warnings in the
> > documentation, and have made 'today' and 'now' throw errors if you try to
> > use them as times or dates respectively.
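
The shift from 2012-01-01 to 2011-12-31T18:00 can be reproduced with the standard library alone. This is a sketch of the arithmetic, not of NumPy's internals; the fixed UTC-06:00 offset is an assumption standing in for the local zone of the machine that produced the output above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-06:00 offset stands in for the machine's local zone.
local_zone = timezone(timedelta(hours=-6))

# The date '2012' is stored as midnight UTC on 2012-01-01.
stored = datetime(2012, 1, 1, tzinfo=timezone.utc)

# Printing in local time moves the clock back six hours, into the previous day.
shown = stored.astimezone(local_zone)
print(shown.isoformat())
```

The stored instant is unchanged; only its local-time rendering lands on the previous calendar day.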
>
> I see the logic, but I don't like it at all. I would expect the date to be
> stored in the local time zone by default (that is, if no other time zone
> info is available).
>
It's not satisfying to me either, but I haven't been able to think of a
solution I like. The idea of converting from 'D' to 's' metadata depending
on the timezone setting of your computer feels worse to me than the current
approach, but if someone has an idea that's better I'm all ears.
-Mark