[Numpy-discussion] Dates and times and Datetime64 (again)

Jeff Reback jeffreback at gmail.com
Fri Mar 28 16:39:36 EDT 2014


FYI

Here are the pandas docs on time zone handling.

wesm worked through the various issues w.r.t. conversion, localization, and
ambiguous zone crossings.

http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-zone-handling

The implementation is largely in here (the underlying implementation is a
datetime64[ns] dtype with a pytz object as the timezone):

https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py
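
To make that storage model concrete, here is a minimal sketch in plain Python. It is not pandas code: instants are kept as UTC nanoseconds (standing in for the datetime64[ns] values) and a single tzinfo object is carried alongside them. The class name TzTaggedTimes and its methods are hypothetical, and a fixed-offset timezone is used only so the example runs without pytz.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class TzTaggedTimes:
    """Hypothetical container: UTC instants in nanoseconds plus one tzinfo."""

    def __init__(self, utc_ns, tz):
        self.utc_ns = list(utc_ns)  # stand-in for a datetime64[ns] array
        self.tz = tz                # any tzinfo object (assumed: a pytz zone in practice)

    def tz_convert(self, tz):
        # Changing the zone only swaps the tag; the stored instants are untouched.
        return TzTaggedTimes(self.utc_ns, tz)

    def to_datetimes(self):
        # Wall-clock values are derived on output (truncated to microseconds,
        # the resolution of datetime.datetime).
        return [
            (EPOCH + timedelta(microseconds=ns // 1000)).astimezone(self.tz)
            for ns in self.utc_ns
        ]


instant = datetime(2005, 2, 25, 3, 0, tzinfo=timezone.utc)
utc_ns = (instant - EPOCH) // timedelta(microseconds=1) * 1000

stamps = TzTaggedTimes([utc_ns], timezone.utc)
# Fixed offset of -5 hours, for illustration only.
minus_five = stamps.tz_convert(timezone(timedelta(hours=-5)))
print(minus_five.to_datetimes()[0].isoformat())
```

The point of the design is that conversion between zones is cheap, because only the tag changes; the cost of time zone arithmetic is paid when values are displayed or localized.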



On Fri, Mar 28, 2014 at 4:30 PM, Sankarshan Mudkavi
<smudkavi at uwaterloo.ca> wrote:

>
> Hi Nathaniel,
>
> 1- You give as an example of "naive" datetime handling:
>
> >>> np.datetime64('2005-02-25T03:00Z')
> np.datetime64('2005-02-25T03:00')
>
> This IIUC is incorrect. The Z modifier is a timezone offset, and for
> normal "naive" datetimes would cause an error.
>
>
> From what I understand from reading
> http://thread.gmane.org/gmane.comp.python.numeric.general/53805
> it looks like any TZ adjustment other than Z, 00:00 or UTC would raise an
> error, and those specific markers would not. I'm guessing this is because
> we assume the value is UTC (or the same timezone) internally, so anything
> that explicitly tells us it is UTC is acceptable, although that may just be
> my misreading of it.
>
> However on output we don't use the Z modifier (which is why it's different
> from the UTC datetime64).
>
> I will change it to return an error if what I thought is incorrect and
> also include examples of conversion from datetimes as you requested.
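
The two readings discussed above (reject every timezone designator, or accept only designators that mean UTC) can be sketched with the standard library. This is an illustration of the policy only, not NumPy's parser; the function name parse_datetime and the flag allow_utc_marker are hypothetical, and only the minute-resolution format used in the example is handled.

```python
import re
from datetime import datetime

# Date and time to the minute, optionally followed by a timezone designator.
_ISO = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2})\s*(Z|UTC|[+-]\d{2}:?\d{2})?$")
_UTC_MARKERS = {"Z", "UTC", "+00:00", "+0000"}


def parse_datetime(text, allow_utc_marker=False):
    """Parse to a naive datetime, rejecting designators the policy forbids."""
    match = _ISO.match(text)
    if match is None:
        raise ValueError(f"unsupported datetime string: {text!r}")
    naive_part, designator = match.groups()
    if designator is not None:
        # Strict reading: any designator is an error.
        # Lenient reading: only explicit UTC markers are tolerated.
        if not (allow_utc_marker and designator in _UTC_MARKERS):
            raise ValueError(f"timezone designator {designator!r} not accepted")
    return datetime.strptime(naive_part, "%Y-%m-%dT%H:%M")


print(parse_datetime("2005-02-25T03:00"))
print(parse_datetime("2005-02-25T03:00Z", allow_utc_marker=True))
```

Under either reading a string such as '2005-02-25T03:00+05:30' raises, so a non-UTC offset is never silently dropped.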
>
> Please let me know if there are any more changes that are required! I look
> forward to further comments/questions.
>
> Cheers,
> Sankarshan
>
> On Fri, Mar 28, 2014 at 5:17 AM, Nathaniel Smith <njs at pobox.com> wrote:
>
> On 28 Mar 2014 05:00, "Sankarshan Mudkavi" <smudkavi at uwaterloo.ca> wrote:
> >
> > Hi all,
> >
> > Apologies for the delay in following up, here is an expanded version of
> > the proposal, which hopefully clears up most of the details. I have not
> > included specific implementation details for the code, such as which
> > functions to modify etc. since I think those are not traditionally included
> > in NEPs?
>
> The format seems fine to me. Really the point is just to have a document
> that we can use as reference when deciding on behaviour, and this does that
> :-).
>
> Three quick comments:
>
> 1- You give as an example of "naive" datetime handling:
>
> >>> np.datetime64('2005-02-25T03:00Z')
> np.datetime64('2005-02-25T03:00')
>
> This IIUC is incorrect. The Z modifier is a timezone offset, and for
> normal "naive" datetimes would cause an error.
>
> 2- It would be good to include explicitly examples of conversion to and
> from datetimes alongside the examples of conversions to and from strings.
>
> 3- It would be good to (eventually) include some discussion of the impact
> of the preferred proposal on existing code. E.g., will this break a lot of
> people's pipelines? (Are people currently *always* adding timezones to
> their numpy input to avoid the problem, and now will have to switch to the
> opposite behaviour depending on numpy version?) And we'll want to make sure
> to get feedback from the pydata@ (pandas) list explicitly, though that
> can wait until people here have had a chance to respond to the first draft.
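
As an illustration of the conversions requested in point 2, here is a short sketch of going between naive datetime.datetime objects and datetime64. It shows what recent NumPy releases do and is not taken from the draft NEP; behaviour in the 2014 releases under discussion may differ, and timezone-aware inputs are deliberately left out.

```python
import datetime

import numpy as np

naive = datetime.datetime(2005, 2, 25, 3, 0)

# datetime -> datetime64: a naive datetime is stored as-is (microsecond unit).
as_dt64 = np.datetime64(naive)

# datetime64 -> datetime: item() returns a naive datetime.datetime for
# units from hours down to microseconds.
round_trip = as_dt64.item()

# string -> datetime64 -> datetime, with no offset in the string.
from_string = np.datetime64("2005-02-25T03:00").item()

print(as_dt64.dtype, round_trip, from_string.tzinfo)
```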
>
> Thanks for pushing this forward!
> -n
>
> Hi all,
>
> Apologies for the delay in following up, here is an expanded version of
> the proposal, which hopefully clears up most of the details. I have not
> included specific implementation details for the code, such as which
> functions to modify etc. since I think those are not traditionally included
> in NEPs?
>
> Please find attached the expanded proposal, and the rendered version is
> available here:
>
> https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst
>
> <datetime-improvement-proposal.rst>
>
> I look forward to comments, agreements/disagreements with this (and
> clarification if this needs even further expansion).
>
>
> On Mar 24, 2014, at 12:39 AM, Chris Barker <chris.barker at noaa.gov> wrote:
>
> On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith <njs at pobox.com> wrote:
>
>> On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker <chris.barker at noaa.gov>
>> wrote:
>> > * I think there are more or less three options:
>> >    1)  a) don't have any timezone handling at all -- all datetime64s
>> >           are UTC. Always
>> >        b) don't have any timezone handling at all -- all datetime64s
>> >           are naive
>> >           (the only difference between these two is I/O of strings,
>> >           and maybe I/O of datetime objects with a time zone)
>> >    2) Have a time zone associated with the array -- defaulting to
>> >       either UTC or None, but don't provide any implementation other
>> >       than the tagging, with the ability to add in a TZ handler if you
>> >       want (can this be done efficiently?)
>> >    3) Full on proper TZ handling.
>> >
>> > I think (3) is off the table for now.
>>
>> I think the first goal is to define what a plain vanilla datetime64
>> does, without any extra attributes. This is for two practical reasons:
>> First, our overriding #1 goal is to fix the nasty I/O problems that
>> default datetime64's show, so until that's done any other bells and
>> whistles are a distraction. And second, adding parameters to dtypes
>> right now is technically messy.
>>
>> This rules out (2) and (3).
>>
>
> yup -- though I'm not sure I agree that we need to do this, if we are
> going to do something more later anyway. But you have a key point - maybe
> the dtype system simply isn't ready to do it right, and then it may be
> better not to try.
>
> In which case, we are down to naive or always UTC -- and again, those
> really aren't very different. Though I prefer naive -- always UTC adds some
> complication if you don't actually want UTC, and I'm not sure it actually
> buys us anything. And maybe it's just me, but all my code would need to use
> naive, so I'd be doing a bit of working around to use a UTC-always system.
>
>
>> If we additionally want to keep the option of adding a timezone
>> parameter later, and have the result end up looking like stdlib
>> datetime, then I think 1(b) is the obvious choice. My guess is that
>> this is also what's most compatible with pandas, which is currently
>> keeping its own timezone object outside of the dtype.
>>
>
> Good point, all else being equal, compatibility with pandas would be a
> good thing.
>
>> Any downsides? I guess this would mean that we start raising an error
>> on ISO 8601 strings with offsets attached, which might annoy some people?
>>
>
> yes, but errors are better than incorrect values...
>
>> > Writing this made me think of a third option -- tracking, but no real
>> > manipulation, of TZ. This would be analogous to what ISO 8601 does -- all
>> > it does is note an offset. A given DateTime64 array would have a given
>> > offset assigned to it, and the appropriate addition and subtraction would
>> > happen at I/O. Offset of 0.00 would be UTC, and there would be a None
>> > option for naive.
>
>> Please no! An integer offset is a terrible way to represent timezones,
>>
>
> well, it would solve the problem of being able to read ISO strings, and
> would allow operations on datetimes in multiple time zones, though I
> guess you could get most of that with UTC-always.
>
>
>> and hardcoding this would just get in the way of a proper solution.
>>
>
> well, that's a point -- if we think there is any hope of a proper solution
> down the road, then yes, it would be better not to make that harder.
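
The offset-tagging idea, and the objection to it, can be sketched with the standard library. The assumptions here are that values are stored as UTC, that one hypothetical per-array tag (array_offset) is applied only when formatting, and that New York is used as the example zone (UTC-5 in January, UTC-4 in July).

```python
from datetime import datetime, timedelta, timezone


def format_with_offset(utc_naive, offset_hours):
    """Render a stored UTC value as ISO 8601 with a fixed offset applied at output."""
    tz = timezone(timedelta(hours=offset_hours))
    return utc_naive.replace(tzinfo=timezone.utc).astimezone(tz).isoformat()


# One offset tag for the whole array (hypothetical), here -5 hours.
array_offset = -5
winter = datetime(2014, 1, 15, 17, 0)  # stored as UTC
summer = datetime(2014, 7, 15, 17, 0)  # stored as UTC

winter_text = format_with_offset(winter, array_offset)
summer_text = format_with_offset(summer, array_offset)

# New York observes UTC-4 in July, so the real wall clock there differs
# from what the single -5 tag produces.
summer_local = format_with_offset(summer, -4)

print(winter_text, summer_text, summer_local)
```

A single offset round-trips ISO 8601 strings correctly, but it cannot stand in for a named zone whose offset changes during the year, which is the sense in which it falls short of a proper solution.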
>
> -Chris
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> --
> Sankarshan Mudkavi
> Undergraduate in Physics, University of Waterloo
> www.smudkavi.com