[Numpy-discussion] Dates and times and Datetime64 (again)

Sankarshan Mudkavi smudkavi at uwaterloo.ca
Wed Mar 19 22:07:13 EDT 2014


On Mar 19, 2014, at 10:01 AM, Dave Hirschfeld <novin01 at gmail.com> wrote:

> Jeff Reback <jeffreback <at> gmail.com> writes:
> 
>> 
>> Dave,
>> 
>> your example is not a problem with numpy per se, rather that the default 
>> generation is in local timezone (same as what python datetime does).
>> If you localize to UTC you get the results that you expect. 
>> 
> 
> The problem is that the default datetime generation in *numpy* is in local 
> time.
> 
> Note that this *is not* the case in Python - it doesn't try to guess the 
> timezone info based on where in the world you run the code, if it's not 
> provided it sets it to None.
> 
> In [7]: pd.datetime?
> Type:       type
> String Form:<type 'datetime.datetime'>
> Docstring:
> datetime(year, month, day[, hour[, minute[, second[, microsecond[,tzinfo]]]]])
> 
> The year, month and day arguments are required. tzinfo may be None, or an
> instance of a tzinfo subclass. The remaining arguments may be ints or longs.
> 
> In [8]: pd.datetime(2000,1,1).tzinfo is None
> Out[8]: True
> 
> 
> This may be the best solution but as others have pointed out this is more 
> difficult to implement and may have other issues.
> 
> I don't want to wait for the best solution - assuming UTC on input/output 
> when no timezone is specified will solve the problem, and this desperately 
> needs to be fixed because it's completely broken as is IMHO.
> 
> 
>> If you localize to UTC you get the results that you expect. 
> 
> That's the whole point - *numpy* needs to localize to UTC, not to whatever 
> timezone you happen to be in when running the code. 
> 
> In a real-world data analysis problem you don't start with the data in a 
> DataFrame or a numpy array; it comes from the web, a csv, Excel or a database 
> and you want to convert it to a DataFrame or numpy array. So what you have 
> from whatever source is a list of tuples of strings, and you want to convert 
> them into a typed array.
> 
> Obviously you can't localize a string - you have to convert it to a date 
> first and if you do that with numpy the date you have is wrong. 
> 
> In [108]: dst = np.array(['2014-03-30 00:00', '2014-03-30 01:00', '2014-03-30 02:00'], dtype='M8[h]')
>     ...: dst
>     ...: 
> Out[108]: array(['2014-03-30T00+0000', '2014-03-30T00+0000', '2014-03-30T02+0100'], dtype='datetime64[h]')
> 
> In [109]: dst.tolist()
> Out[109]: 
> [datetime.datetime(2014, 3, 30, 0, 0),
> datetime.datetime(2014, 3, 30, 0, 0),
> datetime.datetime(2014, 3, 30, 1, 0)]
> 
> 
> AFAICS there's no way to get the original dates back once they've passed 
> through numpy's parser!?
> 
> 
> -Dave
> 


Hi all,

I've written a rather rudimentary NEP. It is lacking in technical details, which I will hopefully add after some further discussion and after receiving clarification/help on this thread.

Please let me know how to proceed and what you think should be added to the current proposal (attached to this mail).

Here is a rendered version of the same:
https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst
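
As a rough illustration of the "assume UTC on input if not specified" behaviour that Dave asks for above, here is a minimal pure-Python sketch. It does not use numpy and is not taken from the proposal; the helper name parse_assume_utc and the two accepted string formats are assumptions made only for this example.

```python
from datetime import datetime, timezone


def parse_assume_utc(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' with an optional offset.

    A string without a UTC offset is interpreted as UTC, never as the
    local time of the machine running the code.
    """
    try:
        # Strings with an explicit offset, e.g. '2014-03-30 02:00+0100'.
        parsed = datetime.strptime(text, "%Y-%m-%d %H:%M%z")
    except ValueError:
        # No offset given: assume UTC instead of guessing from the locale.
        naive = datetime.strptime(text, "%Y-%m-%d %H:%M")
        parsed = naive.replace(tzinfo=timezone.utc)
    # Normalise everything to UTC so values are comparable.
    return parsed.astimezone(timezone.utc)


# The same three strings as in Dave's example.
raw = ["2014-03-30 00:00", "2014-03-30 01:00", "2014-03-30 02:00"]
parsed = [parse_assume_utc(s) for s in raw]

# Formatting the parsed values gives back the original strings,
# regardless of the timezone the code is run in.
round_tripped = [d.strftime("%Y-%m-%d %H:%M") for d in parsed]
```

With these semantics the three input strings stay distinct and survive a round trip, which is the property Dave's example shows being lost.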

Cheers,
Sankarshan

-- 
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com







