[Numpy-discussion] Dates and times and Datetime64 (again)
Sankarshan Mudkavi
smudkavi at uwaterloo.ca
Wed Mar 19 22:07:13 EDT 2014
On Mar 19, 2014, at 10:01 AM, Dave Hirschfeld <novin01 at gmail.com> wrote:
> Jeff Reback <jeffreback <at> gmail.com> writes:
>
>>
>> Dave,
>>
>> your example is not a problem with numpy per se, rather that the default
>> generation is in local timezone (same as what python datetime does).
>> If you localize to UTC you get the results that you expect.
>>
>
> The problem is that the default datetime generation in *numpy* is in local
> time.
>
> Note that this *is not* the case in Python - it doesn't try to guess the
> timezone info based on where in the world you run the code; if it's not
> provided, it sets it to None.
>
> In [7]: pd.datetime?
> Type: type
> String Form:<type 'datetime.datetime'>
> Docstring:
> datetime(year, month, day[, hour[, minute[, second[,
> microsecond[,tzinfo]]]]])
>
> The year, month and day arguments are required. tzinfo may be None, or an
> instance of a tzinfo subclass. The remaining arguments may be ints or longs.
>
> In [8]: pd.datetime(2000,1,1).tzinfo is None
> Out[8]: True
>
>
> This may be the best solution but as others have pointed out this is more
> difficult to implement and may have other issues.
>
> I don't want to wait for the best solution - assuming UTC on input/output
> when no timezone is specified will solve the problem, and this desperately
> needs to be fixed because it's completely broken as is, IMHO.
>
>
>> If you localize to UTC you get the results that you expect.
>
> That's the whole point - *numpy* needs to localize to UTC, not to whatever
> timezone you happen to be in when running the code.
>
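
To make the distinction concrete, here is a minimal sketch using only the Python standard library (not NumPy's parser). It takes one timezone-less string and converts it to an epoch value twice: once assuming UTC, once assuming the machine's local zone. The variable names are illustrative only.

```python
import calendar
import time
from datetime import datetime, timezone

# A timestamp string with no timezone information attached.
stamp = '2014-03-30 01:00'
fields = time.strptime(stamp, '%Y-%m-%d %H:%M')

# Interpreted as UTC: the result is the same on every machine.
epoch_if_utc = calendar.timegm(fields)

# Interpreted as local time: the result depends on the timezone of the
# machine running the code, and is ambiguous around a DST change.
epoch_if_local = time.mktime(fields)

# Converting the UTC interpretation back gives the original string.
back = datetime.fromtimestamp(epoch_if_utc, tz=timezone.utc)
round_trip = back.strftime('%Y-%m-%d %H:%M')
```

Only the UTC interpretation is guaranteed to round-trip to the input string regardless of where the code runs.
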
> In a real-world data analysis problem you don't start with the data in a
> DataFrame or a numpy array. It comes from the web, a csv, Excel or a
> database, and you want to convert it to a DataFrame or numpy array. So what
> you have from whatever source is a list of tuples of strings, and you want
> to convert them into a typed array.
>
> Obviously you can't localize a string - you have to convert it to a date
> first and if you do that with numpy the date you have is wrong.
>
> In [108]: dst = np.array(['2014-03-30 00:00', '2014-03-30 01:00',
>      ...:                 '2014-03-30 02:00'], dtype='M8[h]')
> ...: dst
> ...:
> Out[108]: array(['2014-03-30T00+0000', '2014-03-30T00+0000',
>        '2014-03-30T02+0100'], dtype='datetime64[h]')
>
> In [109]: dst.tolist()
> Out[109]:
> [datetime.datetime(2014, 3, 30, 0, 0),
> datetime.datetime(2014, 3, 30, 0, 0),
> datetime.datetime(2014, 3, 30, 1, 0)]
>
>
> AFAICS there's no way to get the original dates back once they've passed
> through numpy's parser!?
>
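
For comparison, a small sketch of the same round trip using the standard library's naive datetime rather than NumPy. It assumes the input format shown in the example above; the names `raw`, `parsed` and `round_tripped` are made up for illustration.

```python
from datetime import datetime

# The same three strings used in the example above.
raw = ['2014-03-30 00:00', '2014-03-30 01:00', '2014-03-30 02:00']

# strptime attaches no timezone unless the format asks for one, so the
# wall-clock values are kept whatever the machine's local zone is.
parsed = [datetime.strptime(s, '%Y-%m-%d %H:%M') for s in raw]

# Every result is naive (tzinfo is None).
all_naive = all(d.tzinfo is None for d in parsed)

# Formatting the values back reproduces the input strings.
round_tripped = [d.strftime('%Y-%m-%d %H:%M') for d in parsed]
```

Because no local-time conversion happens on the way in, the three distinct hours stay distinct and the original strings can be recovered.
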
>
> -Dave
>
Hi all,
I've written a rather rudimentary NEP. It is still lacking in technical details, which I will hopefully add after some further discussion and after receiving clarification/help on this thread.
Please let me know how to proceed and what you think should be added to the current proposal (attached to this mail).
Here is a rendered version of the same:
https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst
Cheers,
Sankarshan
--
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com