On Wednesday, June 15, 2011, Mark Wiebe <mwwiebe@gmail.com> wrote:
Towards a reasonable behavior with regard to local times, I've made the default repr for datetimes use the C standard library to print them in a local ISO format. Combined with the ISO8601-prescribed behavior of interpreting datetime strings with no timezone specifier to be in local times, this allows the following cases to behave reasonably:
>>> np.datetime64('now')
numpy.datetime64('2011-06-15T15:16:51-0500','s')
>>> np.datetime64('2011-06-15T18:00')
numpy.datetime64('2011-06-15T18:00-0500','m')

As noted in another thread, there can be some extremely surprising behavior as a consequence:
>>> np.array(['now', '2011-06-15'], dtype='M')
array(['2011-06-15T15:18:26-0500', '2011-06-14T19:00:00-0500'], dtype='datetime64[s]')

Having the 15th of June print out as 7pm on the 14th of June is probably not what one would generally expect, so I've come up with an approach which hopefully deals with this in a good way.
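The arithmetic behind the 7pm result can be reproduced without NumPy. The following is a minimal sketch using only the Python standard library; the helper name render_utc_midnight_in_offset is hypothetical and is not part of NumPy. It assumes that a bare date is stored as midnight UTC and then printed in a fixed UTC-5 offset, which matches the output shown above.

```python
from datetime import datetime, timedelta, timezone


def render_utc_midnight_in_offset(date_str, offset_hours):
    """Treat a bare date as midnight UTC, then print it in a fixed offset.

    Hypothetical helper for illustration only; it mimics the repr behavior
    described in this thread, not NumPy's actual code.
    """
    midnight_utc = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    local = midnight_utc.astimezone(timezone(timedelta(hours=offset_hours)))
    return local.strftime("%Y-%m-%dT%H:%M:%S%z")


# Midnight UTC on June 15 is 7pm on June 14 in a UTC-5 zone.
print(render_utc_midnight_in_offset("2011-06-15", -5))  # 2011-06-14T19:00:00-0500
```

The date itself is stored correctly; only the combination of "date means midnight UTC" and "print in local time" moves it to the previous day.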
One firm principle of the datetime in NumPy is that it is always stored as a POSIX time (referencing UTC), or a TAI time. There are two categories of units that can be used, which I will call date units and time units. The date units are 'Y', 'M', 'W', and 'D', while the time units are 'h', 'm', 's', ..., 'as'. Time zones are only applied to datetimes stored in time units, so there's a qualitative difference between date and time units with respect to string conversions and calendar operations.
I would like to place an 'unsafe' casting barrier between the date units and the time units, so that the above conversion from a date into a datetime will raise an error instead of producing a confusing result. This only applies to datetimes and not timedeltas, because for timedeltas the day <-> hour case is fine; it is only the year/month <-> other units case that has issues, and that is already treated with an 'unsafe' casting barrier.
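The proposed barrier can be written down as a small rule. This is a sketch in plain Python of the rule as described here, not NumPy's casting implementation; the names unit_category and can_cast_datetime_units are hypothetical, and the sketch deliberately ignores every other casting consideration (such as loss of precision within a category).

```python
DATE_UNITS = ("Y", "M", "W", "D")
TIME_UNITS = ("h", "m", "s", "ms", "us", "ns", "ps", "fs", "as")


def unit_category(unit):
    """Return 'date' or 'time' for a datetime unit code (case-sensitive)."""
    if unit in DATE_UNITS:
        return "date"
    if unit in TIME_UNITS:
        return "time"
    raise ValueError(f"unknown datetime unit: {unit!r}")


def can_cast_datetime_units(src, dst, casting="same_kind"):
    """Model only the proposed barrier: crossing date <-> time needs 'unsafe'."""
    src_cat, dst_cat = unit_category(src), unit_category(dst)
    if casting == "unsafe":
        return True
    return src_cat == dst_cat


print(can_cast_datetime_units("D", "s"))                    # False
print(can_cast_datetime_units("D", "s", casting="unsafe"))  # True
print(can_cast_datetime_units("h", "s"))                    # True
```

Under this rule the mixed ['now', '2011-06-15'] array would be rejected unless the caller asked for an unsafe cast explicitly. Note that 'M' (month) and 'm' (minute) fall on opposite sides of the barrier.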
Two new functions will facilitate the conversions between datetimes with date units and time units:

date_as_datetime(datearray, hour, minute, second, microsecond, timezone='local', unit=None, out=None), which converts the provided dates into datetimes at the specified time, according to the specified timezone. If 'unit' is specified, it controls the output unit; otherwise the output unit is taken from the units of 'out' or from the amount of precision specified in the function call.

datetime_as_date(datetimearray, timezone='local', out=None), which converts the provided datetimes into dates according to the specified timezone.

In both functions, timezone can be any of 'UTC', 'TAI', 'local', '+/-####', or a datetime.tzinfo object. The latter will allow NumPy datetimes to work with the pytz library for flexible time zone support.
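To make the intended semantics concrete, here is a rough sketch of the two conversions using only the standard library. These are not the proposed NumPy functions: the names date_as_datetime_sketch and datetime_as_date_sketch are hypothetical, they work on plain lists rather than arrays, they omit the 'unit' and 'out' parameters, and they assume the timezone is a fixed-offset datetime.tzinfo object (one of the several forms listed above).

```python
from datetime import date, datetime, time, timedelta, timezone


def date_as_datetime_sketch(dates, hour=0, minute=0, second=0, microsecond=0,
                            tz=timezone.utc):
    """Place each date at the given wall-clock time in tz; return UTC instants."""
    wall_time = time(hour, minute, second, microsecond, tzinfo=tz)
    return [datetime.combine(d, wall_time).astimezone(timezone.utc) for d in dates]


def datetime_as_date_sketch(datetimes, tz=timezone.utc):
    """Return the calendar date each aware datetime falls on when viewed in tz."""
    return [dt.astimezone(tz).date() for dt in datetimes]


utc_minus_5 = timezone(timedelta(hours=-5))  # fixed offset, as in the examples above

stamps = date_as_datetime_sketch([date(2011, 6, 15)], hour=18, tz=utc_minus_5)
print(stamps[0].isoformat())                         # 2011-06-15T23:00:00+00:00
print(datetime_as_date_sketch(stamps, utc_minus_5))  # [datetime.date(2011, 6, 15)]
```

Because the timezone is an explicit argument in both directions, the round trip from date to datetime and back returns the original date, which is the property the implicit conversion lacks.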
I would also like to extend the 'today' input string parsing to accept strings like 'today 12:30' to allow a convenient way to express different local times occurring today, mostly useful for interactive usage.
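As an illustration of what such parsing might look like, here is a small standard-library sketch. The function parse_today_string is hypothetical, accepts only the 'today HH:MM' form, and returns a naive local datetime; how NumPy would actually parse or store such strings is not specified here.

```python
from datetime import datetime, time


def parse_today_string(text, now=None):
    """Parse 'today HH:MM' into a naive local datetime on today's date.

    'now' can be passed in to make the result reproducible.
    """
    now = now or datetime.now()
    parts = text.split()
    if len(parts) != 2 or parts[0] != "today":
        raise ValueError(f"expected 'today HH:MM', got {text!r}")
    hour_str, minute_str = parts[1].split(":")
    return datetime.combine(now.date(), time(int(hour_str), int(minute_str)))


reference = datetime(2011, 6, 15, 15, 16, 51)
print(parse_today_string("today 12:30", now=reference))  # 2011-06-15 12:30:00
```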
I welcome any comments on this design, particularly if you can find a case where this doesn't produce a reasonable behavior.

Cheers,
Mark
Is the output for the use case above, with the mix of 'now' and a datetime string without tz info, intended to still be correct?

I personally have misgivings about interpreting phrases like "now" and "today" at this level. I think it opens a can of worms that would be difficult to handle. Consider some arbitrary set of inputs to the array function for datetime objects. If none of them contain tz info, then they are all interpreted the same, as-is. However, if even one element is 'now', then the inputs are interpreted entirely differently. This will confuse people.

Just thinking out loud here: what about a case where some of the inputs do not specify a tz and others specify a mix of timezones? Should that be any different from the case given above?

It has been a while for me, but how different is this from Perl's floating tz for its datetime module? Maybe we could combine its approach with your "unsafe" barrier for the ambiguous situations that Perl's datetime module mentions?

Ben Root