[Numpy-discussion] fixing up datetime

Christopher Barker Chris.Barker at noaa.gov
Thu Jun 2 15:45:37 EDT 2011


Mark Wiebe wrote:

> It is possible to implement the system so that if you don't use Y/M/B, 
> things work out unambiguously, but if you do use them you get a behavior 
> that's a little weird, but with rules to eliminate the calendar-created 
> ambiguities. 

yes, but everyone wants different rules -- so it needs to be very clear 
which rules are in place, and there needs to be a way for a user to 
specify his/her own rules.

> For the business day unit, what I'm currently trying to do 
> is get an assessment of whether my proposed design is the right abstraction 
> to support all the use cases of people who want it.

Good plan.

> I rather agree here, adding the 'origin' back in is definitely worth 
> considering. How is the origin represented in the CF netcdf code?

As an ISO 8601 string. I can't recall if you have the option of 
specifying a non-standard calendar.

> So using the calendar specified by ISO 8601 as the default for the 
> calendar-based functions is undesirable? 

no -- that's fine -- but does ISO 8601 specify stuff like business day?

> I think supporting it to a 
> small extent is reasonable, and support for any other calendars or more 
> advanced calendar-based functions would go in support libraries.

yup -- what I'm trying to press here is the distinction between linear 
time units and the "weird" concepts, like business day, month, etc.

I think there are two related, but distinct issues:

1) representation/specification of a "datetime". The idea here is to 
imagine that there is a continuous property called time (which I suppose 
has a zero at the Big Bang). We need a way to define where (when) in 
that continuum a given event, or set of events, occurred. This is what 
the datetime dtype is about. I think the standard of "some-time-unit 
since some-reference-datetime, in some-calendar" is fine, but the 
time-unit should be unambiguously and clearly defined, and its length 
should not depend on when it occurs, i.e. seconds, hours, days, but not 
months, years, or business days.

2) time spans, and math with time: i.e. timedeltas --- this falls into 2 
categories:

   a) simple linear time units: seconds, hours, etc. This is quite 
straightforward, if working with other time deltas and datetimes all 
expressed in well-defined linear units.

   b) calendar manipulations: "months since", "business days since", 
"once a month", "the first Sunday of the month", "next Monday". These 
require a well-defined and complex Calendar, and there are many possible 
such Calendars.
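
To make the (a)/(b) split concrete, here is a rough sketch using only 
the Python standard library. add_calendar_months is a made-up helper, 
not anything in numpy, and clamping to the last day of the month is 
just one of the many rules people might want:

```python
import calendar
from datetime import datetime, timedelta

def add_calendar_months(dt, months):
    """Hypothetical helper: shift dt by whole calendar months,
    clamping the day to the last day of the target month."""
    month_index = dt.month - 1 + months
    year = dt.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

start = datetime(2011, 1, 31)

# (a) linear: a fixed span, the same length wherever it is applied
linear = start + timedelta(days=30)

# (b) calendar: the length of "one month" depends on where you start
cal = add_calendar_months(start, 1)

print(linear)  # 2011-03-02 00:00:00
print(cal)     # 2011-02-28 00:00:00
```

Same starting point, two defensible answers for "a month later", which 
is why the rule in force has to be explicit.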

What I'm suggesting is that (a) and (b) should be kept quite distinct, 
and that it should be fairly easy to define and use custom Calendars 
defined for (b).

(a) and (b) could be merged, with various defaults and exceptions raised 
for poorly defined operations, but I think that'll be less clear, harder 
to implement, and more prone to error.
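
As a sketch of what "easy to define and use custom Calendars" might 
look like, something along these lines would do. BusinessCalendar and 
its methods are hypothetical names for illustration only, not a 
proposed numpy API:

```python
from datetime import date, timedelta

class BusinessCalendar:
    """Hypothetical calendar: which days count is entirely user-defined."""

    def __init__(self, weekend=(5, 6), holidays=()):
        # date.weekday() values: Monday is 0, Sunday is 6
        self.weekend = frozenset(weekend)
        self.holidays = frozenset(holidays)

    def is_business_day(self, d):
        return d.weekday() not in self.weekend and d not in self.holidays

    def add_business_days(self, d, n):
        """Step forward n business days (n >= 0) from d."""
        if n < 0:
            raise ValueError("this sketch only steps forward")
        while n > 0:
            d += timedelta(days=1)
            if self.is_business_day(d):
                n -= 1
        return d

# Two users, two sets of rules
us = BusinessCalendar(holidays=[date(2011, 7, 4)])
fri_sat_weekend = BusinessCalendar(weekend=(4, 5))

friday = date(2011, 7, 1)
print(us.add_business_days(friday, 1))               # 2011-07-05
print(fri_sat_weekend.add_business_days(friday, 1))  # 2011-07-03
```

The linear units in (a) never need to know that any of this exists; 
only operations in (b) would take a calendar argument.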


A little example, again from the CF mailing list (which spawned the 
discussion). In the CF standard the units available are defined as 
"those supported by the udunits library":

http://www.unidata.ucar.edu/software/udunits/

It turns out that udunits only supports time manipulation of the kind I 
described in (a), i.e. only clearly defined linear time units. However, 
it does define "months" and "years" as specific fixed values (something 
like 365.25 days/year and 12 months/year -- though it also has 
"Julian-year", "leap_year", etc).

So folks would specify a time axis as "months since 2010-01" and 
expect that they were getting calendar months, so that "1" would mean 
Feb, 2010, instead of January 31, 2010 (or whatever).
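
Here is roughly how the two readings of "months since 2010-01" come 
apart. The 365.25 days/year figure is just my "something like" number 
from above, not necessarily the constant udunits uses, and both decode 
functions are hypothetical:

```python
from datetime import datetime, timedelta

# Assumed figure, taken from the "something like 365.25 days/year" above;
# the value udunits really uses may differ.
DAYS_PER_YEAR = 365.25
DAYS_PER_MONTH = DAYS_PER_YEAR / 12   # 30.4375 days

origin = datetime(2010, 1, 1)

def decode_fixed_months(value, origin):
    """Read 'months since origin' as a fixed-length linear unit."""
    return origin + timedelta(days=value * DAYS_PER_MONTH)

def decode_calendar_months(value, origin):
    """Read 'months since origin' as whole calendar months.
    Assumes an integer value and an origin on the first of a month."""
    index = origin.month - 1 + value
    return origin.replace(year=origin.year + index // 12,
                          month=index % 12 + 1)

print(decode_fixed_months(1, origin))     # 2010-01-31 10:30:00
print(decode_calendar_months(1, origin))  # 2010-02-01 00:00:00
```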

Anyway, lots of room for confusion, so whatever we come up with needs to 
be clearly defined.

-Chris




-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


