[DB-SIG] Standardized Date-Time class
Jeffrey C. Jacobs
timehorse@unforgettable.com
Tue, 9 Dec 1997 17:07:01 -0500
Hi, all!
This is my first post to the DB-SIG, and I may be a day or so out of
step with the rest of you, but I have been talking with Prof. Reingold,
the author of the Calendar converter written in C mentioned by Christian
Egli yesterday. He possesses many other calendar converters which he
is not making available to the general public, but which he would allow
to be distributed freely, for non-commercial use, as Python ports of
the original Lisp code. Thus, if we do build a Calendar converter into
the Date-Time object, it would be nice if the converters could be loaded
modularly, so that a non-commercial entity wishing to use the
Reingold-Dershowitz algorithms may just plug them in, while otherwise
they need not be included. In this way, each Calendar could be loaded
optionally, so that if one was only concerned with Gregorian and ISO,
one wouldn't have to load Hebrew, Islamic or Mayan.
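To make the plug-in idea concrete, here is a minimal sketch in present-day Python. The registry, the names register_calendar and convert, and the choice of a shared day number as the interchange value are all my own assumptions for illustration; none of this is the Reingold-Dershowitz code.

```python
from datetime import date

# Hypothetical registry: calendar name -> (to_day_number, from_day_number).
_CALENDARS = {}


def register_calendar(name, to_day_number, from_day_number):
    """Make a calendar available; optional modules would call this on import."""
    _CALENDARS[name] = (to_day_number, from_day_number)


def convert(components, source, target):
    """Convert a tuple of date components between two loaded calendars."""
    for name in (source, target):
        if name not in _CALENDARS:
            raise LookupError("calendar module not loaded: " + name)
    to_day_number = _CALENDARS[source][0]
    from_day_number = _CALENDARS[target][1]
    return from_day_number(to_day_number(*components))


def _gregorian_to_day(year, month, day):
    return date(year, month, day).toordinal()


def _gregorian_from_day(n):
    d = date.fromordinal(n)
    return (d.year, d.month, d.day)


def _iso_to_day(year, week, weekday):
    return date.fromisocalendar(year, week, weekday).toordinal()


def _iso_from_day(n):
    return tuple(date.fromordinal(n).isocalendar())


# Gregorian and ISO are always present; Hebrew, Islamic or Mayan modules
# would register themselves only if the user chose to load them.
register_calendar("gregorian", _gregorian_to_day, _gregorian_from_day)
register_calendar("iso", _iso_to_day, _iso_from_day)
```

With this arrangement, convert((1997, 12, 9), "gregorian", "iso") works out of the box, while asking for "hebrew" raises LookupError until that module has been loaded.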
Grabbing the statements from Jim Fulton mentioned before:
> It would be helpful to have a standard date-time implementation for
> use in Python database implementations (and elsewhere, of course).
>
> I think that the date-time implementation should:
>
> 1. Support conversion from strings in a very wide variety of formats
> (e.g. 'Oct, 1, 1994 12:34am EST', '1994/10/1 00:34:21.456')
This would be calendar dependent, and thus reside in a given Calendar's
module. However, if I write "1 Juli 1972", "1er Janvier", or "15 Maggio
44 B.C.", how would those be interpreted? And if I write "97/12/09", as
I would in Japan, will it know what I mean?
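The numeric case can be illustrated with a small sketch using the standard strptime directives. The CONVENTIONS table and the function possible_readings are hypothetical names of mine; the point is only that the same string reads differently, or not at all, depending on which regional convention the parser is told to assume.

```python
from datetime import datetime

# Hypothetical table of regional conventions, as strptime format strings.
CONVENTIONS = {
    "yy/mm/dd": "%y/%m/%d",  # as I would write it in Japan
    "mm/dd/yy": "%m/%d/%y",  # common in the US
    "dd/mm/yy": "%d/%m/%y",  # common in Europe
}


def possible_readings(text):
    """Return the date each convention yields, or None where it is invalid."""
    readings = {}
    for label, fmt in CONVENTIONS.items():
        try:
            readings[label] = datetime.strptime(text, fmt).date()
        except ValueError:
            readings[label] = None
    return readings


japan_style = possible_readings("97/12/09")
other_style = possible_readings("09/12/97")
```

Here "97/12/09" only makes sense as yy/mm/dd, but "09/12/97" is a valid date under both mm/dd/yy and dd/mm/yy, so a parser would need a hint about the convention in use.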
> 2. Support subtraction of date-times, and addition and
> subtraction of dates and numbers.
I think the general idea that time is actually two concepts, Time and
TimeSpan, may be the best way of looking at this. Since Times may not
necessarily be stored as Seconds since some standard Epoch, adding a
bare number to a Time may be confusing: "What am I adding? Seconds?
Milliseconds? Days? Floating point? Integer?" A TimeSpan class, by
contrast, could be created as a co-class of Time, using the same
storage format and having interfaces which make clear the span of time
stored in the object.
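As a rough sketch of the Time and TimeSpan split, assuming both store an integer count of milliseconds (the attribute name ms and the keyword arguments are my own invention for illustration):

```python
MS_PER_SECOND = 1000
MS_PER_DAY = 86400000


class TimeSpan:
    """A length of time; the keyword used makes the unit explicit."""

    def __init__(self, milliseconds=0, seconds=0, days=0):
        self.ms = (int(milliseconds)
                   + int(seconds) * MS_PER_SECOND
                   + int(days) * MS_PER_DAY)


class Time:
    """A point in time, as milliseconds since some epoch."""

    def __init__(self, ms):
        self.ms = int(ms)

    def __add__(self, other):
        # Only a TimeSpan may be added; a bare number is refused.
        if not isinstance(other, TimeSpan):
            return NotImplemented
        return Time(self.ms + other.ms)

    def __sub__(self, other):
        if isinstance(other, Time):
            return TimeSpan(milliseconds=self.ms - other.ms)
        if isinstance(other, TimeSpan):
            return Time(self.ms - other.ms)
        return NotImplemented


later = Time(1000) + TimeSpan(seconds=2)
gap = Time(5000) - Time(2000)
```

Because __add__ returns NotImplemented for anything that is not a TimeSpan, writing Time(0) + 5 raises TypeError rather than silently guessing the unit.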
> 3. Store dates efficiently.
I would think that Dates and Times, being basically one concept, should
be combined into 1 Time class. Given that most computers purport to
keep time to millisecond precision, a very simple way of storing Dates
would be a long indicating the number of Milliseconds since some epoch.
The Gregorian epoch would then be a logical choice, making Gregorian the
default calendar; the number of Milliseconds elapsed since Monday, 1
January 1 AD at 00:00:00.000 UTC would certainly be my suggestion.
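A minimal sketch of that storage format, using the standard datetime module to do the calendar arithmetic. The function name to_epoch_ms is hypothetical, and the sketch assumes it is given a timezone-aware datetime within the range that datetime supports (years 1 to 9999).

```python
from datetime import datetime, timedelta, timezone

MS_PER_DAY = 86400000


def to_epoch_ms(dt):
    """Milliseconds since 0001-01-01 00:00:00.000 UTC (proleptic Gregorian).

    Assumes dt is timezone-aware.
    """
    utc = dt.astimezone(timezone.utc)
    days = utc.toordinal() - 1  # ordinal 1 is 1 January 1 AD
    ms_in_day = (((utc.hour * 60 + utc.minute) * 60 + utc.second) * 1000
                 + utc.microsecond // 1000)
    return days * MS_PER_DAY + ms_in_day


# The timestamp of this post, given once in local time and once in UTC.
est = timezone(timedelta(hours=-5))
posted_local = to_epoch_ms(datetime(1997, 12, 9, 17, 7, 1, tzinfo=est))
posted_utc = to_epoch_ms(datetime(1997, 12, 9, 22, 7, 1, tzinfo=timezone.utc))
```

Both values are the same integer, since the count is always taken in UTC; Python integers do not overflow, so the representation itself is not limited to the range of the helper used here.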
You never know if you want to calculate the time the Earth was created
as Tuesday, 4 November 5,371,840,012 BC at 12:36:14.009 UTC (in a
non-secular context); at that magnitude a floating point value can no
longer resolve single milliseconds, though a long will maintain
precision. I know, how often are you going to need to use the day the
Earth was created, but isn't it nice knowing that you *can*? :)
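On the float versus long point, a quick sketch of what happens at that magnitude with an IEEE double as used by Python's float. The span below is only a rough order-of-magnitude figure (years times 365 days, ignoring leap days), which is my own simplification.

```python
MS_PER_DAY = 86400000

# Roughly 5.37 billion years expressed in milliseconds (about 1.7e20).
span_ms = 5371840012 * 365 * MS_PER_DAY

# Add nine milliseconds using integers and using floats.
as_int = span_ms + 9
as_float = float(span_ms) + 9.0

int_keeps_ms = (as_int - span_ms == 9)
float_keeps_ms = (as_float - float(span_ms) == 9.0)
```

The integer keeps the nine milliseconds exactly, while the float silently drops them: a double has 53 bits of mantissa, so near 1.7e20 adjacent representable values are tens of thousands of milliseconds apart.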
Thus, the TimeSpan would also be a quantity in Milliseconds, though
such an object may be initialized in Days or Seconds, I should think.
Of course, if most calculations are done based on Seconds or Days, then
this storage method is not optimal.
> 4. Store dates immutably.
I may sound stupid with this one, but if the module is written in
Python, and not compiled, then theoretically wouldn't you always be able
to change the internal structure of your Time class? Say I write:
> foo = TimeModule.Time(time.time()) # foo = The Current Time
> foo.MyInternalTimeStorage = 42
> foo.PrintDate()
'January 1, 1 AD 00:00:00.042 UTC'
>
There may be a way around this in Python, but alas I don't know it.
Thus, the only way I know to achieve true immutability would be to write
the Time and TimeSpan classes in C++ and then compile that to a Python
module. I guess I can leave this issue to someone else who would know
more about this aspect of Python than I.
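For what it is worth, one possible pure-Python approach is to intercept attribute assignment. The sketch below is written for present-day Python (it relies on __slots__ and property, which are later additions to the language), and the attribute names are hypothetical.

```python
class Time:
    """Sketch of an immutable time value stored as integer milliseconds."""

    __slots__ = ("_ms",)  # no per-instance __dict__, so no stray attributes

    def __init__(self, ms):
        # Bypass our own __setattr__ exactly once, during construction.
        object.__setattr__(self, "_ms", int(ms))

    def __setattr__(self, name, value):
        raise AttributeError("Time objects are immutable")

    def __delattr__(self, name):
        raise AttributeError("Time objects are immutable")

    @property
    def ms(self):
        """Read-only access to the stored millisecond count."""
        return self._ms


foo = Time(1234)
try:
    foo.MyInternalTimeStorage = 42
    changed = True
except AttributeError:
    changed = False
```

This guards against accidental modification rather than guaranteeing immutability: a determined caller can still use object.__setattr__(foo, "_ms", 42), so a compiled extension type remains the stricter option.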
> 5. Represent dates to a specified minimum precision
> (e.g. milliseconds).
See 3.
> 6. Handle all dates in the Gregorian calendar. (e.g. there should
> not be problems storing dates from the 18th or 21st centuries.)
As I said above, supporting non-Gregorian calendars may be made
optional, but the Gregorian module should be loaded with the TimeModule,
and therefore support of arbitrary Gregorian dates should be feasible.
> 7. Provide read-access to date-components (e.g. year, month, second,
> day-of-week, etc.)
That would also be an aspect of each Calendar module, such as the
default Gregorian type, though given how I have set up the Calendar
definition, each quantity would have to be calculated dynamically.
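A sketch of what "calculated dynamically" could look like for the Gregorian case, assuming the stored value is milliseconds since 1 January 1 AD UTC. The class name GregorianView is hypothetical, and leaning on datetime.date for the year/month/day split restricts this sketch to years 1 through 9999.

```python
from datetime import date

MS_PER_DAY = 86400000
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


class GregorianView:
    """Read-only Gregorian components derived from a millisecond count."""

    def __init__(self, ms_since_epoch):
        self._ms = ms_since_epoch

    def _date(self):
        # Day 0 of the epoch is 1 January 1 AD, which is ordinal 1.
        return date.fromordinal(self._ms // MS_PER_DAY + 1)

    @property
    def year(self):
        return self._date().year

    @property
    def month(self):
        return self._date().month

    @property
    def day(self):
        return self._date().day

    @property
    def day_of_week(self):
        return DAY_NAMES[self._date().weekday()]

    @property
    def hour(self):
        return (self._ms % MS_PER_DAY) // 3600000

    @property
    def minute(self):
        return (self._ms % 3600000) // 60000

    @property
    def second(self):
        return (self._ms % 60000) // 1000

    @property
    def millisecond(self):
        return self._ms % 1000
```

Nothing but the millisecond count is stored; every component is recomputed on each access, which is simple but would be worth caching if components are read often.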
> I'm afraid the implementation should also address issues like:
>
> 8. Support for time-zones,
>
> 9. Support for daylight-savings time.
Having examined and pilfered the definitions for 51 standard TimeZones
and their corresponding rules for DST, I think this should be doable as
a simple look-up, with some minor calculations. The C-Library support
for DST that I have is based on these rules and on looking up the local
TimeZone using an OS API. Since I think the TimeZone is normally
stored in the environment variable "TZ" as a string of the form
"GMTh[:m[:s]][GDT]", I am led to believe that if the GDT string is
included, indicating that this TimeZone supports Daylight Savings, then
the offset will always be 1 hour, beginning the first Sunday in April at
02:00:00 and ending the last Sunday in October at 02:00:00. Since this
is not at all valid universally, especially in countries in the southern
hemisphere or outside Europe and North America, storing the DST rules as
well as the name would seem wise.
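The default rule described above (one hour, first Sunday in April at 02:00:00 to last Sunday in October at 02:00:00) can be stored as data and evaluated with a little date arithmetic. This is a sketch only: the DSTRule record and function names are hypothetical, and it ignores the ambiguous hour at the autumn switch.

```python
from collections import namedtuple
from datetime import date, datetime, time, timedelta

# Hypothetical rule record; a table of these could sit beside each zone name.
DSTRule = namedtuple("DSTRule", "start_month end_month switch_time offset")


def first_sunday(year, month):
    d = date(year, month, 1)
    return d + timedelta(days=(6 - d.weekday()) % 7)  # Sunday is weekday 6


def last_sunday(year, month):
    # Step to the last day of the month, then back to the nearest Sunday.
    next_month = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    d = next_month - timedelta(days=1)
    return d - timedelta(days=(d.weekday() - 6) % 7)


DEFAULT_RULE = DSTRule(start_month=4, end_month=10,
                       switch_time=time(2, 0, 0),
                       offset=timedelta(hours=1))


def dst_offset(local, rule=DEFAULT_RULE):
    """Offset in force for a naive local datetime under the given rule."""
    start = datetime.combine(first_sunday(local.year, rule.start_month),
                             rule.switch_time)
    end = datetime.combine(last_sunday(local.year, rule.end_month),
                           rule.switch_time)
    return rule.offset if start <= local < end else timedelta(0)
```

A southern-hemisphere zone would need a rule whose start month falls after its end month, which this simple comparison does not handle; that is one reason to keep the rules as data rather than hard-coding them.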
However, given the list of TimeZones that I have, all the names are in
full format, "Greenwich Mean Time", not "GMT", which is what is
reflected in Python when I type "> time.tzname". If some other system
is using the 3-Character form, the lookup will have to be modified so
that the TimeZone can still be found in the list of TimeZones.
Another issue which may be important is that in the US, I believe the
rules for when DST begins were changed in 1987 from the last Sunday in
April to the current rules, so what do we calculate for the Time in mid
April in the US before 1987? Could TimeZones have more than one set of
Daylight Savings rules depending on the year being calculated?
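One way to picture several rule sets per TimeZone is a table keyed by year range. The sketch below encodes only the two US start rules mentioned in this post (last Sunday in April through 1986, first Sunday in April from 1987); the table layout and names are hypothetical, and it is not a complete or verified history of US rules.

```python
from datetime import date, timedelta


def first_sunday(year, month):
    d = date(year, month, 1)
    return d + timedelta(days=(6 - d.weekday()) % 7)  # Sunday is weekday 6


def last_sunday(year, month):
    next_month = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    d = next_month - timedelta(days=1)
    return d - timedelta(days=(d.weekday() - 6) % 7)


# (first_year, last_year, rule); None means the range is open on that side.
# Illustrative only: just the change described in the text above.
US_DST_START_RULES = [
    (None, 1986, last_sunday),
    (1987, None, first_sunday),
]


def us_dst_start(year):
    """Date on which DST starts in the given year, per the table above."""
    for first_year, last_year, rule in US_DST_START_RULES:
        if ((first_year is None or year >= first_year)
                and (last_year is None or year <= last_year)):
            return rule(year, 4)
    raise LookupError("no DST rule covers year %d" % year)
```

So a Time in mid April 1986 falls before the start of DST, while the same calendar day in 1987 falls after it.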
What I'm getting at here is that if we are going to support Universal
Time from 5 billion BC to 1 billion AD or what have you, we will have to
replace the C-Library TimeZone routines with something more robust. I'm
currently examining the GNU implementation of TimeZones and I may
elaborate on this in a later post.
Just as an interesting aside, one thing I have found is that since Leap
Seconds were instituted in 1972, there have been 21 occurrences as of
30 June 1997, thus any times before 1972 will be off by about 21
seconds.
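If Leap Seconds were to be accounted for, a simple table look-up would probably do. The sketch below lists the 21 insertion days through 30 June 1997 as I understand them; treat the table as illustrative and check it against an authoritative list before relying on it. The function name is hypothetical.

```python
from datetime import date

# UTC days at whose end a leap second was inserted, 1972 through mid 1997.
LEAP_SECOND_DAYS = [
    date(1972, 6, 30), date(1972, 12, 31), date(1973, 12, 31),
    date(1974, 12, 31), date(1975, 12, 31), date(1976, 12, 31),
    date(1977, 12, 31), date(1978, 12, 31), date(1979, 12, 31),
    date(1981, 6, 30), date(1982, 6, 30), date(1983, 6, 30),
    date(1985, 6, 30), date(1987, 12, 31), date(1989, 12, 31),
    date(1990, 12, 31), date(1992, 6, 30), date(1993, 6, 30),
    date(1994, 6, 30), date(1995, 12, 31), date(1997, 6, 30),
]


def leap_seconds_before(day):
    """Count the leap seconds inserted before the start of the given UTC day."""
    return sum(1 for d in LEAP_SECOND_DAYS if d < day)


count_today = leap_seconds_before(date(1997, 12, 9))
count_1971 = leap_seconds_before(date(1971, 1, 1))
```

The difference between the two counts is the 21 seconds by which a naive millisecond count would disagree across that span.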
One final and potential To Do, coming full circle on the whole Calendar
concept: since in Hebrew and Islamic calendars the day begins at
Sundown, wouldn't implementation of Sunrise and Sunset calculations make
things more precise, if they could also be included optionally?
Certainly, this would be a very low priority.
Anyway, those are my thoughts on Time based on the work I have been
doing so far in Python building my own Time and Calendar modules, which
I will try to post when I work out some structural issues, such as
standardizing the Calendar modules and getting everything to be in
milliseconds.
Be Seeing You,
Jeffrey.
---
~,-;` The TimeHorse
Sometimes also a Dragon. . .
_______________
DB-SIG - SIG on Tabular Databases in Python
send messages to: db-sig@python.org
administrivia to: db-sig-request@python.org
_______________