Re: [Python-Dev] proposal: add basic time type to the standard library
Hello everyone,
what a coincidence. I was just discussing this issue with Jason O. today. Here is my original mail:
Hey Jason,
I also want to start thinking about a DateTime module. PostGres has a nice discussion of their implementation: http://www.ca.postgresql.org/users-lounge/docs/7.1/user/datatype-datetime.ht...
Here is Java's stuff on it: http://www3.cbu.edu/sciences/jdk1.3/docs/api/java/sql/Date.html
Low Level Data Types:
    Date
    Time
    DateTime
    TimeStamp - Timestamps are always in UTC.
* Intervals can be added or subtracted from themselves and the types above.
    DateInterval
    TimeInterval
    DateTimeInterval
    TimeStampInterval
Notes:
- The basic data type must be as small as possible, so that applications that save these types often (e.g. ZODB/Persistent) do not increase the amount of stored data by much.
- We should then have high-level classes that put in all the functionality, such as lower-level in/output. I think high-level in/output should be handled by functions inside the module, such as getDateTimeFromString(str, someI18NspecificInfo=None).
- We need flexible i18n support!!! This is very important, especially for Zope. By default the system should come with a gettext implementation, but I would like to have the module generic enough that we can define other types of translation and localization mechanisms. Mh, the more I think of it, the more I think we will end up building our own stuff and then exposing that via an API.
- The parsing of Date, Time and DateTimes as well as their Intervals (PostGreSQL has some very nice ways for that) should be tremendously flexible. I am thinking here about a plugin-type architecture, where you can create your own plugins for parsing. For example, while the "." notation was reserved for the European date formats until now, more and more American companies (which are totally ignorant that there might be another country besides the US in the world) use this notation to write the American date format too. Therefore we need a mechanism to switch between the two. I thought of some sort of list of regular expressions which are tried in turn to resolve a string. Oh yeah, we need internationalization here as well of course, even though the parser should be generic enough.
- The tough part will be time zones. I am almost thinking that we need our own object for handling that. Time zones are horribly complex, but we need to handle them well. I know Zope's current DateTime implementation has a good handle on that, even though I think the code is horrible (sorry Jim).
- A professor just mentioned that we should also handle daylight saving. This is not trivial, but I agree with him; there needs to be support for that, even though most apps handle it via the time zone, which is ok for the numeric version, but not if you say "CST" for example.
PS: Jim, I cc'ed you so that you might be able to comment on some of the points I made.
FYI, Jason and I are thinking about implementing a DateTime module for Python in general, which is small and sweet. We are shooting for our calendar system only.
Regards,
Stephan
--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management
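The list-of-regexes idea above can be sketched roughly as follows. This is only an illustration under assumptions: the names `parse_date`, `DEFAULT_PARSERS`, `US_DOT` and friends are hypothetical, four-digit years are required, and the standard library's `date` type stands in for whatever type the proposed module would return. Each entry pairs a pattern with the order of its fields, and the caller chooses which entries to try and in what order.

```python
import re
from datetime import date

# Each "plugin" is a (name, pattern, field order) triple. All names are hypothetical.
EUROPEAN_DOT = ("european-dot", re.compile(r"^(\d{1,2})\.(\d{1,2})\.(\d{4})$"),
                ("day", "month", "year"))
US_DOT = ("us-dot", re.compile(r"^(\d{1,2})\.(\d{1,2})\.(\d{4})$"),
          ("month", "day", "year"))
US_SLASH = ("us-slash", re.compile(r"^(\d{1,2})/(\d{1,2})/(\d{4})$"),
            ("month", "day", "year"))
ISO_DASH = ("iso", re.compile(r"^(\d{4})-(\d{2})-(\d{2})$"),
            ("year", "month", "day"))

# Assumed default order: dots are read the European way, slashes the US way.
DEFAULT_PARSERS = [ISO_DASH, US_SLASH, EUROPEAN_DOT]


def parse_date(text, parsers=DEFAULT_PARSERS):
    """Try each parser in order and return the first valid date."""
    for name, pattern, fields in parsers:
        match = pattern.match(text.strip())
        if match is None:
            continue
        parts = dict(zip(fields, (int(group) for group in match.groups())))
        try:
            return date(parts["year"], parts["month"], parts["day"])
        except ValueError:
            # The pattern matched but the numbers are not a real date;
            # fall through to the next parser.
            continue
    raise ValueError("no parser matched %r" % text)
```

With the default list, `parse_date('2.1.2001')` is read the European way (2 January 2001); passing `parsers=[US_DOT]` switches the same string to the American reading (1 February 2001).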
[You should never post HTML email to mailing lists...] Stephan Richter wrote:
Hello everyone,
what a coincidence. I just was discussing this issue with Jason O. today. Here is my original mail:
Hey Jason,
I also want to start to think about a DateTime module. PostGres has a nice discussion of their implementation: http://www.ca.postgresql.org/users-lounge/docs/7.1/user/datatype-datetime.ht...
Here is Java's stuff on it: http://www3.cbu.edu/sciences/jdk1.3/docs/api/java/sql/Date.html
Low Level Data Types:
Date Time DateTime TimeStamp - Timestamps are always in UTC.
See below... you don't need that many types.
* Intervals can be added or subtracted from themselves and the types above. DateInterval TimeInterval DateTimeInterval TimeStampInterval
Intervals are a bad idea. You really only need two types: one referencing fixed points in time and another one for storing the delta between two such fixed points. Everything else can be modeled on top of those two.
Please have a look at mxDateTime. It has these two types and much of what you described in your notes.
BTW, you wouldn't believe how complicated dealing with date and time really is... ah, yes, and don't even think of ever getting DST to work properly :-/
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
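The two-type model described above (one type for fixed points, one for deltas) can be illustrated with a small sketch. It uses the Python standard library's `datetime` and `timedelta` purely as stand-ins; this is not the mxDateTime API, and the concrete values are made up for the example.

```python
from datetime import datetime, timedelta

# One type for a fixed point in time, one for the delta between two points.
start = datetime(2002, 2, 9, 12, 40)
delta = timedelta(minutes=6, seconds=3)

end = start + delta        # point + delta -> point
elapsed = end - start      # point - point -> delta
doubled = delta + delta    # delta + delta -> delta

# "Date only" and "time only" views are derived from the point on demand
# instead of being the operands of the arithmetic above.
date_part = end.date()
time_part = end.time()
```

Only three arithmetic combinations are needed here, which is the simplification the two-type argument is after.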
At 12:40 PM 2/9/2002 +0100, M.-A. Lemburg wrote:
[You should never post HTML email to mailing lists...]
I know. I noticed it only after I had seen the archive entry. Could you guys still read it? If not, I will resend it. Sorry! Regards, Stephan -- Stephan Richter CBU - Physics and Chemistry Student Web2k - Web Design/Development & Technical Project Management
* Intervals can be added or subtracted from themselves and the types above. DateInterval TimeInterval DateTimeInterval TimeStampInterval
Intervals are a bad idea.
Why? They are the same as your Deltas. Interval is the more common term I think, therefore I chose it. Maybe having a Time/Date/DateTime{Interval} is too much and they should be really one. So you would have DateTimeInterval and TimeStampInterval for the same reasons I describe below.
On the other hand Java does not seem to implement intervals at all, which I think is a bad idea, since RDBs support it.
    >>> import DateTime
    >>> DateTime.parseInterval('6 mins 3 secs')  # DateTime.DateTimeInterval is the default
    6 minutes 3 seconds
    >>> DateTime.parseInterval('50 secs 3 millis', type=DateTime.TimeStampInterval)  # returns ticks
    50.003
I still think that many types are a good thing; it leaves the developer with choice. However the module should be smart and hide some of the choice from you, if you are a beginner. For example I imagine this to work:
    >>> import DateTime
    >>> date = DateTime.parseDateTime('2.1.2001')
    >>> type(date).__name__
    Date
    >>> time = DateTime.parseDateTime('12:00:00')
    >>> type(time).__name__
    Time
    >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00')
    >>> type(datetime).__name__
    DateTime
You really only need two types: one referencing fixed points in time and another one for storing the delta between two such fixed points. Everything else can be modeled on top of those two.
Well yes, but this is a reason why I have such a hard time getting mxDateTime into Zope. Your module is well suited for certain tasks, but not everybody wants to use mxDateTime for Date/Time manipulation. So, saving components of a date is for some uses much better than saving ticks and vice versa. I also talked with Jim Fulton about it, and he agrees that there is a need for more than one Date/Time type. However it should be easy of course to convert between both, the Timestamp and the DateTime type. Here are some more examples:
    >>> import DateTime
    >>> date = DateTime.parseDateTime('2.1.2001')
    >>> type(date).__name__
    Date
    >>> stamp = DateTime.TimeStamp(date)
    >>> type(stamp).__name__
    TimeStamp
BTW, something I do not want to support is:
    >>> import DateTime
    >>> date = DateTime.DateTime('2.1.2001')
Since putting parsing into the object itself is a big mess, as we noticed in the Zope 2.x DateTime implementation. I think there should be only two ways to initialize a DateTime object, one of which I showed above, which is responsible for converting TimeStamps to DateTimes (mmh, maybe that should be a module function as well). The other one is:
    >>> import DateTime
    >>> DateTime.DateTime(2001, 2, 3)
    February 3, 2001
    >>> DateTime.DateTime('2001', '02', '03')  # Of course it also supports strings here
    February 3, 2001
    >>> DateTime.DateTime(2001, 2, 3, 12, 0)
    February 3, 2001 12:00:00
    >>> DateTime.DateTime(2001, hour=12)  # missing pieces will be replaced by 1 or 0
    January 1, 2001 12:00:00
    >>> DateTime.DateTime(year=2001, month=2, day=3, hour=1, minute=2, second=3, millisecond=4, timezone=-6)  # max amount of arguments
    February 3, 2001 01:02:03.004 -06:00
Please have a look at mxDateTime. It has these two types and much of what you described in your notes.
I know mxDateTime very well and have even suggested before to make it the Zope DateTime module and even put it in the standard Python distribution. Here is the mail from the Zope-Coders list: http://lists.zope.org/pipermail/zope-coders/2001-October/000100.html. You can follow the thread to see some responses. Also, the list of notes was made from my experience working with mxDateTime, Zope DateTime and PostGreSQL Dates/Times. I know it was not complete, but it had some of the hotspots in it.
BTW, you wouldn't believe how complicated dealing with date and time really is... ah, yes, and don't even think of ever getting DST to work properly :-/
Oh, I have seen and fixed the Zope DateTime implementation plenty and I have thought of the problem for 2.5 years now. The problem is that the US starts to use the German "." notation (as mentioned in my original mail) and other issues, which make it much harder. That is the reason why I want to build an ultra-flexible parsing engine. So you can do things like:
    >>> import DateTime
    >>> DateTime.parseDateTime('03/02/01', format=DateTime.ISO)
    February 1, 2003
    >>> DateTime.parseDateTime('03/02/01', format=DateTime.US)
    March 2, 2001
    >>> DateTime.parseDateTime('03.02.01', format=DateTime.US)
    March 2, 2001
    >>> DateTime.parseDateTime('03/02/01', format=DateTime.GERMAN)  # just in case Europe/Germany goes insane as well.
    February 3, 2001
But by default:
    >>> DateTime.parseDateTime('03/02/01')
    March 2, 2001
    >>> DateTime.parseDateTime('03.02.01')
    February 3, 2001
Regards, Stephan -- Stephan Richter CBU - Physics and Chemistry Student Web2k - Web Design/Development & Technical Project Management
Stephan Richter wrote:
* Intervals can be added or subtracted from themselves and the types above. DateInterval TimeInterval DateTimeInterval TimeStampInterval
Intervals are a bad idea.
Why? They are the same as your Deltas. Interval is the more common term I think, therefore I chose it. Maybe having a Time/Date/DateTime{Interval} is too much and they should be really one. So you would have DateTimeInterval and TimeStampInterval for the same reasons I describe below.
As I explained in my reply, most of these intervals are not needed as *base types*. You can easily model them on top of the two types I have in mxDateTime. Some may not like this model because it comes from a more mathematical point of view, but in reality it works quite nicely and simplifies the API structure significantly.
A time interval is basically just an amount of seconds, nothing more. There's no need to have 4 different types to wrap a single double ;-)
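To make the "single double" point concrete, here is a minimal sketch of how display components could be derived on demand from an interval stored as seconds. The function name `split_interval` is hypothetical and not part of mxDateTime; only non-negative intervals are handled.

```python
def split_interval(seconds):
    """Break a non-negative interval in seconds into display components."""
    # Work in whole milliseconds so float noise cannot produce 1000 ms.
    total_ms = int(round(seconds * 1000))
    whole_seconds, millis = divmod(total_ms, 1000)
    minutes, secs = divmod(whole_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return {
        "days": days,
        "hours": hours,
        "minutes": minutes,
        "seconds": secs,
        "milliseconds": millis,
    }
```

The decomposition costs a handful of integer divisions, which gives a rough sense of what extracting components from a seconds-based interval involves.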
On the other hand Java does not seem to implement intervals at all, which I think is a bad idea, since RDBs support it.
    >>> import DateTime
    >>> DateTime.parseInterval('6 mins 3 secs')  # DateTime.DateTimeInterval is the default
    6 minutes 3 seconds
    >>> DateTime.parseInterval('50 secs 3 millis', type=DateTime.TimeStampInterval)  # returns ticks
    50.003
I still think that many types are a good thing; it leaves the developer with choice. However the module should be smart and hide some of the choice from you, if you are a beginner. For example I imagine this to work:
    >>> import DateTime
    >>> date = DateTime.parseDateTime('2.1.2001')
    >>> type(date).__name__
    Date
    >>> time = DateTime.parseDateTime('12:00:00')
    >>> type(time).__name__
    Time
    >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00')
    >>> type(datetime).__name__
    DateTime
Just think of all the possible combinations you have in operations like '+', '-' and comparisons. You don't want to go down this road...
You really only need two types: one referencing fixed points in time and another one for storing the delta between two such fixed points. Everything else can be modeled on top of those two.
Well yes, but this is a reason why I have such a hard time getting mxDateTime into Zope. Your module is well suited for certain tasks, but not everybody wants to use mxDateTime for Date/Time manipulation.
Uhm, where did you get the impression that I want all the world to use mxDateTime :-? I wrote it for use in mxODBC since at the time there was no DateTime type around which could handle dates prior to 1970. As a result, mxDateTime was written to provide everything you need for database interfacing. That's also the reason why there is no time zone support in mxDateTime's base types: databases don't have time zone support built into their date/time types either (and for a good reason: time zones are better handled at application level).
So, saving components of a date is for some uses much better than saving ticks and vice versa. I also talked with Jim Fulton about it, and he agrees that there is a need for more than one Date/Time type. However it should be easy of course to convert between both, the Timestamp and the DateTime type.
That's why mxDateTime provides so many interfaces to other forms of storing and reading date/time values, e.g. COMDate, ticks, doubles, tuples, strings, various scientific formats, in two different calendars etc.
Here are some more examples:
    >>> import DateTime
    >>> date = DateTime.parseDateTime('2.1.2001')
    >>> type(date).__name__
    Date
    >>> stamp = DateTime.TimeStamp(date)
    >>> type(stamp).__name__
    TimeStamp
BTW, something I do not want to support is:
    >>> import DateTime
    >>> date = DateTime.DateTime('2.1.2001')
Since putting parsing into the object itself is a big mess, as we noticed in the Zope 2.x DateTime implementation. I think there should be only two ways to initialize a DateTime object, one of which I showed above, which is responsible for converting TimeStamps to DateTimes (mmh, maybe that should be a module function as well). The other one is:
    >>> import DateTime
    >>> DateTime.DateTime(2001, 2, 3)
    February 3, 2001
    >>> DateTime.DateTime('2001', '02', '03')  # Of course it also supports strings here
    February 3, 2001
    >>> DateTime.DateTime(2001, 2, 3, 12, 0)
    February 3, 2001 12:00:00
    >>> DateTime.DateTime(2001, hour=12)  # missing pieces will be replaced by 1 or 0
    January 1, 2001 12:00:00
    >>> DateTime.DateTime(year=2001, month=2, day=3, hour=1, minute=2, second=3, millisecond=4, timezone=-6)  # max amount of arguments
    February 3, 2001 01:02:03.004 -06:00
You really just want to support one way for the type constructor (broken down numbers). All other possibilities can be had via factory functions.
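A rough sketch of that split, with a constructor that accepts only broken-down numbers and separate factory functions for everything else. The class and the functions `from_ticks` and `from_german_string` are hypothetical names for illustration, not the mxDateTime or Zope API, and the sketch does no range validation.

```python
import time


class DateTime:
    """Hypothetical point-in-time type built only from broken-down numbers."""

    def __init__(self, year, month=1, day=1, hour=0, minute=0, second=0):
        # Missing pieces default to 1 (month, day) or 0 (time fields).
        self.year, self.month, self.day = year, month, day
        self.hour, self.minute, self.second = hour, minute, second

    def __repr__(self):
        return "DateTime(%d, %d, %d, %d, %d, %d)" % (
            self.year, self.month, self.day,
            self.hour, self.minute, self.second)


def from_ticks(ticks):
    """Factory: build a DateTime from seconds since the epoch, read as UTC."""
    t = time.gmtime(ticks)
    return DateTime(t.tm_year, t.tm_mon, t.tm_mday,
                    t.tm_hour, t.tm_min, t.tm_sec)


def from_german_string(text):
    """Factory: parse 'D.M.YYYY'; parsing stays out of the constructor."""
    day, month, year = (int(part) for part in text.split("."))
    return DateTime(year, month, day)
```

For example, `DateTime(2001, hour=12)` fills the missing pieces with 1 or 0, while `from_german_string('2.1.2001')` keeps string handling in a function that can be replaced or extended without touching the type.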
Please have a look at mxDateTime. It has these two types and much of what you described in your notes.
I know mxDateTime very well and have even suggested before to make it the Zope DateTime module and even put it in the standard Python distribution. Here is the mail from the Zope-Coders list: http://lists.zope.org/pipermail/zope-coders/2001-October/000100.html. You can follow the thread to see some responses. Also, the list of notes was made from my experience working with mxDateTime, Zope DateTime and PostGreSQL Dates/Times. I know it was not complete, but it had some of the hotspots in it.
BTW, you wouldn't believe how complicated dealing with date and time really is... ah, yes, and don't even think of ever getting DST to work properly :-/
Oh, I have seen and fixed the Zope DateTime implementation plenty and I have thought of the problem for 2.5 years now. The problem is that the US starts to use the German "." notation (as mentioned in my original mail) and other issues, which make it much harder. That is the reason why I want to build an ultra-flexible parsing engine. So you can do things like:
    >>> import DateTime
    >>> DateTime.parseDateTime('03/02/01', format=DateTime.ISO)
    February 1, 2003
    >>> DateTime.parseDateTime('03/02/01', format=DateTime.US)
    March 2, 2001
    >>> DateTime.parseDateTime('03.02.01', format=DateTime.US)
    March 2, 2001
    >>> DateTime.parseDateTime('03/02/01', format=DateTime.GERMAN)  # just in case Europe/Germany goes insane as well.
    February 3, 2001
But by default:
    >>> DateTime.parseDateTime('03/02/01')
    March 2, 2001
    >>> DateTime.parseDateTime('03.02.01')
    February 3, 2001
You can do all this with the Parser module in mxDateTime. It allows you to specify a list of parsers to try and in which order to try them. Chuck Esterbrook has kept me working on it for quite some time, so it should be very complete by now :-)
For more specific (and strict) formats, there are two other modules, ISO and ARPA, which can handle the respective formats used in Internet standards.
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
As I explained in my reply, most of these intervals are not needed as *base types*. You can easily model them on top of the two types I have in mxDateTime. Some may not like this model because it comes from a more mathematical point of view, but in reality it works quite nicely and simplifies the API structure significantly.
A time interval is basically just an amount of seconds, nothing more. There's no need to have 4 different types to wrap a single double ;-)
Well, this is an okay representation, if you want to do a lot of math and use it mainly for this reason. On the other hand it might be fairly expensive, if I always want to extract components. In fact, only 10% of my usage requires mathematical operations. Most of the time I get the interval out of the database and want to simply display it (localized), such as in calendars.
    >>> import DateTime
    >>> date = DateTime.parseDateTime('2.1.2001')
    >>> type(date).__name__
    Date
    >>> time = DateTime.parseDateTime('12:00:00')
    >>> type(time).__name__
    Time
    >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00')
    >>> type(datetime).__name__
    DateTime
Just think of all the possible combinations you have in operations like '+', '-' and comparisons. You don't want to go down this road...
Well, but again, 90% of the time I do not need to do any manipulation whatsoever. For this reason you would have time stamps or you know (because you used this type) that it will be less efficient to do '+' and '-' with DateTime objects, since it does need some more conversions.
Uhm, where did you get the impression that I want all the world to use mxDateTime :-? I wrote it for use in mxODBC since at the time there was no DateTime type around which could handle dates prior to 1970. As a result, mxDateTime was written to provide everything you need for database interfacing. That's also the reason why there is no time zone support in mxDateTime's base types: databases don't have time zone support built into their date/time types either (and for a good reason: time zones are better handled at application level).
Well, back then (when I wrote the mail) I thought so. But now I see the limitations and have a better idea what people need; hence this proposal. For the same reason you say mxDateTime is not good for everything we need a solution that works for more situations.
You really just want to support one way for the type constructor (broken down numbers). All other possibilities can be had via factory functions.
Probably so. I will have to think about it some more and look at some applications.
But by default:
    >>> DateTime.parseDateTime('03/02/01')
    March 2, 2001
    >>> DateTime.parseDateTime('03.02.01')
    February 3, 2001
You can do all this with the Parser module in mxDateTime. It allows you to specify a list of parsers to try and in which order to try them. Chuck Esterbrook has kept me working on it for quite some time, so it should be very complete by now :-)
For more specific (and strict) formats, there are two other modules ISO and ARPA which can handle the respective formats used in Internet standards.
Right. And I am not saying that we will not reuse some of the mxDateTime or the Zope DateTime code. I certainly do not want to reimplement stuff that already works very well. Also, we need to support I18N, which means the module needs to understand things like "February", but also "Februar" if the German locale was requested.
I have no desire to compete with the mxDateTime implementation. I want to look at some of the solutions out there, take the best from each, and provide a module that will suit 95-100% of the people. For several reasons, which I tried to point out in my mails, neither mxDateTime nor Zope's DateTime in its current state is suitable.
Regards,
Stephan
--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management
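The "February" versus "Februar" requirement can be sketched with per-language month tables. Everything here is an assumption made for illustration: the table `MONTH_NAMES`, the language keys, and the helper names are hypothetical, and a real module would load the names from locale or gettext data rather than hard-code them.

```python
from datetime import date

# Tiny hand-written tables; stand-ins for real locale data.
MONTH_NAMES = {
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
    "de": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"],
}


def month_number(name, language):
    """Map a month name in the given language to 1..12 (case-insensitive)."""
    names = [month.lower() for month in MONTH_NAMES[language]]
    return names.index(name.lower()) + 1  # raises ValueError if unknown


def format_date(value, language):
    """Render a date as 'day month-name year' in the given language."""
    return "%d %s %d" % (value.day, MONTH_NAMES[language][value.month - 1],
                         value.year)
```

Keeping the tables separate from the parsing and formatting code is what would let other translation mechanisms be plugged in later.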
I have no desire to compete with the mxDateTime implementation. I want to look at some of the solutions out there, take the best from each, and provide a module that will suit 95-100% of the people. For several reasons, which I tried to point out in my mails, neither mxDateTime nor Zope's DateTime in its current state is suitable.
That's a strange conclusion since both of these modules have been around for quite some time (mxDateTime was started in Dec. 1997) and obviously *are* quite suitable for a large share of Python's users :-) BTW, mxDateTime can do quite a bit in terms of i18n:
    >>> from mx.DateTime import *
    >>> DateTimeFrom('11. Februar 2002')
    <DateTime object for '2002-02-11 00:00:00.00' at 816cc48>
    >>> DateTimeFrom('February, 11 2002')
    <DateTime object for '2002-02-11 00:00:00.00' at 81307c8>
    >>> from mx.DateTime import Locale
    >>> Locale.French.str(now())
    'lundi 11 f\xe9vrier 2002 19:07:12'
    >>> Locale.Spanish.str(now())
    'lunes 11 febrero 2002 19:07:19'
    >>> Locale.German.str(now())
    'Montag 11 Februar 2002 19:07:25'
(hmm, I ought to insert some extra punctuation...) Never mind, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/
Participants (2): M.-A. Lemburg, Stephan Richter