datetime nanosecond support

Hi all,

This is the first time I write to this list, so thank you for considering this message (if you will) :) I know this has been debated many times, but until now there was no real use case. A Google search for "python datetime nanosecond" turns up more than 141k results, most of them saying "you can't, due to hardware imprecision" or "you don't need it", even though a good number of people are looking for this feature.

But let me explain my use case: most OSes let users capture network packets (using tools like tcpdump or wireshark) and store them in file formats like pcap or pcap-ng. These formats include a timestamp for each captured packet, and this timestamp usually has nanosecond precision. The reason is that on gigabit and 10-gigabit networks the frame rate is so high that microsecond precision is not enough to tell two frames apart. pcap (and now pcap-ng) are extremely popular file formats, with millions of files stored around the world. Support for nanoseconds in datetime would make it possible to properly parse these files in Python and compute precise statistics, for example network delays or round-trip times.

More about this issue at http://bugs.python.org/issue15443

I completely agree with the YAGNI principle that seems to have driven decisions in this area until now, but isn't it time to reconsider it, now that a real use case has shown up?

Thank you for your attention.

Best Regards,
--
Vincenzo Ampolo
http://vincenzo-ampolo.net
http://goshawknest.wordpress.com

On Tue, Jul 24, 2012 at 5:58 PM, Vincenzo Ampolo <vincenzo.ampolo@gmail.com> wrote:
You're welcome.
Have you read PEP 410 and my rejection of it (http://mail.python.org/pipermail/python-dev/2012-February/116837.html)? Even though that's about using Decimal for timestamps, it could still be considered related.
Not every use case deserves an API change. :-) First you will have to show how you'd have to code this *without* nanosecond precision in datetime and how tedious that is. (I expect that representing the timestamp as a long integer expressing a posix timestamp times a billion would be very reasonable.) I didn't read the entire bug, but it mentioned something about storing datetimes in databases. Do databases support nanosecond precision? -- --Guido van Rossum (python.org/~guido)
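A minimal sketch of what Guido describes, assuming plain integer nanoseconds since the Unix epoch; the function and constant names here are illustrative, not an existing API:

from datetime import datetime, timezone

NS_PER_SECOND = 10**9

def ns_to_datetime(ns):
    """Convert integer nanoseconds since the epoch to an aware datetime.

    The three sub-microsecond digits are truncated, because datetime
    cannot represent them -- which is exactly the limitation at issue.
    """
    seconds, fraction = divmod(ns, NS_PER_SECOND)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=fraction // 1000)

# Two pcap frames 40 ns apart stay distinguishable as integers...
t1 = 1343158283880338907
t2 = t1 + 40
assert t2 - t1 == 40
# ...but collapse to the same value once truncated for datetime:
assert ns_to_datetime(t1) == ns_to_datetime(t2)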

On Tue, Jul 24, 2012 at 9:46 PM, Guido van Rossum <guido@python.org> wrote:
I didn't read the entire bug, but it mentioned something about storing datetimes in databases. Do databases support nanosecond precision?
MS SQL Server 2008 R2 has the datetime2 data type, which supports 100 nanosecond (0.1 microsecond) precision: http://msdn.microsoft.com/en-us/library/bb677335(v=sql.105)

PostgreSQL does 1 microsecond: http://www.postgresql.org/docs/8.0/static/datatype-datetime.html

If I am reading this correctly, the Oracle TIMESTAMP type allows up to 9 digits of fractional seconds (1 nanosecond): http://docs.oracle.com/cd/B19306_01/server.102/b14195/sqlqr06.htm#r9c1-t3

-Chris
--
Christopher Lambacher chris@kateandchris.net

On 07/24/2012 06:46 PM, Guido van Rossum wrote:
You're welcome.
Hi Guido, I'm glad you took the time to read my mail. I would never have imagined that my mail would come to your attention.
I've read it, and point 5 is very much like this issue. You said:

"[...] I see only one real use case for nanosecond precision: faithful copying of the mtime/atime recorded by filesystems, in cases where the filesystem (like e.g. ext4) records these times with nanosecond precision. Even if such timestamps can't be trusted to be accurate, converting them to floats and back loses precision, and verification using tools not written in Python will flag the difference. But for this specific use case a much simpler set of API changes will suffice; only os.stat() and os.utime() need to change slightly (and variants of os.stat() like os.fstat()). [...]"

I think that's based on a wrong hypothesis: just one use case -> let's handle it in a different way (modifying os.stat() and os.utime()). I would say it's not just one case; there are at least two other scenarios. One is like mine, parsing network packets; the other is parsing stock trading data. In these cases there is no os.stat() or os.utime() that can be modified. I have to write my own class to deal with time, and lose all the power and flexibility that the datetime module adds to the Python language.
Yeah, that's exactly how we built our Time class to handle this, and we also wrote a Duration class to represent timedeltas. The code we developed is 383 lines of Python, but it is not comparable to all the functionality the datetime module offers, and it's also really slow compared to the native datetime module, which is written in C. As you may imagine, using that approach in a web application is very limiting, since there is no strftime() in this custom class. I cannot share the code right now since it's copyrighted by the company I work for, but I've asked permission to do so; I just need to wait until tomorrow morning (PDT time) for them to approve my request. Looking at the code you can see how tedious it is to try to remake all the conversions that are already implemented in the datetime module. Just let me know if you actually want to have a look at the code.
I didn't read the entire bug, but it mentioned something about storing datetimes in databases. Do databases support nanosecond precision?
Yeah. According to http://wiki.ispirer.com/sqlways/postgresql/data-types/timestamp, Oracle supports timestamps with nanosecond accuracy, and SQL Server supports 100 nanosecond accuracy. Since I use PostgreSQL personally, the best way to accomplish this (also suggested by the #postgresql channel on freenode) is to store the timestamp with nanoseconds (like 1343158283.880338907242) as a bigint and let the ORM (so a Python ORM) do all the conversion work. And yet again, having nanosecond support in datetime would make things much easier.

While writing this mail, Chris Lambacher answered with more data about nanosecond support in databases.

Best Regards,
--
Vincenzo Ampolo
http://vincenzo-ampolo.net
http://goshawknest.wordpress.com
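A sketch of that bigint round trip, under the assumption that the column simply holds integer nanoseconds since the epoch and a thin layer (in practice the ORM, here plain helper functions) converts at the boundary; all names are illustrative:

from datetime import datetime, timezone

def to_bigint(dt, extra_ns=0):
    """Pack an aware datetime plus its leftover nanoseconds (0-999) into one int."""
    whole_seconds = int(dt.replace(microsecond=0).timestamp())
    return whole_seconds * 10**9 + dt.microsecond * 1000 + extra_ns

def from_bigint(ns):
    """Unpack into (datetime truncated to microseconds, leftover nanoseconds)."""
    dt = datetime.fromtimestamp(ns // 10**9, tz=timezone.utc)
    return dt.replace(microsecond=(ns % 10**9) // 1000), ns % 1000

ns = 1343158283880338907          # a nanosecond timestamp, e.g. from a pcap file
dt, rest = from_bigint(ns)
assert to_bigint(dt, rest) == ns  # the round trip is lossless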

On Tue, Jul 24, 2012 at 8:25 PM, Vincenzo Ampolo <vincenzo.ampolo@gmail.com> wrote:
Stop brownnosing already. :-) If you'd followed python-dev you'd have known I read it.
Also, this use case is unlike the PEP 410 use case, because the timestamps there use a numeric type, not datetime (and that was separately argued).
So what functionality specifically do you require? You speak in generalities but I need specifics.
As you may imagine, using that approach in a web application is very limiting, since there is no strftime() in this custom class.
Apparently you didn't need it? :-) Web frameworks usually have their own date/time formatting anyway.
I believe you.
How so, given that the database you use doesn't support it?
While writing this mail, Chris Lambacher answered with more data about nanosecond support in databases.
Thanks, Chris. TBH, I think that adding nanosecond precision to the datetime type is not unthinkable. You'll have to come up with some clever backward compatibility in the API though, and that will probably be a bit ugly (you'd have a microsecond parameter with a range of 0-1000000 and a nanosecond parameter with a range of 0-1000). Also the space it takes in memory would probably increase (there's no room for an extra 10 bits in the carefully arranged 8-byte internal representation). But let me be clear -- are you willing to help implement any of this? You can't just order a feature, you know... -- --Guido van Rossum (python.org/~guido)
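A sketch of the signature Guido outlines, keeping the existing microsecond field and adding a nanosecond field that carries only the three extra digits; the class name is hypothetical and this is not the actual datetime constructor:

class nanodatetime:
    """Hypothetical constructor shape for backward-compatible nanoseconds."""

    def __init__(self, year, month, day, hour=0, minute=0, second=0,
                 microsecond=0, nanosecond=0, tzinfo=None):
        # The existing field keeps its full range, so old callers are unaffected.
        if not 0 <= microsecond < 1000000:
            raise ValueError("microsecond must be in 0..999999")
        # The new field only holds the three digits below a microsecond.
        if not 0 <= nanosecond < 1000:
            raise ValueError("nanosecond must be in 0..999")
        self.microsecond, self.nanosecond = microsecond, nanosecond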

On 07/24/2012 08:47 PM, Guido van Rossum wrote:
So what functionality specifically do you require? You speak in generalities but I need specifics.
The ability to think of datetime.datetime as a flexible class that can give you the representation you need, when you need it. To be more specific, think about this case: a user selects the year, month, day, hour, minute, second, microsecond and nanosecond of a network event from a browser; the JavaScript code makes an AJAX call with the time in this format (a variant of ISO 8601): YYYY-MM-DDTHH:MM:SS.mmmmmmnnn (where nnn is the nanosecond part). The Python server takes that string, converts it to a datetime, does all the math with its data, and gives the output back, labeling the data with int(nano_datetime.strftime('MMSSmmmmmmnnn')) so I have a sequence number that JavaScript can sort and handle easily. It's this flexibility of conversion I'm talking about.
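A sketch of that round trip, parsing the fractional part by hand since strptime has no nanosecond directive; parse_nano_iso and its return convention are illustrative:

from datetime import datetime

def parse_nano_iso(s):
    """Parse 'YYYY-MM-DDTHH:MM:SS.mmmmmmnnn' into (datetime, nanoseconds 0-999)."""
    head, frac = s.split(".")
    dt = datetime.strptime(head, "%Y-%m-%dT%H:%M:%S")
    return dt.replace(microsecond=int(frac[:6])), int(frac[6:9])

dt, ns = parse_nano_iso("2012-07-24T17:58:03.880338907")
# The sortable sequence number int('MMSSmmmmmmnnn') from the mail:
seq = int("%02d%02d%06d%03d" % (dt.minute, dt.second, dt.microsecond, ns))
assert seq == 5803880338907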
Which is usually derived from Python's datetime, as in web2py ( http://web2py.com/books/default/chapter/29/6#Record-representation ), where timestamps are real Python datetime objects and it's the ORM's responsibility to find the right representation of that data at the database level. This leads, as you know, to one of the main advantages of any ORM: abstracting away the database layer and the SQL syntax. The same applies to another well-known framework, Django (your personal favorite :) ), in which DateTimeField ( https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.Da... ) is a date and time represented in Python by a datetime.datetime instance. We didn't need to build a webapp yet; I've been hired for it :) so I'll do so very soon. Unluckily, if datetime does not support nanoseconds, I cannot blame any ORM for not supporting them natively.
Isn't the job of an ORM to abstract away the actual database (relational or not) so that people who use the ORM do not care how data is represented behind it? If so, it's the ORM's job to figure out the best representation of the data on the given relational or non-relational database.
Sure, those are all open issues, but as soon as you are in favour of adding nanosecond support we can start addressing them. I'm sure there are other people here who would like to participate in those issues too.
But let me be clear -- are you willing to help implement any of this? You can't just order a feature, you know...
Of course. As I wrote in my second message in the issue ( http://bugs.python.org/issue15443#msg166333 ), I'm ready and excited to contribute to the Python core if I can.

Best Regards,
--
Vincenzo Ampolo
http://vincenzo-ampolo.net
http://goshawknest.wordpress.com

Am 25.07.2012 03:46, schrieb Guido van Rossum:
I'd vote for two separate numbers: the first similar to JDN (Julian Day Number [1]), the second for nanoseconds per day. 3600 * 1000000 fits nicely into an unsigned 32bit int. This approach has the neat benefit that we'd get rid of the time_t limitations and the year 2038 bug at once. IIRC datetime used to break for dates before 1970 on some systems because time_t was unsigned. Python could finally support dates BC!

JDN is widely used by astronomers and historians, supports a wide range of years, and can convert between calendar systems. Its day 0 is January 1, 4713 BC in the proleptic Julian calendar. The conversion between the Julian and Gregorian calendars makes JDN hard to use, though. Rata Die (January 1, 1 AD at midnight in the proleptic Gregorian calendar) sounds like a good idea.

People in need of a high precision timer should also consider TAI [2] instead of UTC, as TAI doesn't have leap seconds. DJB's daemontools specifies a tai64n log format [3] that is similar to your idea.

Christian

[1] http://en.wikipedia.org/wiki/Julian_Day_Number
[2] http://en.wikipedia.org/wiki/International_Atomic_Time
[3] http://cr.yp.to/daemontools/tai64n.html

On Wed, 25 Jul 2012 11:24:14 +0200 Christian Heimes <lists@cheimes.de> wrote:
But 24 * 3600 * 1e9 doesn't. Perhaps I didn't understand your proposal. Regards Antoine. -- Software development and contracting: http://pro.pitrou.net

Am 25.07.2012 13:48, schrieb Antoine Pitrou:
What the h... was I thinking? I confused nano with micro and forgot the hours; how embarrassing. :(

days
----
32bit signed integer: number of days since Jan 1, 1 AD in the proleptic Gregorian calendar (aka the modern civil calendar). That's Rata Die minus one day, since Rata Die defines Jan 1, 1 AD as day 1. This allows days between 5.8 million years in the past and 5.8 million years in the future ((1<<31) // 365.242 ~ 5879618).

nanoseconds
-----------
64bit signed or unsigned integer: more than enough for nanosecond granularity (47 bits); we could easily push it to picosecond resolution (57 bits) in the future.

Christian
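A sketch of this layout, noting that datetime.toordinal() is already Rata Die (Jan 1, 1 AD == day 1), so the day field is just toordinal() - 1; the encode helper is illustrative:

from datetime import datetime

NS_PER_DAY = 24 * 3600 * 10**9   # 86,400,000,000,000 -- the count that needs 47 bits

def encode(dt, extra_ns=0):
    """Split a datetime into (day count, nanoseconds within the day)."""
    days = dt.toordinal() - 1                        # fits a signed 32-bit int
    ns = ((dt.hour * 3600 + dt.minute * 60 + dt.second) * 10**6
          + dt.microsecond) * 1000 + extra_ns        # fits easily in 64 bits
    return days, ns

days, ns = encode(datetime(2012, 7, 25, 13, 48), extra_ns=907)
assert 0 <= ns < NS_PER_DAY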

Christian Heimes <lists@cheimes.de> wrote:
An alternate strategy might be to use tai64/tai64n/tai64na, which can represent any time over the course of a few hundred billion years to second/nanosecond/attosecond precision, respectively. They're well-defined, and there's a fair bit of software that can use or manipulate dates in these formats. tai64 is defined similarly to the proleptic Gregorian calendar in that it uses an idealized 24*60*60 second day, etc.

Charles
--
-----------------------------------------------------------------------
Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/
-----------------------------------------------------------------------
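A sketch of the external tai64n form as DJB documents it: '@', then 16 hex digits of 2**62 plus the second count, then 8 hex digits of nanoseconds. Note this sketch ignores the UTC-to-TAI leap second offset, which a real implementation must apply:

def tai64n_label(seconds, nanoseconds):
    """Format a (seconds, nanoseconds) pair as a tai64n text label."""
    return "@%016x%08x" % (2**62 + seconds, nanoseconds)

print(tai64n_label(1343158283, 880338907))
# -> '@40000000500ef80b3478e7db'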

On Wed, Jul 25, 2012 at 7:24 PM, Christian Heimes <lists@cheimes.de> wrote:
Alternatively, use Decimal as the internal representation (backed by cdecimal if additional speed is needed). However, rather than getting buried in the weeds right here:

1. For the reasons presented, I think it's worth attempting to define a common API that is based on datetime, but is tailored towards high precision time operations (at least using a different internal representation, perhaps supporting TAI).

2. I don't think the stdlib is the right place to define the initial version of this. It seems most sensible to first fork the pure Python version of datetime, figure out the details to get that working as a new distribution on PyPI, and then fork the C implementation to make the PyPI version faster. Assuming it can be completed in time, the revised API could then be brought back as a PEP (alternatively, depending on the details of the proposal, the use case may be deemed sufficiently rare that it is just kept as a specialist module on PyPI).

Cheers,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
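A quick illustration of the Decimal alternative Nick mentions: exact decimal seconds keep nanosecond differences without the rounding a binary float would introduce (the values are illustrative):

from decimal import Decimal

t1 = Decimal("1343158283.880338907")
t2 = t1 + Decimal("0.000000040")          # 40 ns later
assert t2 - t1 == Decimal("0.000000040")  # exact; no float rounding
assert Decimal(float(t1)) != t1           # a binary float cannot hold all the digits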

Am 25.07.2012 14:11, schrieb Nick Coghlan:
This is a great opportunity to implement two requests at once. Some people want high precision datetime objects while others would like to see a datetime implementation that works with dates BC.
2. I don't think the stdlib is the right place to define the initial version of this.
+1

On Wed, Jul 25, 2012 at 7:11 AM, Christian Heimes <lists@cheimes.de> wrote:
Beware, people requesting dates BC rarely know what they are asking for. (E.g. Jesus wasn't born on 12/25/0001.) The calendrical ambiguities are such that representing dates that far in the past is better left to a specialized class. Read the original discussions about the datetime type; it loses meaning for dates long ago even if it can represent them, but the choice was made to ignore these and to offer a uniform abstraction for 1 <= year <= 9999. TBH I'm more worried about years >= 10000. :-)
+1 -- --Guido van Rossum (python.org/~guido)

Am 25.07.2012 16:38, schrieb Guido van Rossum:
For starters. Calendars have more subtle edge cases; for example, TAI has a 10 second offset from UTC plus 15 leap seconds. Or the leap year errors in the Julian calendar, which are handled differently in the proleptic Julian calendar (the historical calendar had unsystematic leap years between 45 BC and 4 AD). The rotation velocity of the Earth isn't constant, either. It's a major PITB!
TBH I'm more worried about years >= 10000. :-)
Why live in the past? The future is ... err, the future! :) Christian

Hi all again,

I've been quite busy these days, but I collected all the suggestions about the proposal. Here is a small summary:

Christian Heimes:
  two numbers: Julian Day Number (Rata Die) as a 32 bit signed integer, plus nanoseconds in a day as a 64 bit signed or unsigned integer
  pro: fixes the 2038 bug
  cons: hard conversion to the Gregorian calendar

Charles Cazabon:
  use tai64/tai64n/tai64na
  pro: well defined, libraries available
  cons: ?

As ways to implement the idea, there is this advice:

Nick Coghlan:
  define a common API based on datetime, maybe using TAI
  fork the pure Python version of datetime, then fork the C implementation to make the PyPI version faster, then write a PEP

Guido van Rossum:
  must do: clever backward compatibility
  use as few bits as possible
  the stdlib is not the right place for a first implementation

Since I'm not a big expert on calendars and date representation, I'm going to study the Julian calendar and the TAI representation. On a first read from Wikipedia, the TAI solution looks very promising. As for ways to implement the idea, I also believe it's better to have a pure Python implementation first (so it can be used on Python 2.x and distributed on PyPI) and then a Python 3.x C implementation and a PEP submission.

I'm open to any other idea/advice. If there are other people who would like to implement this with me, just write me a mail. Thank you.

Best Regards,
--
Vincenzo Ampolo
http://vincenzo-ampolo.net
http://goshawknest.wordpress.com

On Wed, Jul 25, 2012 at 9:11 PM, Christian Heimes <lists@cheimes.de> wrote:
Back when the datetime library was being designed, a limiting factor was the size of the pickle (for reasons that I think no longer apply). Support for the is_dst flag was never in there, only because the extra single bit required would have overflowed the pickle size limit. If API changes are being considered, please consider adding this bit back to match the standard libraries. This will let me make the pytz timezone library's API saner, and allow Python to do wall-clock datetime arithmetic without ambiguous cases. -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
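The ambiguity Stuart is talking about, illustrated with pytz's existing workaround (the example date is the 2012 US fall-back transition):

from datetime import datetime
import pytz

eastern = pytz.timezone("US/Eastern")
naive = datetime(2012, 11, 4, 1, 30)             # this wall-clock time occurs twice
before = eastern.localize(naive, is_dst=True)    # 01:30 EDT
after = eastern.localize(naive, is_dst=False)    # 01:30 EST, one hour later
assert before.utcoffset() != after.utcoffset()

# With is_dst=None, pytz refuses to guess:
# eastern.localize(naive, is_dst=None) raises pytz.exceptions.AmbiguousTimeError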

It can't be *that* easy. DST never is... For one, the dst flag is two bits -- it can be on, off, or undefined. Also, it should probably only apply when a tzinfo is present. I don't recall that pickling ever was the reason, but it could have been the size of the in-memory representation (for reasons I can't fully recall, we were *very* concerned about the memory size of datetime objects). Anyway, I don't want to be the limiting factor here, and I think this (as well as nanoseconds) should be considered, but I don't want to have to hand-hold the design. A PEP is in order. --Guido
-- --Guido van Rossum (python.org/~guido)
