Reduce platform dependence of date and time related functions

The practice of using OS functions for time handling has its worst effects on Windows, where many functions are unable to process times from before 1970-01-01 even though there is no reason for Python to have such a limitation. It also results in uneven support for strftime specifiers. Some of these functions also suffer from the Year 2038 problem on OSes with a 32-bit time_t type.
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module, and additionally the asctime, strftime, strptime, and gmtime functions in the time module, and calendar.timegm. Unfortunately, functions dealing with local time stamps in the system's idea of local time are still dependent on the platform's C library functions (localtime, mktime, ctime).
Or, if this is not practical, supplying alternate implementations of the relevant C functions, and calling these instead wherever these are used. If it is practical to do so, these functions should use python integers as the type for timestamps; if not, they should use 64-bit integers in preference to the platform time_t.
Is it reasonable to expose the possibility of an epoch other than 1970 (or of timestamps that handle leap seconds in a different manner than POSIX) at a python level? Even if such a platform ever comes to be supported, it could be done with a layer that hides these differences.
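
As an illustration of the range argument, here is a minimal sketch of a gmtime-like conversion done entirely with Python integers, so it never touches the platform time_t. The name gmtime_portable and the plain tuple it returns are assumptions made for this example, not an existing API. The day-to-date step uses the commonly cited "civil from days" arithmetic for the proleptic Gregorian calendar, and timestamps are taken to be POSIX-style seconds since 1970 with no leap seconds.

```python
import math


def civil_from_days(days):
    """Convert days since 1970-01-01 to (year, month, day), proleptic Gregorian."""
    z = days + 719468                      # shift epoch to 0000-03-01
    era = z // 146097                      # 400-year eras; floor division handles negatives
    doe = z - era * 146097                 # day of era, 0..146096
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365  # year of era, 0..399
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)                  # day of March-based year
    mp = (5 * doy + 2) // 153              # March-based month, 0..11
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def gmtime_portable(timestamp):
    """Hypothetical helper: (year, month, day, hour, minute, second, weekday).

    weekday follows the time module convention (Monday is 0).
    """
    seconds = math.floor(timestamp)        # round toward negative infinity
    days, rem = divmod(seconds, 86400)     # rem is always in 0..86399
    hour, rem = divmod(rem, 3600)
    minute, second = divmod(rem, 60)
    weekday = (days + 3) % 7               # 1970-01-01 was a Thursday (3)
    return civil_from_days(days) + (hour, minute, second, weekday)


print(gmtime_portable(-100000))
print(gmtime_portable(2**31))
```

Because every intermediate value is a Python int, the same function accepts values such as 2**63 without overflow; whether results that far out are meaningful is a separate question.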

On Mon, Sep 16, 2013 at 02:49:11PM -0400, random832@fastmail.us wrote:
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module [...] Or, if this is not practical, supplying alternate implementations of the relevant C functions
There is a well-known module mx.DateTime. It is not a drop-in replacement for module datetime, but it's quite good for its task and has excellent documentation. eGenix provides binaries for all major OSes and Python versions under a liberal open source license. Take a look at: http://www.egenix.com/products/python/mxBase/mxDateTime/ Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Mon, Sep 16, 2013 at 2:49 PM, <random832@fastmail.us> wrote:
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module
We already have that in python 3.x: http://bugs.python.org/issue7989 I believe it still has some platform dependencies through the time module. The idea to provide pure python implementation of the time module was proposed and rejected: http://bugs.python.org/issue9528 If you would like to improve cross-platform compatibility in this area, I would start with re-implementation of strftime(). See http://bugs.python.org/issue3173

On Mon, Sep 16, 2013, at 15:02, Alexander Belopolsky wrote:
On Mon, Sep 16, 2013 at 2:49 PM, <random832@fastmail.us> wrote:
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module
We already have that in python 3.x:
Sorry - it was unclear to me that simply clicking "browse" from http://hg.python.org/cpython/ did not result in browsing the latest source. (What branch is that? It's not "default")
The idea to provide pure python implementation of the time module was proposed and rejected:
This is a much more limited scope than that. I was merely proposing a limited set of functions - this could be implemented in the same way as the posix module, with a small pure python module that imports everything from the larger C module. These could simply be implemented in C instead - are we guaranteed to have a 64-bit integer type available? My main concern (for pure python vs C) was whether or not it is possible to work with greater than 32 bit values on a 32 bit system. If necessary we could do some of the work in double - the input is double, anyway, so it won't be outside that range. Do you have any thoughts on the rest of the proposal (that gmtime, timegm, and strftime should have unlimited - or at least not limited to low platform-specific limits like 1970 or 2038 - range, that python "epoch timestamps" should be defined as beginning in 1970 and not including leap seconds regardless of hypothetical [I don't believe any currently supported systems actually do, except to the extent that individual Unix sites can use so-called "right" tz data] systems that may have a time_t that behaves otherwise, that tm_gmtoff and tm_zone should always be provided)? One concern for strftime in particular is locale support. It may be difficult to query the relevant locale data in a portable manner.

On Tue, Sep 17, 2013 at 12:01 PM, <random832@fastmail.us> wrote:
On Mon, Sep 16, 2013, at 15:02, Alexander Belopolsky wrote:
On Mon, Sep 16, 2013 at 2:49 PM, <random832@fastmail.us> wrote:
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module
We already have that in python 3.x:
Sorry - it was unclear to me that simply clicking "browse" from http://hg.python.org/cpython/ did not result in browsing the latest source. (What branch is that? It's not "default")
Depends on the last commit (it's an hgweb thing; always specify the branch).
The idea to provide pure python implementation of the time module was proposed and rejected:
This is a much more limited scope than that. I was merely proposing a limited set of functions - this could be implemented in the same way as the posix module, with a small pure python module that imports everything from the larger C module. These could simply be implemented in C instead - are we guaranteed to have a 64-bit integer type available? My main concern (for pure python vs C) was whether or not it is possible to work with greater than 32 bit values on a 32 bit system. If necessary we could do some of the work in double - the input is double, anyway, so it won't be outside that range.
Do you have any thoughts on the rest of the proposal (that gmtime, timegm, and strftime should have unlimited - or at least not limited to low platform-specific limits like 1970 or 2038 - range, that python "epoch timestamps" should be defined as beginning in 1970 and not including leap seconds regardless of hypothetical [I don't believe any currently supported systems actually do, except to the extent that individual Unix sites can use so-called "right" tz data] systems that may have a time_t that behaves otherwise, that tm_gmtoff and tm_zone should always be provided)?
One concern for strftime in particular is locale support. It may be difficult to query the relevant locale data in a portable manner.
You also have the issue that if you port strftime then you lose the pure Python port of strptime: http://hg.python.org/cpython/file/default/Lib/_strptime.py

On Tue, Sep 17, 2013, at 12:19, Brett Cannon wrote:
You also have the issue that if you port strftime then you lose the pure Python port of strptime: http://hg.python.org/cpython/file/default/Lib/_strptime.py
Why would that make you lose that? I'm not sure I understand.

On Tue, Sep 17, 2013 at 12:27 PM, <random832@fastmail.us> wrote:
On Tue, Sep 17, 2013, at 12:19, Brett Cannon wrote:
You also have the issue that if you port strftime then you lose the pure Python port of strptime: http://hg.python.org/cpython/file/default/Lib/_strptime.py
Why would that make you lose that? I'm not sure I understand.
strptime is implemented using strftime to get the locale information. As you pointed out, getting the locale details is essentially not possible in a cross-platform way unless you use strptime or strftime, so you have to choose which is implemented in Python and relies on the other.
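
To make the dependency concrete, here is a rough sketch of how locale names can be recovered by probing the platform strftime, which is the general approach described above. The function names are made up for this example, and the real _strptime module gathers considerably more than this.

```python
import time


def locale_month_names():
    """Ask the platform strftime for the full name of each month."""
    names = []
    for month in range(1, 13):
        # Only tm_mon matters for %B; the other fields just need to be valid.
        probe = (2000, month, 1, 0, 0, 0, 0, 1, 0)
        names.append(time.strftime("%B", probe))
    return names


def locale_weekday_names():
    """Ask the platform strftime for the full name of each weekday, Monday first."""
    # 2001-01-01 was a Monday, so day 1 + i has tm_wday == i.
    return [
        time.strftime("%A", (2001, 1, 1 + i, 0, 0, 0, i, 1 + i, 0))
        for i in range(7)
    ]


print(locale_month_names())
print(locale_weekday_names())
```

The output depends on the LC_TIME locale in effect when the functions run, which is exactly why one of strftime or strptime has to remain the platform-backed source of truth in this scheme.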

On Tue, Sep 17, 2013 at 1:02 PM, Brett Cannon <brett@python.org> wrote:
As you pointed out, getting the locale details is essentially not possible in a cross-platform way unless you use strptime or strftime, so you have to choose which is implemented in Python and relies on the other.
What we can do is to implement "C" locale behavior. In fact, in many uses of strftime() its locale-dependence is a problem. I would much rather have strftime_l()-like function and "C" locale implemented in stdlib. This is somewhat similar to the situation we have with timezone support: include utc timezone and leave it to third parties to supply interfaces to platform tz databases.
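
A minimal sketch of what a "C"-locale-only formatter could look like, assuming a small subset of directives. The function name strftime_c and the choice of supported directives are assumptions for illustration; nothing like this is claimed to exist in the stdlib under that name.

```python
import re
from datetime import datetime

_DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]
_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"]


def strftime_c(dt, fmt):
    """Format dt using fixed "C" locale names, without calling the platform."""
    handlers = {
        "Y": lambda: "%04d" % dt.year,
        "m": lambda: "%02d" % dt.month,
        "d": lambda: "%02d" % dt.day,
        "H": lambda: "%02d" % dt.hour,
        "M": lambda: "%02d" % dt.minute,
        "S": lambda: "%02d" % dt.second,
        "j": lambda: "%03d" % dt.timetuple().tm_yday,
        "A": lambda: _DAYS[dt.weekday()],
        "a": lambda: _DAYS[dt.weekday()][:3],
        "B": lambda: _MONTHS[dt.month - 1],
        "b": lambda: _MONTHS[dt.month - 1][:3],
        "%": lambda: "%",
    }

    def replace(match):
        code = match.group(1)
        if code not in handlers:
            raise ValueError("unsupported directive: %" + code)
        return handlers[code]()

    return re.sub(r"%(.)", replace, fmt)


print(strftime_c(datetime(1969, 12, 30, 20, 13, 20), "%a, %d %b %Y %H:%M:%S"))
```

Since no platform call is involved, the result is the same on every OS and for every year that datetime itself supports.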

On Sep 17, 2013, at 10:41, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Tue, Sep 17, 2013 at 1:02 PM, Brett Cannon <brett@python.org> wrote:
As you pointed out, getting the locale details is essentially not possible in a cross-platform way unless you use strptime or strftime, so you have to choose which is implemented in Python and relies on the other.
What we can do is to implement "C" locale behavior. In fact, in many uses of strftime() its locale-dependence is a problem.
But in many cases it's useful. And the platform doesn't give us any way to get enough information about the locale to implement it ourselves. It's the same reason we have naive local times--local times are useful, the platform doesn't give us enough information about the local timezone, so we have to use what it gives us.
I would much rather have strftime_l()-like function and "C" locale implemented in stdlib.
I agree that having both would be useful. If you're suggesting renaming platform-dependent locale-handling strftime to strftime_l, and adding a new "C"-locale-only strftime, I don't like the naming. The function that acts just like the POSIX function strftime, and like the Python function in every version up to now, should be called strftime; give the new function a different name instead. Otherwise, I can't see a problem.

On Thu, Sep 19, 2013 at 12:07 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
If you're suggesting renaming platform-dependent locale-handling strftime to strftime_l, ...
I was thinking of changing datetime.strftime(fmt) signature to strftime(fmt, locale=None) with default behavior being the same as now and d.strftime(fmt, "C") invoking new internal C-locale implementation.

On Sep 19, 2013, at 10:57, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Thu, Sep 19, 2013 at 12:07 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
If you're suggesting renaming platform-dependent locale-handling strftime to strftime_l, ...
I was thinking of changing datetime.strftime(fmt) signature to strftime(fmt, locale=None) with default behavior being the same as now and d.strftime(fmt, "C") invoking new internal C-locale implementation.
But that API implies that you could call, e.g., d.strftime(fmt, "pt_BR"), which I assume isn't something anyone is planning on implementing.

On Thu, Sep 19, 2013, at 18:00, Andrew Barnert wrote:
But that API implies that you could call, e.g., d.strftime(fmt, "pt_BR"), which I assume isn't something anyone is planning on implementing.
Well, you could implement it by acquiring the GIL, setting the locale (putenv + setlocale), calling the platform strftime, and then resetting the locale afterward - all while locked, to prevent exposing the temporary strftime change to other code. (This also suggests a way to implement a tzinfo object in terms of native timezones.)
Long-term it would be nice to have python ship its own locale data, and/or to acquire platform-specific locale data via GetLocaleInfo[Ex] on Windows and nl_langinfo on POSIX OSes where it is provided.
(Note that the latter would still require stopping everything and setting the global locale to acquire the data, but since you've got to translate a locale name to a handle to use GetLocaleInfo or xlocale, it'd make sense to encapsulate this in a locale object which does all this upon being created, with platform strftime as a fallback.) The issue with using platform strftime to populate things in advance is that %O is difficult and %E may be intractable.
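
A Python-level approximation of the set, call, and restore idea might look like the sketch below. It is only an approximation: Python code cannot hold the GIL across these calls, so a threading.Lock stands in for it, and that lock protects only callers that go through this helper. The names temporary_time_locale and strftime_in_locale are hypothetical.

```python
import locale
import threading
import time
from contextlib import contextmanager

# Serializes users of this helper only; other threads that call strftime
# directly could still observe the temporary locale.
_locale_lock = threading.Lock()


@contextmanager
def temporary_time_locale(name):
    """Switch LC_TIME to `name` for the duration of the block, then restore it."""
    with _locale_lock:
        saved = locale.setlocale(locale.LC_TIME)   # query the current setting
        try:
            locale.setlocale(locale.LC_TIME, name)  # raises locale.Error if unknown
            yield
        finally:
            locale.setlocale(locale.LC_TIME, saved)


def strftime_in_locale(fmt, t, name):
    """Call the platform strftime with LC_TIME temporarily set to `name`."""
    with temporary_time_locale(name):
        return time.strftime(fmt, t)


print(strftime_in_locale("%A", time.gmtime(0), "C"))
```

Which locale names are accepted (for example "pt_BR" versus a Windows-style name) depends entirely on the platform, so this does nothing to make locale identifiers portable.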

On Sep 20, 2013, at 7:46, random832@fastmail.us wrote:
On Thu, Sep 19, 2013, at 18:00, Andrew Barnert wrote:
But that API implies that you could call, e.g., d.strftime(fmt, "pt_BR"), which I assume isn't something anyone is planning on implementing.
Well, you could implement it by acquiring the GIL, setting the locale (putenv + setlocale), calling the platform strftime, and then resetting the locale afterward - all while locked, to prevent exposing the temporary strftime change to other code. (This also suggests a way to implement a tzinfo object in terms of native timezones)
OK, yes, you could do that, but are you actually proposing that the stdlib should do so? If not, it's a misleading API. If so, it's a much larger proposal than what we initially started with. And I think providing C-locale str[fp]time with very wide, platform-independent limits is a useful idea even without this much more radical idea.
Long-term it would be nice to have python ship its own locale data, and/or to acquire platform-specific locale data via GetLocaleInfo[Ex] on windows and nl_langinfo on POSIX OSes where it is provided.
IIRC, OS X has a different set of (CoreFoundation-based?) APIs that take the system preferences into account as well as the locale setting, which might be worth using if you're designing the ultimate locale handling system; otherwise your apps won't act like native Cocoa apps. For that matter, both Windows and OS X have more than one notion of the local date format (long vs. short names, etc.); do you want to expose that as well, or just stick to the POSIX-like subset of each platform's capabilities?
(Note that the latter still would require stopping everything and setting the global locale to acquire the data, but since you've got to translate a locale name to a handle to use GetLocaleInfo or xlocale, it'd make sense to encapsulate this in a locale object which does all this upon being created. With platform strftime as a fallback. The issue with using platform strftime to populate things in advance is that %O is difficult and %E may be intractable.

On Fri, Sep 20, 2013, at 11:59, Andrew Barnert wrote:
OK, yes, you could do that, but are you actually proposing that the stdlib should do so? If not, it's a misleading API. If so, it's a much larger proposal than what we initially started with. And I think providing C-locale str[fp]time with very wide, platform-independent limits is a useful idea even without this much more radical idea.
We've basically got five "kinds" of locale we are talking about:
1. "C" locale - this is the easiest one to implement in a platform-independent way, but probably the least useful (if you're not intending locale-specific display, you should probably be using numeric values)
2. Current platform locale, including all the subtleties like user preferences you mentioned, when available. This is what we support now.
3. Specified platform locale (e.g. pt_BR, and we may still want to translate from a single format rather than needing to specify 0x0416 or "PTB" on Windows)
4. Platform-independent version of a specified locale, using e.g. CLDR. This is the second-easiest to implement in a platform-independent way.
5. Platform-independent version of user's current locale. There are limits to what can be achieved with this, for example Windows (and maybe Mac OS - I know the pre-OSX versions did) lets you set certain things individually. For example, I have my short date format set to yyyy-MM-dd, but otherwise I'm in the en-US locale.
Anyway, this should be separate from the discussion of removing the limitations of the platform code. Locale-specific data can be acquired by calling the platform's strftime for a platform-independent strftime just as it's done for strptime now - and we'd need it as a fallback anyway. You can reduce the impact of platform's range limitations and incompatible repertoire of format specifiers by doing them individually, with a "safe" value for the year if needed, rather than throwing the whole format string to the platform function.
For local time on Windows, incidentally, we could extend the usable range by calling SystemTimeToTzSpecificLocalTime, but that loses the ability to use MSVCRT's version of the POSIX TZ variable.
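
One way to read the "safe" year suggestion is sketched below: expand %Y ourselves, then hand the rest of the format to the platform with a stand-in year that has the same calendar layout (same leap status and same weekday for 1 January), so that weekday and day-of-year directives still come out right. The helper names and the 1971..1998 window are assumptions for this example, and only %Y is handled; directives such as %y, %C, %c or %x would still show the stand-in year.

```python
import calendar
import re
import time
from datetime import date


def safe_year(year):
    """Return a year in 1971..1998 whose calendar matches `year`."""
    target = (calendar.isleap(year), date(year, 1, 1).weekday())
    for candidate in range(1971, 1999):
        if (calendar.isleap(candidate), date(candidate, 1, 1).weekday()) == target:
            return candidate
    # 28 consecutive years in this range cover all 14 possible calendars.
    raise AssertionError("no matching year found")


def strftime_wide(d, fmt):
    """Format date d, keeping the platform away from the real year."""
    def expand(match):
        if match.group(1) == "Y":
            return "%04d" % d.year        # digits only, so no new '%' appears
        return match.group(0)             # leave every other directive (and %%) alone

    platform_fmt = re.sub(r"%(.)", expand, fmt)
    stand_in = d.replace(year=safe_year(d.year))
    return time.strftime(platform_fmt, stand_in.timetuple())


print(strftime_wide(date(5, 3, 1), "%Y-%m-%d"))
print(strftime_wide(date(1969, 12, 30), "%Y %w"))
```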

On Tue, Sep 17, 2013 at 12:01 PM, <random832@fastmail.us> wrote:
Do you have any thoughts on the rest of the proposal (that gmtime, timegm, and strftime should have unlimited - or at least not limited to low platform-specific limits like 1970 or 2038 - range, that python "epoch timestamps" should be defined as beginning in 1970 and not including leap seconds regardless of hypothetical [I don't believe any currently supported systems actually do, except to the extent that individual Unix sites can use so-called "right" tz data] systems that may have a time_t that behaves otherwise, that tm_gmtoff and tm_zone should always be provided)?
You should review what's new in 3.x documents. Many of the features that you ask for have already been implemented.

On Tue, Sep 17, 2013, at 12:23, Alexander Belopolsky wrote:
You should review what's new in 3.x documents. Many of the features that you ask for have already been implemented.
To what are you referring? The 3.4 "What's New" mentions no changes related to the time module. 3.3 mentions only new functions unrelated to time conversions. The change mentioned in 3.2 does not fix limitations caused by the platform. 32-bit platforms are still limited by the range of time_t for gmtime [and e.g. datetime.fromtimestamp], and MSVC, while having a 64-bit time_t, is limited to positive values (and arbitrarily imposes the same limitation on functions that accept a struct tm, rejecting any time that would, interpreted as local time, result in a value before 1970-01-01 00:00:00 GMT). 3.1 and 3.0 mention no changes to the time module.
All of the issues I mentioned apply to 3.3 (You may not have noticed the range issue as it may not apply to your platform, and by "should always be provided" I meant _always_, even if the platform doesn't provide them - they can be populated from timezone/altzone and tzname in that case), and the epoch/leap second thing is still clearly present in the 3.4 docs.
I personally confirmed every single issue I mentioned except for the one about a pure-python implementation of datetime (which was because I was misled by the web hg browser), and except for the year 2038 limitation that does not apply on this system, on this version:
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32

On Tue, Sep 17, 2013 at 12:49 PM, <random832@fastmail.us> wrote:
32-bit platforms are still limited by the range of time_t for gmtime [and e.g. datetime.fromtimestamp],
datetime.fromtimestamp() is not the same as gmtime. You should use datetime.utcfromtimestamp() which is only limited by supported date range (years 1-9999).

On Tue, Sep 17, 2013, at 13:29, Alexander Belopolsky wrote:
On Tue, Sep 17, 2013 at 12:49 PM, <random832@fastmail.us> wrote:
32-bit platforms are still limited by the range of time_t for gmtime [and e.g. datetime.fromtimestamp],
datetime.fromtimestamp() is not the same as gmtime. You should use datetime.utcfromtimestamp() which is only limited by supported date range (years 1-9999).
fromtimestamp(timestamp, timezone.utc). And anyway, I was listing it as _another example_ of a function in datetime which is limited by the range of time_t, not as one that is somehow "the same as" gmtime. And even if you want to play this game, you are WRONG WRONG WRONG about utcfromtimestamp:
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import *
>>> datetime.utcfromtimestamp(-100000) # should be 1969-12-30 20:13:20
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument
>>> datetime.utcfromtimestamp(2**63)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: timestamp out of range for platform time_t
(I don't care, per se, about 300 billion years from now, but I am 99% certain I'd get the same result for the latter with 2**31 on 32-bit Unix. This was to illustrate that it requires it to be in the range of the platform time_t type.) I feel like you're being deliberately obtuse at this point.

On Tue, Sep 17, 2013 at 3:08 PM, <random832@fastmail.us> wrote:
fromtimestamp(timestamp, timezone.utc).
And anyway, I was listing it as _another example_ of a function in datetime which is limited by the range of time_t, not as one that is somehow "the same as" gmtime. And even if you want to play this game, you are WRONG WRONG WRONG about utcfromtimestamp:
I would say this is a bug. Is fromtimestamp(timestamp, timezone.utc) similarly affected? Please submit a bug report.
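
Independently of how the report is handled, the conversion can be done without going through the platform time_t at all by adding a timedelta to a fixed epoch. This is a small sketch of that approach; the helper name utc_from_timestamp is made up for this example, and the result is limited only by datetime's own range (years 1-9999).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_from_timestamp(ts):
    """Convert seconds since 1970 (no leap seconds) to an aware UTC datetime."""
    # Pure datetime arithmetic: no gmtime() call, so negative values and
    # values past 2038 behave the same on every platform.
    return EPOCH + timedelta(seconds=ts)


print(utc_from_timestamp(-100000))
print(utc_from_timestamp(2**31))
```

A value such as 2**63 still raises OverflowError here, but that comes from the limits of timedelta and datetime rather than from the C library.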

From the sublime to the, er ... plebeian? Just an idea for Python 4: Is there any good reason to have separate time and datetime modules? I sometimes find myself spinning my wheels converting between a format supported by one and a format supported by the other.
Rob Cliffe
On 16/09/2013 19:49, random832@fastmail.us wrote:
The practice of using OS functions for time handling has its worst effects on Windows, where many functions are unable to process times from before 1970-01-01 even though there is no reason for Python to have such a limitation. It also results in uneven support for strftime specifiers. Some of these functions also suffer from the Year 2038 problem on OSes with a 32-bit time_t type.
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module, and additionally the asctime, strftime, strptime, and gmtime functions in the time module, and calendar.timegm. Unfortunately, functions dealing with local time stamps in the system's idea of local time are still dependent on the platform's C library functions (localtime, mktime, ctime)
Or, if this is not practical, supplying alternate implementations of the relevant C functions, and calling these instead wherever these are used. If it is practical to do so, these functions should use python integers as the type for timestamps; if not, they should use 64-bit integers in preference to the platform time_t.
Is it reasonable to expose the possibility of an epoch other than 1970 (or of timestamps that handle leap seconds in a different manner than POSIX) at a python level? Even if such a platform ever comes to be supported, it could be done so with a layer that hides these differences.

Rob Cliffe <rob.cliffe@btinternet.com> writes:
From the sublime to the, er ... plebeian?
When changing the subject of discussion, please change the Subject field accordingly.
Just an idea for Python 4: Is there any good reason to have separate time and datetime modules?
That's how it's been for a long time. There is now a lot of existing Python code that uses those two modules as they are. This would not be a good reason for *introducing* such a pair of modules with confusingly-different APIs. But that's not the decision we face today, many years after those modules entered the standard library. Changes to the standard library API, especially for modules that are in long-established use, must be considered conservatively. And that *is* a good reason to continue having ‘time’ and ‘datetime’ modules which both support the existing behaviour.
I sometimes find myself spinning my wheels converting between a format supported by one and a format supported by the other.
That's a different matter, and does not challenge the continued existence of separate ‘time’ and ‘datetime’ modules. The ‘datetime’ module has grown functionality for working with the data types of the ‘time’ module. What conversions are you lacking from the current ‘datetime’ <URL:http://docs.python.org/3/library/datetime.html>?
-- Ben Finney

On Tue, Sep 17, 2013 at 09:14:11AM +1000, Ben Finney wrote:
Rob Cliffe <rob.cliffe@btinternet.com> writes:
Just an idea for Python 4: Is there any good reason to have separate time and datetime modules?
That's how it's been for a long time. There is now a lot of existing Python code that uses those two modules as they are. [...] Changes to the standard library API, especially for modules that are in long-established use, must be considered conservatively. And that *is* a good reason to continue having ‘time’ and ‘datetime’ modules which both support the existing behaviour.
Agreed. But I suggest to Rob, or anyone else who likes the idea of merging the two modules and is willing to do the work, to start off by creating an interface module that wraps the two. Call it (for lack of a better name) "mytime". When the "mytime" module is sufficiently mature, which may require publishing it on PyPI for the public to use, it could potentially be added to the standard library as a high level interface to the lower-level time and datetime modules. That doesn't need to wait for Python 4000. I'm +0 on the general idea. I don't use either module enough to be annoyed by there being two of them. (Three if you include calendar.) -- Steven
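
As a starting point for such an interface module, the sketch below shows the kind of conversion helpers it might collect. The function names are placeholders in the spirit of the "mytime" suggestion, and the helpers assume UTC values throughout; a struct_time with a leap second (tm_sec of 60 or 61) would be rejected by datetime.

```python
import calendar
import time
from datetime import datetime, timezone


def struct_time_to_datetime(st):
    """Treat a UTC struct_time (e.g. from time.gmtime) as an aware datetime."""
    return datetime(*st[:6], tzinfo=timezone.utc)


def datetime_to_struct_time(dt):
    """Convert an aware datetime to a UTC struct_time."""
    return dt.utctimetuple()


def struct_time_to_timestamp(st):
    """Convert a UTC struct_time to seconds since 1970."""
    return calendar.timegm(st)


st = time.gmtime(0)
dt = struct_time_to_datetime(st)
print(dt, struct_time_to_timestamp(datetime_to_struct_time(dt)))
```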

I have an addition to this proposal: struct_time should always provide tm_gmtoff and tm_zone, gmtime should populate them with 0 and GMT*, and if the platform does not provide values localtime should populate them with timezone or altzone and values from tzname depending on if isdst is true after calling the platform localtime function. *The current practice of the reference code of the "tz" project and of at least glibc is to use GMT. If anyone has an argument that it should be UTC or some other value on some platforms, please speak up.
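
The fallback described for localtime could be sketched roughly as follows. The helper name gmtoff_and_zone is an assumption for this example. Note that time.timezone and time.altzone count seconds west of UTC, so they are negated to get a tm_gmtoff-style offset, and that these globals describe the zone's current rules rather than the rules in force at an arbitrary historical timestamp.

```python
import time


def gmtoff_and_zone(timestamp=None):
    """Derive (gmtoff, zone) from the time module's global variables."""
    lt = time.localtime(timestamp)          # None means "now"
    if lt.tm_isdst > 0:
        return -time.altzone, time.tzname[1]
    return -time.timezone, time.tzname[0]


print(gmtoff_and_zone())
```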

2013/9/17 <random832@fastmail.us>:
I have an addition to this proposal: struct_time should always provide tm_gmtoff and tm_zone, gmtime should populate them with 0 and GMT*, and if the platform does not provide values localtime should populate them with timezone or altzone and values from tzname depending on if isdst is true after calling the platform localtime function.
In Python, "unknown" is usually written None. It's safer than filling the structure with invalid values. Victor

On Tue, Sep 17, 2013, at 9:31, Victor Stinner wrote:
2013/9/17 <random832@fastmail.us>:
I have an addition to this proposal: struct_time should always provide tm_gmtoff and tm_zone, gmtime should populate them with 0 and GMT*, and if the platform does not provide values localtime should populate them with timezone or altzone and values from tzname depending on if isdst is true after calling the platform localtime function.
In Python, "unknown" is usually written None. It's safer than filling the structure with invalid values.
They're not unknown. The values are provided by the system in global variables. If timezone, altzone, and tzname should not be used, then they should not be provided. You can also determine gmtoff empirically by calling timegm and subtracting the original timestamp from the result. Or you could look at the seconds, minutes, hours, year, and yday members after calling both gmtime and localtime in the first place.
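
The empirical approach fits in one line: interpret the local broken-down time as if it were UTC and compare with the original timestamp. The function name empirical_gmtoff is made up for this sketch.

```python
import calendar
import time


def empirical_gmtoff(timestamp):
    """Offset of local time from UTC, in seconds east, at the given timestamp."""
    # timegm() reads the local fields as UTC, so the result is shifted from
    # the true timestamp by exactly the local offset.
    return calendar.timegm(time.localtime(timestamp)) - int(timestamp)


print(empirical_gmtoff(time.time()))
```

Unlike the timezone/altzone globals, this reflects whatever rules the platform localtime applied at that particular timestamp.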

On 09/17/2013 01:58 PM, random832 wrote:
On Tue, Sep 17, 2013, at 9:31, Victor Stinner wrote:
In Python, "unknown" is usually written None. It's safer than filling the structure with invalid values.
You can also determine gmtoff empirically by calling timegm and subtracting the original timestamp from the result. Or you could look at the seconds, minutes, hours, year, and yday members after calling both gmtime and localtime in the first place.
Is timegm/gmtime provided and consistent across all Python platforms? -- ~Ethan~

On Tue, Sep 17, 2013, at 17:15, Ethan Furman wrote:
Is timegm/gmtime provided and consistent across all Python platforms?
Part of what I was proposing was _to_ provide a consistent implementation - there's no reason (if we define timestamps as being objectively based in 1970 and having no leap seconds) that it couldn't be provided in python itself instead of using the system's version.
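
Under that definition (epoch fixed at 1970, no leap seconds), the inverse of gmtime is plain integer arithmetic. The following is a sketch only; timegm_portable is a hypothetical name, and the day count uses the commonly cited "days from civil" formula for the proleptic Gregorian calendar, so it has no year limit.

```python
def days_from_civil(year, month, day):
    """Days from 1970-01-01 to the given proleptic Gregorian date."""
    y = year - (1 if month <= 2 else 0)     # treat Jan/Feb as end of previous year
    era = y // 400                          # floor division handles negative years
    yoe = y - era * 400                     # year of era, 0..399
    mp = month - 3 if month > 2 else month + 9   # March-based month, 0..11
    doy = (153 * mp + 2) // 5 + day - 1     # day of March-based year
    doe = yoe * 365 + yoe // 4 - yoe // 100 + doy
    return era * 146097 + doe - 719468


def timegm_portable(tm):
    """Hypothetical helper: seconds since 1970 for a UTC (Y, M, D, h, m, s) tuple."""
    year, month, day, hour, minute, second = tm[:6]
    return days_from_civil(year, month, day) * 86400 + hour * 3600 + minute * 60 + second


print(timegm_portable((1969, 12, 30, 20, 13, 20)))
print(timegm_portable((2038, 1, 19, 3, 14, 8)))
```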

On 18 September 2013 11:30, <random832@fastmail.us> wrote:
On Tue, Sep 17, 2013, at 17:15, Ethan Furman wrote:
Is timegm/gmtime provided and consistent across all Python platforms?
Part of what I was proposing was _to_ provide a consistent implementation - there's no reason (if we define timestamps as being objectively based in 1970 and having no leap seconds) that it couldn't be provided in python itself instead of using the system's version.
Yeah, this is a similar change to the one that was made for math.c years ago - stepping up from merely relying on the system libraries to ensuring a consistent cross-platform experience. It's just a concern with initial development and long term maintenance effort, rather than a fundamental desire to expose the raw platform behaviour (there are *some* modules where we want to let developers have access to the underlying platform specific behaviour, but the datetime APIs aren't really one of them) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 18.09.2013 03:37, Nick Coghlan wrote:
On 18 September 2013 11:30, <random832@fastmail.us> wrote:
On Tue, Sep 17, 2013, at 17:15, Ethan Furman wrote:
Is timegm/gmtime provided and consistent across all Python platforms?
Part of what I was proposing was _to_ provide a consistent implementation - there's no reason (if we define timestamps as being objectively based in 1970 and having no leap seconds) that it couldn't be provided in python itself instead of using the system's version.
Yeah, this is a similar change to the one that was made for math.c years ago - stepping up from merely relying on the system libraries to ensuring a consistent cross-platform experience. It's just a concern with initial development and long term maintenance effort, rather than a fundamental desire to expose the raw platform behaviour (there are *some* modules where we want to let developers have access to the underlying platform specific behaviour, but the datetime APIs aren't really one of them)
I wonder why you'd want to use Unix ticks (what datetime calls a timestamp) as basis for cross-platform date/time calculations. If you really need a time_t representation of date/time values, you're stuck with the platform dependent limitations anyway. The time C functions are useful to tap into the OS's time zone library, but time zone data changes regularly, so predictions that go even only a few years into the future are bound to fail for some zones. You can only reliably use UTC/GMT for absolute future date/time values. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 18 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-20: PyCon UK 2013, Coventry, UK ... 2 days to go 2013-09-28: PyDDF Sprint ... 10 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Wed, Sep 18, 2013, at 3:42, M.-A. Lemburg wrote:
I wonder why you'd want to use Unix ticks (what datetime calls a timestamp) as basis for cross-platform date/time calculations.
Because we've already got half a dozen APIs that use them. And there's no particular reason to consider it _worse_ than any other scalar time representation. If we were defining the library from scratch today, we could argue the merits of using days vs seconds vs microseconds as the unit, of 1970 vs 1904 vs 1600 vs 0000 for the epoch, and whether leap seconds should be supported. But we've already got APIs that use time_t (and all supported platforms define time_t as seconds since 1970)

On 18.09.2013 15:25, random832@fastmail.us wrote:
On Wed, Sep 18, 2013, at 3:42, M.-A. Lemburg wrote:
I wonder why you'd want to use Unix ticks (what datetime calls a timestamp) as basis for cross-platform date/time calculations.
Because we've already got half a dozen APIs that use them. And there's no particular reason to consider it _worse_ than any other scalar time representation.
If we were defining the library from scratch today, we could argue the merits of using days vs seconds vs microseconds as the unit, of 1970 vs 1904 vs 1600 vs 0000 for the epoch, and whether leap seconds should be supported. But we've already got APIs that use time_t (and all supported platforms define time_t as seconds since 1970)
Right, but those APIs are all limited to what the platform defines as time_t and, like you say, those values are often limited to certain ranges. If you want platform-independent representations, use one of the available conversion routines to turn the time_t values into e.g. datetime objects, and ideally convert the values to UTC to avoid time zone issues. Then use those objects for date/time calculations. time_t values are really not a good basis for doing date/time calculations. Ideally, they should only be used and regarded as containers holding a platform-dependent date/time value, nothing more. -- Marc-Andre Lemburg
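A minimal sketch of the workflow suggested above: convert the raw timestamp into an aware datetime in UTC once, then do all arithmetic on datetime and timedelta objects. The helper name to_utc_datetime and the sample timestamps are illustrative assumptions, not part of any proposal in this thread. Note that this still goes through datetime.fromtimestamp(), whose accepted range may be limited by the platform.

```python
from datetime import datetime, timedelta, timezone


def to_utc_datetime(ts):
    """Convert a POSIX timestamp to an aware datetime in UTC (hypothetical helper)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Two sample timestamps exactly one day apart (illustrative values).
start = to_utc_datetime(1379500000)
end = to_utc_datetime(1379586400)

# Arithmetic happens on datetime/timedelta objects, not on raw integers.
elapsed = end - start
deadline = start + timedelta(days=30)

print(elapsed)
print(deadline.isoformat())
```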

On Wed, Sep 18, 2013, at 9:34, M.-A. Lemburg wrote:
Right, but those APIs are all limited to what the platform defines as time_t and, like you say, those values are often limited to certain ranges.
We're going around in circles. I'm proposing _removing_ those limitations, so that for example code written for Unix systems (that assumes it can use negative values before 1970) will work on Windows, and code written for 64-bit systems will work on systems whose native time_t is 32 bits.

It occurs to me that you might have misunderstood me. By "APIs" I was not referring to the platform functions themselves (which, obviously, are limited to what the platform's type can represent, and sometimes impose arbitrary limits on top of that), I was talking about datetime.fromtimestamp, the various functions in the time module, calendar.timegm, os.stat, and so on. There's no reason _those_ should be limited to what the platform defines.

Just because "seconds since 1970" was invented by a platform does not mean it should be considered to be a platform-dependent representation. There's nothing _wrong_ with it as a representation of UTC, except for the fact that it can't represent leap seconds, and I suspect a lot of other things break in the presence of leap seconds anyway. The fact that timedelta is defined as a days/seconds combination, for example. In the presence of leap seconds, it shouldn't be possible to normalize them any more than if there were a months or years field.
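To illustrate the timedelta point: the datetime module normalizes every timedelta on the assumption that a day is exactly 86400 seconds, which only holds if leap seconds are ignored. A short sketch (the value 90061 is an arbitrary example):

```python
from datetime import timedelta

# 90061 seconds is stored as 1 day plus 3661 seconds.
delta = timedelta(seconds=90061)
print(delta.days, delta.seconds)

# One day and 86400 seconds compare equal, so a day that contains
# a leap second (86401 SI seconds) cannot be told apart.
same = timedelta(days=1) == timedelta(seconds=86400)
print(same)
```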
If you want platform independent representations, use one of the available conversion routines to turn the time_t values into e.g. datetime objects and ideally convert the values to UTC to avoid time zone issues. Then use those objects for date/time calculations.
time_t values are really not a good basis for doing date/time calculations. Ideally, they should only be used and regarded as containers holding a platform dependent date/time value, nothing more.
That ship sailed long ago. This isn't a Python 4000 thread; we're talking about the API we have, not the one we want.

On Wed, Sep 18, 2013 at 1:20 PM, <random832@fastmail.us> wrote:
We're going around in circles. I'm proposing _removing_ those limitations, so that for example code written for Unix systems (that assumes it can use negative values before 1970) will work on Windows, and code written for 64-bit systems will work on systems whose native time_t is 32 bits.
That's a sign that this discussion should move to the tracker where a concrete patch can be proposed and discussed. There is at least one proposal that seems to be controversial: remove platform-dependent code from datetime.utcfromtimestamp(). The change is trivial:

    def utcfromtimestamp(seconds):
        return datetime(1970, 1, 1) + timedelta(seconds=seconds)

I will gladly apply such patch once it is complete with tests and C code.

The case for changing time.gmtime() is weaker. We would have to add additional dependency of time module on datetime or move or duplicate a sizable chunk of C code. If someone wants to undertake this project, I would like to see an attempt to remove circular dependency between time and datetime modules rather than couple the two modules even more tightly.
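For comparison, a rough sketch of what a platform-independent gmtime() built on the same epoch arithmetic might look like. The name portable_gmtime is hypothetical and this is not a proposed patch; it only shows why such a function would lean on datetime, which is the coupling discussed above. It accepts values before 1970 and beyond the 32-bit time_t range.

```python
import time
from datetime import datetime, timedelta

_EPOCH = datetime(1970, 1, 1)


def portable_gmtime(seconds):
    """Return a time.struct_time in UTC using only datetime arithmetic.

    Hypothetical helper: assumes the 1970 epoch and no leap seconds.
    """
    # utctimetuple() on a naive datetime forces tm_isdst to 0,
    # which is what time.gmtime() reports as well.
    return (_EPOCH + timedelta(seconds=seconds)).utctimetuple()


# One day before the epoch, and one second past the 32-bit limit.
print(portable_gmtime(-86400)[:6])
print(portable_gmtime(2**31)[:6])

# For a value every platform can handle, the fields should agree
# with the C-library-backed time.gmtime().
matches_platform = tuple(portable_gmtime(1379500000)) == tuple(time.gmtime(1379500000))
```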

On 18.09.2013 19:37, Alexander Belopolsky wrote:
On Wed, Sep 18, 2013 at 1:20 PM, <random832@fastmail.us> wrote:
We're going around in circles. I'm proposing _removing_ those limitations, so that for example code written for Unix systems (that assumes it can use negative values before 1970) will work on Windows, and code written for 64-bit systems will work on systems whose native time_t is 32 bits.
That's a sign that this discussion should move to the tracker where a concrete patch can be proposed and discussed. There is at least one proposal that seems to be controversial: remove platform-dependent code from datetime.utcfromtimestamp().
The change is trivial:
def utcfromtimestamp(seconds): return datetime(1970, 1, 1) + timedelta(seconds=seconds)
I will gladly apply such patch once it is complete with tests and C code.
If you do apply this change, you will have to clearly state that the datetime module's understanding of a timestamp may differ from the platform definition of Unix ticks.
The case for changing time.gmtime() is weaker. We would have to add additional dependency of time module on datetime or move or duplicate a sizable chunk of C code. If someone wants to undertake this project, I would like to see an attempt to remove circular dependency between time and datetime modules rather than couple the two modules even more tightly.
-1 on changing the time module APIs. People expect those to be wrappers of the C APIs and thus also expect these APIs to implement the platform specific behavior, e.g. supporting leap seconds with gmtime(). POSIX called for not supporting leap seconds in e.g. gmtime(), but they are part of the definition of GMT/UTC and it's possible to enable support for them: http://en.wikipedia.org/wiki/Leap_second Platform comparison: http://k5wiki.kerberos.org/wiki/Leap_second_handling That said, it's very rare to find a system that actually does not implement POSIX gmtime(). -- Marc-Andre Lemburg
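As a small illustration of the POSIX convention mentioned above, calendar.timegm() is pure arithmetic on the tuple fields, so the leap second inserted at the end of June 2012 maps to the same timestamp as the following midnight. The tuples below are hand-written examples; the last three fields are ignored by timegm().

```python
import calendar

# 2012-06-30 23:59:60 UTC was a real leap second.
leap_second = (2012, 6, 30, 23, 59, 60, 0, 0, 0)
next_midnight = (2012, 7, 1, 0, 0, 0, 0, 0, 0)

ts_leap = calendar.timegm(leap_second)
ts_next = calendar.timegm(next_midnight)

# Both instants collapse onto one POSIX timestamp.
print(ts_leap, ts_next, ts_leap == ts_next)
```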

On Tue, Sep 17, 2013 at 4:58 PM, <random832@fastmail.us> wrote:
You can also determine gmtoff empirically by calling timegm and subtracting the original timestamp from the result. Or you could look at the seconds, minutes, hours, year, and yday members after calling both gmtime and localtime in the first place.
How is this different from what we do in datetime.astimezone()?

    # Compute UTC offset and compare with the value implied
    # by tm_isdst. If the values match, use the zone name
    # implied by tm_isdst.
    delta = local - datetime(*_time.gmtime(ts)[:6])
    dst = _time.daylight and localtm.tm_isdst > 0
    gmtoff = -(_time.altzone if dst else _time.timezone)
    if delta == timedelta(seconds=gmtoff):
        tz = timezone(delta, _time.tzname[dst])
    else:
        tz = timezone(delta)

http://hg.python.org/cpython/file/default/Lib/datetime.py#l1500

On Tue, Sep 17, 2013, at 17:21, Alexander Belopolsky wrote:
On Tue, Sep 17, 2013 at 4:58 PM, <random832@fastmail.us> wrote:
You can also determine gmtoff empirically by calling timegm and subtracting the original timestamp from the result. Or you could look at the seconds, minutes, hours, year, and yday members after calling both gmtime and localtime in the first place.
How is this different from what we do in datetime.astimezone()?
Not very different at all, except that I want this functionality in struct_time, to populate tm_gmtoff and tm_zone on platforms where they are not available. My goal is to normalize the functionality available on all platforms, to the extent that it's possible, so that people are less likely to write non-portable code and encounter example code that doesn't work.
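A minimal sketch of the empirical approach described earlier in this exchange: treat the broken-down local time as if it were UTC, convert it back with calendar.timegm(), and subtract the original timestamp. The function name empirical_gmtoff is an assumption made for illustration, and the sketch expects an integer timestamp.

```python
import calendar
import time


def empirical_gmtoff(ts):
    """Return the local UTC offset in seconds at integer timestamp ts.

    Hypothetical helper: works without tm_gmtoff by reinterpreting
    the local broken-down time as UTC.
    """
    return calendar.timegm(time.localtime(ts)) - ts


ts = 1379500000  # illustrative value (September 2013)
offset = empirical_gmtoff(ts)
print(offset)

# Where the platform does provide tm_gmtoff, the two should agree.
platform_offset = getattr(time.localtime(ts), "tm_gmtoff", offset)
```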

participants (12)
- Alexander Belopolsky
- Andrew Barnert
- Ben Finney
- Brett Cannon
- Ethan Furman
- M.-A. Lemburg
- Nick Coghlan
- Oleg Broytman
- random832@fastmail.us
- Rob Cliffe
- Steven D'Aprano
- Victor Stinner