Checking input range in time.asctime and time.ctime
There are several reports of bugs caused by the fact that the behavior of the C functions asctime and ctime is undefined when they are asked to format a time whose year has more than four digits:
http://bugs.python.org/issue8013
http://bugs.python.org/issue6608 (closed)
http://bugs.python.org/issue10563 (superseded by #8013)
I have a patch ready at issue 8013 that adds a check for large values and makes time.asctime and time.ctime raise ValueError instead of producing system-dependent results or, in some cases, crashing or corrupting the Python process.
There is little dispute that Python should not crash on invalid input, but I would like to ask for a second opinion on whether it would be better to produce some distinct 24-character string, say 'Mon Jan  1 00:00:00 *999', instead of raising an exception.
Note that on some Windows systems, the current behavior is to produce '%c999' % (year // 1000 + ord('0')) for at least some large values of year. Linux asctime produces strings that are longer than 26 characters, but I don't think we should support this behavior because POSIX defines the asctime() result as a 26-character string and the Python manual defines the time.asctime() result as a 24-character string. Producing longer timestamps is likely to break as many applications as accepting large years will fix. OSX asctime returns a NULL pointer for large years.
My position is that raising an error is the right solution. This is consistent with the year range supported by datetime.
Another small issue that I would like to raise here is that the issue6608 patch resulted in time.asctime() accepting 0 as a valid entry at any position of the timetuple. This is consistent with the behavior of time.strftime(), but was overlooked when issue6608 was reviewed. I find the case for accepting, say, a 0 month or 0 day in time.asctime() weaker than that for time.strftime(), where month or day values may be ignored.
Given the rule garbage in -> garbage out, I'd do the most useful thing, which would be to produce a longer output string (and update the docs). This would match the behavior of e.g. '%04d' % y when y > 9999. If that means the platform libc asctime/ctime can't be used, too bad.
--Guido
On Mon, Jan 3, 2011 at 4:06 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
There are several reports of bugs caused by the fact that the behavior of C functions asctime and ctime is undefined when they are asked to format time for more than 4-digit years:
http://bugs.python.org/issue8013 http://bugs.python.org/issue6608 (closed) http://bugs.python.org/issue10563 (superseded by #8013)
I have a patch ready at issue 8013 that adds a check for large values and causes time.asctime and time.ctime raise ValueError instead of producing system-dependent results or in some cases crashing or corrupting the python process.
There is little dispute that python should not crash on invalid input, but I would like to ask for a second opinion on whether it would be better to produce some distinct 24-character string, say 'Mon Jan  1 00:00:00 *999', instead of raising an exception.
Note that on some Windows systems, the current behavior is to produce '%c999' % (year // 1000 + ord('0')) for at least some large values of year. Linux asctime produces strings that are longer than 26 characters, but I don't think we should support this behavior because POSIX defines asctime() result as a 26 character string and Python manual defines time.asctime() result as a 24 character string. Producing longer timestamps is likely to break as many applications as accepting large years will fix. OSX asctime returns a NULL pointer for large years.
My position is that raising an error is the right solution. This is consistent with year range supported by datetime.
Another small issue that I would like to raise here is issue6608 patch resulting in time.asctime() accepting 0 as a valid entry at any position of the timetuple. This is consistent with the behavior of time.strftime(), but was overlooked when issue6608 was reviewed. I find the case for accepting say 0 month or 0 day in time.asctime() weaker than that for time.strftime() where month or day values may be ignored.
-- --Guido van Rossum (python.org/~guido)
On Mon, Jan 3, 2011 at 7:47 PM, Guido van Rossum <guido@python.org> wrote:
Given the rule garbage in -> garbage out, I'd do the most useful thing, which would be to produce a longer output string (and update the docs).
I did not know that GIGO was a design rule, but after thinking about it some more, I agree. It is very unlikely that a Python program would care about precise length of the string produced by time.asctime() and these strings are not well suited for passing timestamps to other programs that may care. (Use of asctime() timestamps in internet protocols has been long deprecated and surely won't be in use in 10-th millennium :-)
On Mon, Jan 3, 2011 at 7:47 PM, Guido van Rossum <guido@python.org> wrote:
Given the rule garbage in -> garbage out, I'd do the most useful thing, which would be to produce a longer output string (and update the docs). This would match the behavior of e.g. '%04d' % y when y > 9999. If that means the platform libc asctime/ctime can't be used, too bad.
I've committed code that does not use platform libc asctime/ctime anymore. Now it seems odd that we support years > 9999 but not years < 1900. A commonly given explanation for rejecting years < 1900 is that Python has to support POSIX standard for 2-digit years. However, this support is conditional on the value of time.accept2dyear and several people argued that when it is set to false, full range of years should be supported. Furthermore, in order to support 2-digit years, there is no need to reject years < 1900. It may be confusing to map 99 to 1999 while accepting 100 as is, but I don't see much of the problem in accepting 4-digit years from 1000 through 1899 while mapping [0 - 99] to present times according to POSIX standard. See http://bugs.python.org/issue10827 for more.
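To illustrate what formatting without the platform libc can look like, here is a minimal pure-Python sketch. It is not the committed C implementation; the name `asctime_sketch`, the lookup tables, and the exact range checks are assumptions made for this example. The year is simply formatted with `%d`, so a 5-digit year yields a string longer than 24 characters rather than undefined behavior.

```python
# Hypothetical helper, not the code committed to CPython.
_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def asctime_sketch(t):
    """Format a time tuple like asctime(), without calling the C library."""
    year, mon, mday, hour, minute, sec, wday = t[:7]
    # Only the fields used as table indexes are checked here.
    if not 1 <= mon <= 12:
        raise ValueError("month out of range")
    if not 0 <= wday <= 6:
        raise ValueError("day of week out of range")
    # The day of the month is space-padded, as in C asctime().
    return "%s %s %2d %02d:%02d:%02d %d" % (
        _DAYS[wday], _MONTHS[mon - 1], mday, hour, minute, sec, year)


print(asctime_sketch((2000, 1, 1, 0, 0, 0, 5, 1, -1)))
print(asctime_sketch((12345, 1, 1, 0, 0, 0, 0, 1, -1)))
```

With a 4-digit year the result keeps the documented 24-character length; larger years just make the string longer.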
On Wed, 5 Jan 2011 12:33:55 -0500 Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Mon, Jan 3, 2011 at 7:47 PM, Guido van Rossum <guido@python.org> wrote:
Given the rule garbage in -> garbage out, I'd do the most useful thing, which would be to produce a longer output string (and update the docs). This would match the behavior of e.g. '%04d' % y when y > 9999. If that means the platform libc asctime/ctime can't be used, too bad.
I've committed code that does not use platform libc asctime/ctime anymore. Now it seems odd that we support years > 9999 but not years < 1900. A commonly given explanation for rejecting years < 1900 is that Python has to support POSIX standard for 2-digit years. However, this support is conditional on the value of time.accept2dyear and several people argued that when it is set to false, full range of years should be supported.
Couldn't we deprecate and remove time.accept2dyear? It has been there for "backward compatibility" since Python 1.5.2.
Not to mention that global settings affecting behaviour are generally bad, since multiple libraries could have conflicting expectations about it. And parsing times and dates is the kind of thing that a library will often rely on.
Regards
Antoine.
On Wed, Jan 5, 2011 at 12:48 PM, Antoine Pitrou <solipsis@pitrou.net> wrote: ..
Couldn't we deprecate and remove time.accept2dyear? It has been there for "backward compatibility" since Python 1.5.2.
It will be useful for another 50 years or so. (POSIX 2-digit years cover 1969 - 2068.) In any case, this is not an option for 3.2 while extending accepted range is a borderline case IMO.
Not to mention that global settings affecting behaviour are generally bad, since multiple libraries could have conflicting expectations about it. And parsing times and dates is the kind of thing that a library will often rely on.
Yes, for 3.3 I am going to propose an optional accept2dyear argument to time.{asctime, strftime} in addition to or instead of a global variable. This is also necessary to implement a pure python version of datetime.strftime that would support full range of datetime. See http://bugs.python.org/issue1777412 .
On Wed, Jan 5, 2011 at 10:12 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Jan 5, 2011 at 12:48 PM, Antoine Pitrou <solipsis@pitrou.net> wrote: ..
Couldn't we deprecate and remove time.accept2dyear? It has been there for "backward compatibility" since Python 1.5.2.
It will be useful for another 50 years or so. (POSIX 2-digit years cover 1969 - 2068.) In any case, this is not an option for 3.2 while extending accepted range is a borderline case IMO.
I like accepting all years >= 1 when accept2dyear is False. In 3.3 we should switch its default value to False (in addition to the keyword arg you are proposing below, maybe). Maybe we can add a deprecation warning in 3.2 when a 2d year is actually received? The posix standard notwithstanding they should be rare, and it would be better to make this the app's responsibility if we could.
Not to mention that global settings affecting behaviour are generally bad, since multiple libraries could have conflicting expectations about it. And parsing times and dates is the kind of thing that a library will often rely on.
Yes, for 3.3 I am going to propose an optional accept2dyear argument to time.{asctime, strftime} in addition to or instead of a global variable. This is also necessary to implement a pure python version of datetime.strftime that would support full range of datetime. See http://bugs.python.org/issue1777412 .
I wish we didn't have to do that -- isn't it easy enough for the app to do the 2d -> 4d conversion itself before calling the library function? The only exception would be when parsing a string -- but strptime can tell whether a 2d or 4d year is requested by the format code (%y or %Y).
-- --Guido van Rossum (python.org/~guido)
On Wed, Jan 5, 2011 at 2:19 PM, Guido van Rossum <guido@python.org> wrote: ..
extending accepted range is a borderline case IMO.
I like accepting all years >= 1 when accept2dyear is False.
Why >= 1? Shouldn't it be >= 1900 - maxint? Also, what is your take on always accepting [1000 - 1899]?
Now, to play the devil's advocate a little, with the new logic accept2dyear would actually mean "map 2-digit year" because 2-digit years will be accepted when accept2dyear is False, just not mapped to reasonable range. I don't have much of a problem with having a deprecated setting that does not have the meaning that its name suggests. (At the moment accept2dyear = True is actually treated as accept2dyear = 0!) I am mentioning this because I think the logic should be
    if accept2dyear:
        if 0 <= y < 69:
            y += 2000
        elif 69 <= y < 100:
            y += 1900
        elif 100 <= y < 1000:
            raise ValueError("3-digit year in map 2-digit year mode")
and even the last elif may not be necessary.
In 3.3 we should switch its default value to False (in addition to the keyword arg you are proposing below, maybe).
Note that time.accept2dyear is controlled by PYTHONY2K environment variable. If we switch the default, we may need to add a variable with the opposite meaning.
Maybe we can add a deprecation warning in 3.2 when a 2d year is actually received?
+1, but only with accept2dyear = 1. When accept2dyear = 0, any year should just pass through and this should eventually become the only behavior.
The posix standard notwithstanding they should be rare, and it would be better to make this the app's responsibility if we could.
..
I wish we didn't have to do that -- isn't it easy enough for the app to do the 2d -> 4d conversion itself before calling the library function?
Note that this is already done at least in two places in stdlib: in email package parsedate_tz and in _strptime.py. Given that the POSIX convention is arbitrary and unintuitive, maybe we should provide time.posix2dyear() function for this purpose.
The only exception would be when parsing a string -- but strptime can tell whether a 2d or 4d year is requested by the format code (%y or %Y).
Existing stdlib date parsing code already does that and ignores accept2dyear setting.
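The POSIX convention referred to in this message (69-99 map to 1969-1999, 0-68 map to 2000-2068) is small enough to sketch. The name `posix2dyear` is taken from the suggestion above; no such function exists in the time module, so treat it as hypothetical, and the ValueError for inputs outside 0-99 is an assumption of this example.

```python
def posix2dyear(y):
    """Map a 2-digit year to a 4-digit year per the POSIX %y convention.

    Hypothetical helper: the time module has no function of this name.
    """
    if not 0 <= y <= 99:
        raise ValueError("expected a 2-digit year, got %r" % (y,))
    # 69..99 -> 1969..1999, 0..68 -> 2000..2068
    return y + (1900 if y >= 69 else 2000)


print(posix2dyear(69), posix2dyear(99), posix2dyear(0), posix2dyear(68))
```

An application would call such a helper on its own 2-digit input before building the time tuple, which keeps the century decision out of the library function.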
On Wed, Jan 5, 2011 at 12:58 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Jan 5, 2011 at 2:19 PM, Guido van Rossum <guido@python.org> wrote: ..
extending accepted range is a borderline case IMO.
I like accepting all years >= 1 when accept2dyear is False.
Why >= 1?
Because that's what the datetime module accepts.
Shouldn't it be >= 1900 - maxint? Also, what is your take on always accepting [1000 - 1899]?
Now, to play the devil's advocate a little, with the new logic accept2dyear would actually mean "map 2-digit year" because 2-digit years will be accepted when accept2dyear is False, just not mapped to reasonable range. I don't have much of a problem with having a deprecated setting that does not have the meaning that its name suggests. (At the moment accept2dyear = True is actually treated as accept2dyear = 0!) I am mentioning this because I think the logic should be
    if accept2dyear:
        if 0 <= y < 69:
            y += 2000
        elif 69 <= y < 100:
            y += 1900
        elif 100 <= y < 1000:
            raise ValueError("3-digit year in map 2-digit year mode")
and even the last elif may not be necessary.
Shouldn't the logic be to take the current year into account? By the time 2070 comes around, I'd expect "70" to refer to 2070, not to 1970. In fact, I'd expect it to refer to 2070 long before 2070 comes around. All of which makes me think that this is better left to the app, which can decide for itself whether it is more important to represent dates in the future or dates in the past.
In 3.3 we should switch its default value to False (in addition to the keyword arg you are proposing below, maybe).
Note that time.accept2dyear is controlled by PYTHONY2K environment variable. If we switch the default, we may need to add a variable with the opposite meaning.
Yeah, but who sets that variable? Couldn't we make it so that if PYTHONY2K is set (even to the empty string) it wins, but if it's not set (at all) we can make the default adjust over time?
Maybe we can add a deprecation warning in 3.2 when a 2d year is actually received?
+1, but only with accept2dyear = 1. When accept2dyear = 0, any year should just pass through and this should eventually become the only behavior.
The posix standard notwithstanding they should be rare, and it would be better to make this the app's responsibility if we could.
..
I wish we didn't have to do that -- isn't it easy enough for the app to do the 2d -> 4d conversion itself before calling the library function?
Note that this is already done at least in two places in stdlib: in email package parsedate_tz and in _strptime.py. Given that the POSIX convention is arbitrary and unintuitive, maybe we should provide time.posix2dyear() function for this purpose.
The only exception would be when parsing a string -- but strptime can tell whether a 2d or 4d year is requested by the format code (%y or %Y).
Existing stdlib date parsing code already does that and ignores accept2dyear setting.
-- --Guido van Rossum (python.org/~guido)
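The alternative raised in this message, resolving a 2-digit year relative to the current year instead of a fixed pivot, could look roughly like the sketch below. The function name `resolve_2dyear`, the `today_year` parameter, and the 50-year window are assumptions chosen for illustration; the thread does not specify any of them, and an application that cares more about past or future dates would pick a different window.

```python
import datetime


def resolve_2dyear(y, today_year=None, window=50):
    """Return the 4-digit year ending in y that lies near today_year.

    Hypothetical application-level helper; the window is an assumption.
    """
    if not 0 <= y <= 99:
        raise ValueError("expected a 2-digit year")
    if today_year is None:
        today_year = datetime.date.today().year
    # Start with the candidate in the current century.
    candidate = today_year - today_year % 100 + y
    # Shift by a century if it falls outside
    # [today_year - window, today_year + window).
    if candidate >= today_year + window:
        candidate -= 100
    elif candidate < today_year - window:
        candidate += 100
    return candidate


print(resolve_2dyear(70, today_year=2011))  # 1970
print(resolve_2dyear(70, today_year=2030))  # 2070
```

With this rule "70" starts to mean 2070 well before 2070 arrives, which is the behavior the fixed POSIX mapping cannot provide.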
On Jan 5, 2011, at 4:33 PM, Guido van Rossum wrote:
Shouldn't the logic be to take the current year into account? By the time 2070 comes around, I'd expect "70" to refer to 2070, not to 1970. In fact, I'd expect it to refer to 2070 long before 2070 comes around.
All of which makes me think that this is better left to the app, which can decide for itself whether it is more important to represent dates in the future or dates in the past.
The point of this somewhat silly flag (as I understood its description earlier in the thread) is to provide compatibility with POSIX 2-digit-year dates. As per http://pubs.opengroup.org/onlinepubs/007908799/xsh/strptime.html -
%y is the year within century. When a century is not otherwise specified, values in the range 69-99 refer to years in the twentieth century (1969 to 1999 inclusive); values in the range 00-68 refer to years in the twenty-first century (2000 to 2068 inclusive). Leading zeros are permitted but not required.
So, "70" means "1970", forever, in programs that care about this nonsense. Personally, by the time 2070 comes around, I hope that "70" will just refer to 70 A.D., and get you odd looks if you use it in a written date - you might as well just write '0' :).
On Wed, Jan 5, 2011 at 4:33 PM, Guido van Rossum <guido@python.org> wrote: ..
Why >= 1?
Because that's what the datetime module accepts.
What the datetime module accepts is irrelevant here. Note that the functions affected by accept2dyear are time.mktime(), time.asctime(), time.strftime() and, indirectly, time.ctime(). None of them produces a result that is directly usable by the datetime module.
Furthermore, this thread started with me arguing that year > 9999 should raise ValueError, and if we wanted to restrict time module functions to the datetime-supported year range, that would be the right thing to do.
If I understand your "garbage in garbage out" principle correctly, time-processing functions should not introduce arbitrary limits unless there is a specific reason for them. In the datetime module, calendar calculations would be too complicated if we had to support a date range that does not fit in a 32-bit integer. There is no such consideration in the time module, so we should support whatever the underlying system can.
This said, I would be perfectly happy with just changing y >= 1900 to y >= 1000. Doing so will spare us from making a choice between '0012', '12' and '  12' in time.asctime(). Time series that extend back to the 19th century are not unheard of, and in many places the modern calendar was already in use back then. Anything prior to year 1000 would certainly require a custom calendar module anyway.
On Wed, Jan 5, 2011 at 2:55 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Jan 5, 2011 at 4:33 PM, Guido van Rossum <guido@python.org> wrote: ..
Why >= 1?
Because that's what the datetime module accepts.
What the datetime module accepts is irrelevant here.
Not completely -- they are both about dates and times, there are some links between them (time tuples are used by both), both have a strftime() method. If they both impose some arbitrary limits, it would be easier for users to remember the limits if they were the same for both modules. (In fact datetime.strftime() is currently limited by what time.strftime() can handle -- more linkage.)
Note that functions affected by accept2dyear are: time.mktime(), time.asctime(), time.strftime() and indirectly time.ctime(). Neither of them produces result that is directly usable by the datetime module.
But the latter calls strftime() -- although never with a 2d year of course.
Furthermore, this thread started with me arguing that year > 9999 should raise ValueError and if we wanted to restrict time module functions to datetime-supported year range, that would be the right thing to do.
I'd be fine with a ValueError too, if that's what it takes to align the two modules better.
If I understand your "garbage in garbage out" principle correctly, time-processing functions should not introduce arbitrary limits unless there is a specific reason for them. In datetime module, calendar calculations would be too complicated if we had to support date range that does not fit in 32-bit integer. There is no such consideration in the time module, so we should support whatever the underlying system can.
(Except that the *originally* underlying system, libc, was too poorly standardized and too buggy on some platforms, so we have ended up reimplementing more and more of it.)
This said, I would be perfectly happy with just changing y >= 1900 to y >= 1000. Doing so will spare us from making a choice between '0012', '12' and ' 12' in time.asctime(). Time-series that extend back to 19th century are not unheard of and in many places modern calendar was already in use back then. Anything prior to year 1000 would certainly require a custom calendar module anyways.
Yeah, but datetime takes a position here (arbitrarily extending the Gregorian calendar all the way back to the year 1, and forward to the year 9999). I'd be happiest if time took the same position. For example it would fix the problem that datetime accepts years < 1900 but then you cannot call strftime() on those.
-- --Guido van Rossum (python.org/~guido)
On Wed, Jan 5, 2011 at 6:12 PM, Guido van Rossum <guido@python.org> wrote: ..
If they both impose some arbitrary limits, it would be easier for users to remember the limits if they were the same for both modules.
Unfortunately, that is not possible on 32-bit systems where range supported by say time.ctime() is limited by the range of time_t.
(In fact datetime.strftime() is currently limited by what time.strftime() can handle -- more linkage.)
Not really. There is a patch at http://bugs.python.org/issue1777412 that removes this limit for datetime.strftime. There is an issue for the pure Python implementation, which does depend on time.strftime(), but that can be addressed in several ways, including ignoring it until the time module is fixed. ..
Furthermore, this thread started with me arguing that year > 9999 should raise ValueError and if we wanted to restrict time module functions to datetime-supported year range, that would be the right thing to do.
I'd be fine with a ValueError too, if that's what it takes to align the two modules better.
Do you want to *add* year range checks to say time.localtime(t) so that it would not produce time tuple with out of range year? IMO, range checks are justified when they allow simpler implementation. As far as users are concerned, I don't think anyone would care about precise limits if they are wider than [1000 - 9999]. ..
This said, I would be perfectly happy with just changing y >= 1900 to y >= 1000. Doing so will spare us from making a choice between '0012', '12' and ' 12' in time.asctime(). Time-series that extend back to 19th century are not unheard of and in many places modern calendar was already in use back then. Anything prior to year 1000 would certainly require a custom calendar module anyways.
Yeah, but datetime takes a position here (arbitrarily extending the Gregorian calendar all the way back to the year 1, and forward to the year 9999). I'd be happiest if time took the same position.
Doesn't it already? On my system,
    $ cal 9 1752
       September 1752
    Su Mo Tu We Th Fr Sa
           1  2 14 15 16
    17 18 19 20 21 22 23
    24 25 26 27 28 29 30
but
    >>> (datetime(1752, 9, 2) - datetime(1970,1,1))//timedelta(0, 1)
    -6858259200
    >>> time.gmtime(-6858259200)[:3]
    (1752, 9, 2)
    >>> datetime(1752, 9, 2).weekday()
    5
    >>> time.gmtime(-6858259200).tm_wday
    5
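The comparison in this message can be reproduced as a standalone script. This is only a sketch: the datetime arithmetic is portable, but whether time.gmtime() accepts a large negative timestamp depends on the platform, so that call is guarded here (the set of exceptions caught is an assumption about how a platform might refuse it).

```python
import time
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
d = datetime(1752, 9, 2)

# Whole seconds between the date and the Unix epoch (negative: before 1970).
seconds = (d - EPOCH) // timedelta(seconds=1)
print(seconds)      # -6858259200
print(d.weekday())  # 5, i.e. Saturday in the proleptic Gregorian calendar

try:
    tm = time.gmtime(seconds)
    print(tm[:3], tm.tm_wday)
except (OSError, OverflowError, ValueError):
    # Some platforms reject timestamps this far before the epoch.
    print("time.gmtime() does not accept this timestamp here")
```

Where gmtime() does accept the value, both modules agree on the date and weekday, meaning neither follows the September 1752 calendar switch that cal displays.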
I'm sorry, but at this point I'm totally confused about what you're asking or proposing. You keep referring to various implementation details and behaviors. Maybe if you summarized how the latest implementation (say python 3.2) works and what you propose to change that would be quicker than this back-and-forth about whether or not datetime and time behave the same or should behave the same or whatever.
-- --Guido van Rossum (python.org/~guido)
On Wed, Jan 5, 2011 at 9:18 PM, Guido van Rossum <guido@python.org> wrote:
I'm sorry, but at this point I'm totally confused about what you're asking or proposing. You keep referring to various implementation details and behaviors. Maybe if you summarized how the latest implementation (say python 3.2) works and what you propose to change
I'll try. The current implementation of time.asctime and time.strftime is roughly
    if y < 1900:
        if accept2dyear:
            if 69 <= y <= 99:
                y += 1900
            elif 0 <= y <= 68:
                y += 2000
            else:
                raise ValueError("year out of range")
        else:
            raise ValueError("year out of range")
    # call system function with tm_year = y - 1900
I propose to change that to
    if y < 1000:
        if accept2dyear:
            if 69 <= y <= 99:
                y += 1900
            elif 0 <= y <= 68:
                y += 2000
            else:
                raise ValueError("year out of range")
    # call system function with tm_year = y - 1900
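The two variants described in this message can be written as runnable functions to compare their behavior on the same inputs. This is a sketch of the logic as stated in the message, not the C code in the time module; the function names and the `accept2dyear` parameter (standing in for the global time.accept2dyear) are assumptions of this example.

```python
def _map_2dyear(y):
    """Apply the POSIX 2-digit mapping or reject the year."""
    if 69 <= y <= 99:
        return y + 1900
    if 0 <= y <= 68:
        return y + 2000
    raise ValueError("year out of range")


def check_year_current(y, accept2dyear=True):
    """Logic described as current: nothing below 1900 survives unmapped."""
    if y < 1900:
        if not accept2dyear:
            raise ValueError("year out of range")
        y = _map_2dyear(y)
    return y


def check_year_proposed(y, accept2dyear=True):
    """Logic described as proposed: no range check when accept2dyear is off."""
    if y < 1000 and accept2dyear:
        y = _map_2dyear(y)
    return y


for year in (70, 1500, 1899):
    print(year, "->", check_year_proposed(year))
```

The visible differences are that years 1000 through 1899 now pass through unchanged, and that with accept2dyear off no year is rejected at this stage.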
On Wed, Jan 5, 2011 at 6:46 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Jan 5, 2011 at 9:18 PM, Guido van Rossum <guido@python.org> wrote:
I'm sorry, but at this point I'm totally confused about what you're asking or proposing. You keep referring to various implementation details and behaviors. Maybe if you summarized how the latest implementation (say python 3.2) works and what you propose to change
I'll try. The current implementation of time.asctime and time.strftime is roughly
    if y < 1900:
        if accept2dyear:
            if 69 <= y <= 99:
                y += 1900
            elif 0 <= y <= 68:
                y += 2000
            else:
                raise ValueError("year out of range")
        else:
            raise ValueError("year out of range")
    # call system function with tm_year = y - 1900
I propose to change that to
    if y < 1000:
        if accept2dyear:
            if 69 <= y <= 99:
                y += 1900
            elif 0 <= y <= 68:
                y += 2000
            else:
                raise ValueError("year out of range")
    # call system function with tm_year = y - 1900
The new logic doesn't look right, am I right that this is what you meant?
    if accept2dyear and 0 <= y < 100:
        (convert to year >= 1970)
    if y < 1000:
        raise ...
But what guarantees do we have that the system functions accept negative values for tm_year on all relevant platforms?
The 1000 limit still seems pretty arbitrary to me -- if it's only because you don't want to decide whether to use leading spaces or zeros for numbers shorter than 4 digits, let me propose leading zeros since we use those uniformly for months, days, hours, minutes and seconds < 10, and then you can make the year range accepted the same for these as for datetime (i.e. 1 <= y <= 9999). Tim Peters picked those at least in part because they are right round numbers...
-- --Guido van Rossum (python.org/~guido)
On Wed, Jan 5, 2011 at 10:50 PM, Guido van Rossum <guido@python.org> wrote: ..
I propose to change that to
    if y < 1000:
        if accept2dyear:
            if 69 <= y <= 99:
                y += 1900
            elif 0 <= y <= 68:
                y += 2000
            else:
                raise ValueError("year out of range")
    # call system function with tm_year = y - 1900
The new logic doesn't look right, am I right that this is what you meant?
    if accept2dyear and 0 <= y < 100:
        (convert to year >= 1970)
    if y < 1000:
        raise ...
Not quite. My proposed logic would not do any range checking if accept2dyear == 0.
But what guarantees do we have that the system functions accept negative values for tm_year on all relevant platforms?
I've already committed an implementation of asctime, so time.asctime and time.ctime don't call system functions anymore. This leaves time.mktime and time.strftime. The latter caused Tim Peters to limit year range to >= 1900 eight years ago:
http://svn.python.org/view?view=rev&revision=30224
For these functions, range checks are necessary only when system functions may crash on out of range values. If we can detect error return and convert it to an exception, there is no need to look before you leap. (Note that asctime was different because the relevant standards specifically allowed it to have undefined behavior for out of range values.)
I cannot rule out that there are systems out there with buggy strftime, but the situation has improved in the last eight years and we have buildbots and unittests to check behavior on relevant platforms. If we do find a platform with buggy strftime which crashes or produces nonsense with negative tm_year, we can add a platform specific range check to work around platform bug, or just ask users to bug their OS vendor. :-)
The 1000 limit still seems pretty arbitrary to me -- if it's only because you don't want to decide whether to use leading spaces or zeros for numbers shorter than 4 digits, let me propose leading zeros since we use those uniformly for months, days, hours, minutes and seconds < 10,
Except we don't:
    >>> time.asctime((2000, 1, 1, 0, 0, 0, 0, 0, -1))
    'Sat Jan  1 00:00:00 2000'
(note that day is space-filled.) I am not sure, however, what you are proposing here. Are you arguing for a wider or a narrower year range? I would be happy with just
    if accept2dyear:
        if 69 <= y <= 99:
            y += 1900
        elif 0 <= y <= 68:
            y += 2000
    # call system function with tm_year = y - 1900
but I thought that would be too radical.
Le mercredi 05 janvier 2011 à 23:48 -0500, Alexander Belopolsky a écrit :
I would be happy with just
    if accept2dyear:
        if 69 <= y <= 99:
            y += 1900
        elif 0 <= y <= 68:
            y += 2000
    # call system function with tm_year = y - 1900
Perfect. That's what I expect from a "2 digits" option: it should not touch 3-digit (100..999) or 4-digit (>= 1000) years. Remember that the "2 digit option" is a hack to work around the Y2K bug. Maybe it is time to remove the workaround: disable accept2dyear by default and remove the PYTHONY2K env var.
but I thought that would be too radical.
Why?
Victor
On Thu, Jan 6, 2011 at 6:47 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
Le mercredi 05 janvier 2011 à 23:48 -0500, Alexander Belopolsky a écrit :
I would be happy with just
    if accept2dyear:
        if 69 <= y <= 99:
            y += 1900
        elif 0 <= y <= 68:
            y += 2000
    # call system function with tm_year = y - 1900
..
but I thought that would be too radical.
Why ?
ISTM that time.asctime() called with a 3-digit year, particularly a low 3-digit one, is much more likely to be a manifestation of a lingering Y2K bug than a real intent to print an ancient date. I do remember that many devices were showing Jan 1, 100 back in early 2000. The same logic does not apply to programs that run with PYTHONY2K set because presumably users who bothered to set PYTHONY2K know that their program does not use 2-digit years.
That's why I think for 3.2 we should do the following:
1. Keep PYTHONY2K logic and accept2dyear = 1 default.
2. With default accept2dyear = 1:
   - for 0 <= year < 100 issue a deprecation warning and supply century according to POSIX rules
   - for 100 <= year < 1000 raise ValueError
   - for year >= 1000 leave year unchanged
3. With accept2dyear = 0 leave year unchanged regardless of value.
For 3.3, remove PYTHONY2K and accept2dyear and leave year unchanged regardless of value.
Can we agree that this is reasonable for time.asctime()? In time/datetime.strftime we can impose stricter limits if necessary to work around platform bugs.
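The rules proposed for 3.2 in this message translate fairly directly into code. The sketch below is one reading of them, not the eventual implementation: the function name `check_year_32`, the warning text, and the `accept2dyear` parameter (standing in for the global setting) are assumptions. The message does not say what happens to negative years when accept2dyear is on, so passing them through unchanged is also an assumption.

```python
import warnings


def check_year_32(y, accept2dyear=True):
    """Apply the year rules proposed for 3.2 (hypothetical helper)."""
    if not accept2dyear:
        # Rule 3: leave the year unchanged regardless of value.
        return y
    if 0 <= y < 100:
        # Rule 2, first case: warn, then supply the century per POSIX.
        warnings.warn("2-digit years are deprecated",
                      DeprecationWarning, stacklevel=2)
        return y + (1900 if y >= 69 else 2000)
    if 100 <= y < 1000:
        # Rule 2, second case: likely a lingering Y2K bug.
        raise ValueError("year out of range")
    # Rule 2, third case (and, by assumption, negative years).
    return y


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(check_year_32(70), len(caught))
```

Dropping the accept2dyear branch entirely gives the behavior proposed for 3.3, where every year is passed through as given.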
I think I've said all I can say in this thread; I'm sure you will come up with a satisfactory solution. -- --Guido van Rossum (python.org/~guido)
On Wed, Jan 5, 2011 at 10:50 PM, Guido van Rossum <guido@python.org> wrote: ..
But what guarantees do we have that the system functions accept negative values for tm_year on all relevant platforms?
Also note that the subject of this thread is limited to "time.asctime and time.ctime." The other functions came into discussion only because the year range checking code is shared inside the time module. If calling specific system functions such as strftime with tm_year < 0 is deemed unsafe, we can move the check to where the system function is called. No system function is called from time.asctime anymore and time.ctime(t) is now time.asctime(localtime(t)).
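[Editor's illustration] The equivalence stated above, time.ctime(t) being time.asctime(localtime(t)), can be checked with a short sketch. The helper name `ctime_via_asctime` is hypothetical and exists only for this example.

```python
import time


def ctime_via_asctime(t):
    """Hypothetical helper: format a timestamp the way time.ctime(t) is described above."""
    # Convert the timestamp to local time, then format the struct_time.
    return time.asctime(time.localtime(t))


# For a timestamp in a 4-digit year the result is the documented 24-character string.
stamp = ctime_via_asctime(0)
print(stamp, len(stamp))
```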
On Thursday, 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
If calling specific system functions such as strftime with tm_year < 0 is deemed unsafe, we can move the check to where the system function is called.
What do you mean by "unsafe"? Does it crash? On my Linux box, strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900] (full range of the int type). Can't we add a test in the configure script to check for a "broken" strftime() implementation? Victor
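[Editor's illustration] A configure-time test would be written in C, but the same idea can be sketched at run time in Python. The function `probe_strftime_years` below is hypothetical. It only observes years for which the platform either formats the value or makes Python raise an exception; a hard crash inside the C runtime cannot be caught this way.

```python
import time


def probe_strftime_years(years, fmt="%Y"):
    """Hypothetical probe: format each year and record the result or the exception name."""
    results = {}
    for year in years:
        # Fields: year, month, day, hour, minute, second, weekday, yearday, isdst.
        tm = (year, 1, 1, 0, 0, 0, 0, 1, 0)
        try:
            results[year] = time.strftime(fmt, tm)
        except (ValueError, OverflowError, OSError) as exc:
            # Platform-dependent: some C libraries reject years others accept.
            results[year] = type(exc).__name__
    return results


for year, outcome in probe_strftime_years([2011, 999, -5, 99999]).items():
    print(year, outcome)
```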
On Thu, 06 Jan 2011 12:55:24 +0100, Victor Stinner <victor.stinner@haypocalc.com> wrote:
On Thursday, 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
If calling specific system functions such as strftime with tm_year < 0 is deemed unsafe, we can move the check to where the system function is called.
What do you mean by "unsafe"? Does it crash? On my Linux box, strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900] (full range of the int type).
I believe that we have had several cases where Windows "crashed" when out-of-range values were passed to the CRT that other platforms accepted. -- R. David Murray www.bitdance.com
On Thursday, 6 January 2011 at 10:47 -0500, R. David Murray wrote:
On Thu, 06 Jan 2011 12:55:24 +0100, Victor Stinner <victor.stinner@haypocalc.com> wrote:
On Thursday, 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
If calling specific system functions such as strftime with tm_year < 0 is deemed unsafe, we can move the check to where the system function is called.
What do you mean by "unsafe"? Does it crash? On my Linux box, strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900] (full range of the int type).
I believe that we have had several cases where Windows "crashed" when out-of-range values were passed to the CRT that other platforms accepted.
If there are only issues on Windows, we can add a #ifdef _MSC_VER and raise a ValueError("Stupid OS, install Linux or recompile with Cygwin") for year < 1900. Do Cygwin and MinGW have the same issues? Victor
On 01/06/2011 11:08 AM, Victor Stinner wrote:
On Thursday, 6 January 2011 at 10:47 -0500, R. David Murray wrote:
On Thu, 06 Jan 2011 12:55:24 +0100, Victor Stinner <victor.stinner@haypocalc.com> wrote:
On Thursday, 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
If calling specific system functions such as strftime with tm_year < 0 is deemed unsafe, we can move the check to where the system function is called.
What do you mean by "unsafe"? Does it crash? On my Linux box, strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900] (full range of the int type).
I believe that we have had several cases where Windows "crashed" when out-of-range values were passed to the CRT that other platforms accepted.
If there are only issues on Windows, we can add a #ifdef _MSC_VER and raise a ValueError("Stupid OS, install Linux or recompile with Cygwin") for year < 1900.
Is strftime really so complex that we shouldn't just write our own? I'd be willing to do it. Over the years the platform strftime has caused any number of problems. The last time I looked at it we already have to do some work pre-parsing the format string and passing it off to platform strftime, so it's not like it's not already a maintenance hassle. I understand strptime is probably more complex and there's some value to having strptime/strftime coming from the same library. But I'd be willing to look at it, too. Eric.
R. David Murray writes:
I believe that we have had several cases where Windows "crashed" when out-of-range values were passed to the CRT that other platforms accepted.
XEmacs had crashes due to strftime on native Windows with VC++. It never went so far as to BSOD, but a couple of users lost recently entered data. :-( IIRC Cygwin was OK (their libc uses a different code base). Dunno about MinGW; almost all of our users either want the full Cygwin environment or they don't care about GCC, but since MinGW uses the MSFT runtime, it's probably vulnerable too.
participants (8)
- Alexander Belopolsky
- Antoine Pitrou
- Eric Smith
- Glyph Lefkowitz
- Guido van Rossum
- R. David Murray
- Stephen J. Turnbull
- Victor Stinner