Boundary checks on arguments to time.strftime()
Bug #897625 (http://python.org/sf/897625) deals with giving time.strftime() date information that is outside of proper bounds (e.g., using a negative number for the day of the week). This can lead to a crash since it seems some strftime implementations just index into an array for values for text and thus a negative value can lead to bad things happening. I would like to raise a ValueError if the argument is out of range. Problem is that this will break code. I could just force all negative values to all values outside the proper bounds to a reasonable value, but that seems to go against the path of least surprise. That is question 1. Question 2 is what to really check. This really is only a concern for month and day of the week since everything else is just a number and doesn't have some name representation. I could check all 9 values, though, or just these two. Question 3 is whether to extend this to time.asctime() . I have talked to Tim about this and his thoughts are to just deal with time.strftime() and leave everything else alone. That's fine with me, but there is the same possibility of having problems with asctime(). But then again, checking value for asctime() would potentially break even more code. So, thoughts? -Brett
[Brett]
Bug #897625 (http://python.org/sf/897625) deals with giving time.strftime() date information that is outside of proper bounds (e.g., using a negative number for the day of the week). This can lead to a crash since it seems some strftime implementations just index into an array for values for text and thus a negative value can lead to bad things happening.
I would like to raise a ValueError if the argument is out of range. Problem is that this will break code.
That is the right way to go. The docs have long specified the proper range of input values. If someone has created working code outside those bounds, raising an exception may be their only clue that their code is non-portable (best case) or simply buggy (worst case).
I could just force all negative values to all values outside the proper bounds to a reasonable value, but that seems to go against the path of least surprise. That is question 1.
-1. That is slower and more likely to mask genuine coding bugs. Also, it introduces new behavior that would need to be supported in perpetuity.
Question 2 is what to really check. This really is only a concern for month and day of the week since everything else is just a number and doesn't have some name representation. I could check all 9 values, though, or just these two.
Question 3 is whether to extend this to time.asctime() . I have talked to Tim about this and his thoughts are to just deal with time.strftime() and leave everything else alone. That's fine with me, but there is
The month and day of week are the most important since they have word name equivalents that are found by indexing into an array of pointers. But, if you're feeling bold, go ahead and check all of the ranges specified in the docs. That will make it more likely that programmer errors and non-portable wrapping tricks are detected early. the
same possibility of having problems with asctime(). But then again, checking value for asctime() would potentially break even more code.
I would leave time.asctime() alone. Its argument is typically one returned by another time function. So, it is less susceptible than strftime() where the arguments are constructed piecemeal. Raymond Hettinger
[Brett]
Bug #897625 (http://python.org/sf/897625) deals with giving time.strftime() date information that is outside of proper bounds (e.g., using a negative number for the day of the week). This can lead to a crash since it seems some strftime implementations just index into an array for values for text and thus a negative value can lead to bad things happening.
I would like to raise a ValueError if the argument is out of range. Problem is that this will break code.
[RH] That is the right way to go. The docs have long specified the proper
range of input values. If someone has created working code outside those bounds, raising an exception may be their only clue that their code is non-portable (best case) or simply buggy (worst case).
Agreed. How could an out-of-range causes crashes in the implementation *and* be a required feature?
I could just force all negative values to all values outside the proper bounds to a reasonable value, but that seems to go against the path of least surprise. That is question 1.
-1. That is slower and more likely to mask genuine coding bugs. Also, it introduces new behavior that would need to be supported in perpetuity.
Agreed.
Question 2 is what to really check. This really is only a concern for month and day of the week since everything else is just a number and doesn't have some name representation. I could check all 9 values, though, or just these two.
The month and day of week are the most important since they have word name equivalents that are found by indexing into an array of pointers. But, if you're feeling bold, go ahead and check all of the ranges specified in the docs. That will make it more likely that programmer errors and non-portable wrapping tricks are detected early.
To be consistent you should check all values. You could go oveboard and check for things like February 29, but I recommend against this. After all strftime() only does formatting.
Question 3 is whether to extend this to time.asctime() . I have talked to Tim about this and his thoughts are to just deal with time.strftime() and leave everything else alone. That's fine with me, but there is the same possibility of having problems with asctime(). But then again, checking value for asctime() would potentially break even more code.
Why? The question is, do we need to check to protect the implementation? Then I'd say we have no choice. Otherwise, the question is, should we let bad input cause bad output (the GIGO principle) or should we try to flag it? Which is better for the average application? Given that asctime() is likely used for printing messages, causing new exceptions in code that "worked" before is going to get complaints from users who deploy buggy programs and get embarrassed by the new exception. Has happened before.
I would leave time.asctime() alone. Its argument is typically one returned by another time function. So, it is less susceptible than strftime() where the arguments are constructed piecemeal.
That I don't know. More likely, asctime() is simply not used as much, because its (fixed) output formats has few virtues except being an ancient Unix standard, and in today's internet world that's not enough. --Guido van Rossum (home page: http://www.python.org/~guido/)
[Guido]
... Agreed. How could an out-of-range causes crashes in the implementation *and* be a required feature?
The intersection of "causes crashes" and "required features" consists of platform C library bugs. The potential behavior changes go beyond that intersection, though: timemodule.c uses gettmarg() to build struct tm thingies. mktime() specifically allows struct tm fields to be out of range, but time.mktime() uses gettmarg(), so if the latter starts complaining, *some* existing uses of time.mktime() will start complaining too. We hit this in a small way when calender.timegm() was changed to use the datetime module: days = datetime.date(year, month, day).toordinal() - _EPOCH_ORD started complaining about senseless y,m,d triples then, and a user did file a bug report against that. The code was changed to the goofy: days = datetime.date(year, month, 1).toordinal() - _EPOCH_ORD + day - 1 instead so they could continue using senseless (wrt the specified year and month) days. "Legacy API"s are always touchy. We have one report of a crash now, due to using a negative tm_wday value in a tuple passed to strftime(). strftime() may want to gripe about that, but mktime() may not (the C std specifically requires that the mktime() implementation ignore the input value of tm_wday, so if any platform crashes on a negative tm_wday input to mktime(), that would be a platform C bug).
Guido van Rossum wrote:
[Brett]
Bug #897625 (http://python.org/sf/897625) deals with giving time.strftime() date information that is outside of proper bounds
(e.g.,
using a negative number for the day of the week). This can lead to a crash since it seems some strftime implementations just index into an array for values for text and thus a negative value can lead to bad things happening.
I would like to raise a ValueError if the argument is out of range. Problem is that this will break code.
[RH] That is the right way to go. The docs have long specified the proper
range of input values. If someone has created working code outside those bounds, raising an exception may be their only clue that their code is non-portable (best case) or simply buggy (worst case).
Agreed. How could an out-of-range causes crashes in the implementation *and* be a required feature?
As Tim pointed out, time.mktime() is spec'ed to take funky values. But I was going to leave it out of this discussion and just leave it alone. OK, so question 1 is answered: raise the exception.
I could just force all negative values to all values outside the proper bounds to a reasonable value, but that seems to go against the path of least surprise. That is question 1.
-1. That is slower and more likely to mask genuine coding bugs. Also, it introduces new behavior that would need to be supported in perpetuity.
Agreed.
I didn't like it either, but I thought I would throw it out as an option.
Question 2 is what to really check. This really is only a concern for month and day of the week since everything else is just a number and doesn't have some name representation. I could check all 9 values, though, or just these two.
The month and day of week are the most important since they have word name equivalents that are found by indexing into an array of pointers. But, if you're feeling bold, go ahead and check all of the ranges specified in the docs. That will make it more likely that programmer errors and non-portable wrapping tricks are detected early.
To be consistent you should check all values.
OK. The only reason I brought it up was that people who don't have an explicit value for a field tend to set them to 0 (test_strftime actually does this). This means that code that doesn't set month, day of the month, and day of the year to at least 1 will break. I am not too upset over breaking code that doesn't set day of the year, but it is the 0 value month and day of month that I am a little worried about. But I would rather do a full-on check on all values.
You could go oveboard and check for things like February 29, but I recommend against this. After all strftime() only does formatting.
No thanks. =)
Question 3 is whether to extend this to time.asctime() . I have talked to Tim about this and his thoughts are to just deal with time.strftime() and leave everything else alone. That's fine with me, but there is the same possibility of having problems with asctime(). But then again, checking value for asctime() would potentially break even more code.
Why? The question is, do we need to check to protect the implementation? Then I'd say we have no choice. Otherwise, the question is, should we let bad input cause bad output (the GIGO principle) or should we try to flag it? Which is better for the average application? Given that asctime() is likely used for printing messages, causing new exceptions in code that "worked" before is going to get complaints from users who deploy buggy programs and get embarrassed by the new exception. Has happened before.
Well, if you look at the C99 spec (http://std.dkuug.dk/jtc1/sc22/open/n2794/ for those who care )for asctime(), the example uses straight indexes of strings so it could explode. But in asctime's case only the month value is of any worry. That I could just force to a valid value since it is the only thing that needs a check; shouldn't break very much code. But I am leaning toward your and Tim's view that this isn't really necessary and anyone using asctime() probably don't want to deal with fixing their code.
I would leave time.asctime() alone. Its argument is typically one returned by another time function. So, it is less susceptible than strftime() where the arguments are constructed piecemeal.
That I don't know. More likely, asctime() is simply not used as much, because its (fixed) output formats has few virtues except being an ancient Unix standard, and in today's internet world that's not enough.
Agreed. OK, so I will raise TypeError and check everything for time.strftime(). asctime() is the only iffy thing. I say either leave it be or force the month value to January if it is out of the proper range. Any opinions on one or the other? I am leaning towards the former personally. -Brett
Brett C. wrote:
OK, so I will raise TypeError and check everything for time.strftime(). asctime() is the only iffy thing. I say either leave it be or force the month value to January if it is out of the proper range. Any opinions on one or the other? I am leaning towards the former personally.
OK, added the checks and checked them in. I left asctime() alone. -Brett
On Sun, Feb 22, 2004, Brett C. wrote:
I would like to raise a ValueError if the argument is out of range. Problem is that this will break code.
+1 After all, code is already breaking.
I could just force all negative values to all values outside the proper bounds to a reasonable value, but that seems to go against the path of least surprise. That is question 1.
-1
Question 2 is what to really check. This really is only a concern for month and day of the week since everything else is just a number and doesn't have some name representation. I could check all 9 values, though, or just these two.
+0
Question 3 is whether to extend this to time.asctime() . I have talked to Tim about this and his thoughts are to just deal with time.strftime() and leave everything else alone. That's fine with me, but there is the same possibility of having problems with asctime(). But then again, checking value for asctime() would potentially break even more code.
+1 Behavior of similar functions should be as similar as possible. While I see Raymond's point about this being a less likely bug to hit, why not make the fix now while a pair of eyeballs that knows the code is already looking? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Do not taunt happy fun for loops. Do not change lists you are looping over." --Remco Gerlich, comp.lang.python
participants (5)
-
Aahz
-
Brett C.
-
Guido van Rossum
-
Raymond Hettinger
-
Tim Peters