Let me present the issue differently.

On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 am EST. (UTC uses a 24 hour clock.)

We have a tzinfo subclass representing the US Eastern (hybrid) timezone whose primary responsibility is to translate from local time in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us how much of that offset is due to DST: Eastern.dst(dt).

It is crucial to understand that with "Eastern" as tzinfo, there is only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But the problem would still exist if it arbitrarily decided that it was EST.)

It is also crucial to understand that we have no direct way to translate UTC to Eastern. We only have a direct way to translate Eastern to UTC (by subtracting Eastern.utcoffset(dt)).

Usually, the utcoffset() for times that differ by only a few hours is the same, so we can approximate the reverse mapping (i.e. from UTC to Eastern) by using the utcoffset() for the input time and assuming that it is the same as that for the output time. Code for this is:

    # Initially dt is a datetimetz expressed in UTC whose tzinfo is None
    dt = dt.replace(tzinfo=Eastern)
    dt = dt + Eastern.utcoffset(dt)

This is not sufficient, however, close to the DST switch. For example, let's try this with an initial dt value of 4:30 am UTC on Oct 27, 2002. The code applies the UTC offset corresponding to 4:30 am Eastern, which is -5 hours (EST), so the result is 11:30 pm the previous day. But this is wrong! 11:30 pm Eastern that day is in DST, so the UTC offset should be -4 hours.

We can know we must make a correction, because we can compare the UTC offset of the result to the UTC offset of the input, and see that they differ. But what correction to make?

The problem is that when the input is 6:30 UTC, the result is 1:30 am Eastern, which is still taken to be EDT. If we apply the same correction as we did for 4:30 UTC, we get 2:30 am Eastern, but that's wrong, because that's in EST, corresponding to 7:30 UTC. But if we don't apply a correction, and stick with 1:30 am Eastern, we've got a time that corresponds to 5:30 UTC. So what time corresponds to 6:30 UTC?

The problem for astimezone() is to come up with the correct result whenever it can, and yet somehow to fudge things so that 6:30 UTC gets translated to 1:30 Eastern. And astimezone() must not make many assumptions about the nature of DST. It can assume that the DST correction is >= 0 and probably less than 10 hours or so, and that DST changes don't occur more frequently than twice a year (once on and once off), and that the DST correction is constant during the DST period, and that the only variation in UTC offset is due to DST.

But Tim has already taken all of that into account -- read his proof at the end of datetime.py:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/datetime/datetime.py?rev=1.140&content-type=text/vnd.viewcvs-markup

Can you do better?

--Guido van Rossum (home page: http://www.python.org/~guido/)
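To make the failure concrete, here is a minimal sketch of the two-line conversion from the post. It assumes a hypothetical Eastern2002 tzinfo subclass hard-wired to the 2002 switch dates, and it uses today's datetime class where the post says datetimetz; the class name and the helper naive_utc_to_local() are illustrative and do not come from the thread.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


class Eastern2002(tzinfo):
    """Hypothetical US Eastern zone, valid for 2002 only (an assumption)."""

    # In 2002, DST ran from 2am on Apr 7 to 2am (wall clock) on Oct 27.
    DST_START = datetime(2002, 4, 7, 2)
    DST_END = datetime(2002, 10, 27, 2)

    def dst(self, dt):
        # dt is read as local wall-clock time; 1:30 am on Oct 27 is
        # arbitrarily treated as EDT, as described in the post.
        wall = dt.replace(tzinfo=None)
        return HOUR if self.DST_START <= wall < self.DST_END else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def naive_utc_to_local(utc, tz):
    """Apply the offset of the *input* time and hope it fits the output too."""
    dt = utc.replace(tzinfo=tz)
    return dt + tz.utcoffset(dt)


eastern = Eastern2002()
result = naive_utc_to_local(datetime(2002, 10, 27, 4, 30), eastern)
print(result.replace(tzinfo=None))  # wall-clock result of the naive conversion
print(result.utcoffset())           # offset the zone reports for that result
```

The offset used for the conversion (-5 hours) and the offset the zone reports for the result (-4 hours) disagree, which is the signal that a correction is needed.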
Guido van Rossum wrote:
Let me present the issue differently.
On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 am EST. (UTC uses a 24 hour clock.)
We have a tzinfo subclass representing the US Eastern (hybrid) timezone whose primary responsibility is to translate from local time in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us how much of that offset is due to DST: Eastern.dst(dt).
It is crucial to understand that with "Eastern" as tzinfo, there is only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But the problem would still exist if it arbitrarily decided that it was EST.)
It is also crucial to understand that we have no direct way to translate UTC to Eastern. We only have a direct way to translate Eastern to UTC (by subtracting Eastern.utcoffset(dt)).
Why don't you take a look at how this is done in mxDateTime? It has support for the C lib API timegm() (present in many C libs) and includes a work-around which works for most cases, even close to the DST switch time.

BTW, you should also watch out for broken mktime() implementations and whether the C lib supports leap seconds or not. That has bitten me a few times too.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
Guido van Rossum wrote:
Let me present the issue differently.
On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 am EST. (UTC uses a 24 hour clock.)
We have a tzinfo subclass representing the US Eastern (hybrid) timezone whose primary responsibility is to translate from local time in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us how much of that offset is due to DST: Eastern.dst(dt).
It is crucial to understand that with "Eastern" as tzinfo, there is only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But the problem would still exist if it arbitrarily decided that it was EST.)
It is also crucial to understand that we have no direct way to translate UTC to Eastern. We only have a direct way to translate Eastern to UTC (by subtracting Eastern.utcoffset(dt)).
Why don't you take a look at how this is done in mxDateTime ?
I looked at the code, but I couldn't find where it does conversion between arbitrary timezones -- almost all timezone-related code seems to have to do with parsing timezone names and specifications.
It has support for the C lib API timegm() (present in many C libs) and includes a work-around which works for most cases; even close to the DST switch time.
A goal of the new datetime module is to avoid all dependency on the C library's time facilities -- we must support calculations outside the range that the C library can deal with.
BTW, you should also watch out for broken mktime() implementations and whether the C lib supports leap seconds or not. That has bitten me a few times too.
Ditto. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Guido van Rossum wrote:
Let me present the issue differently.
On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 am EST. (UTC uses a 24 hour clock.)
We have a tzinfo subclass representing the US Eastern (hybrid) timezone whose primary responsibility is to translate from local time in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us how much of that offset is due to DST: Eastern.dst(dt).
It is crucial to understand that with "Eastern" as tzinfo, there is only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But the problem would still exist if it arbitrarily decided that it was EST.)
It is also crucial to understand that we have no direct way to translate UTC to Eastern. We only have a direct way to translate Eastern to UTC (by subtracting Eastern.utcoffset(dt)).
Why don't you take a look at how this is done in mxDateTime ?
I looked at the code, but I couldn't find where it does conversion between arbitrary timezones -- almost all timezone-related code seems to have to do with parsing timezone names and specifications.
It doesn't do conversion between time zone, but it does provide you with the offset information from UTC to local time in both directions.
It has support for the C lib API timegm() (present in many C libs) and includes a work-around which works for most cases; even close to the DST switch time.
A goal of the new datetime module is to avoid all dependency on the C library's time facilities -- we must support calculations outside the range that the C library can deal with.
I don't see how that can be done for time zones and DST. Timezones and even more the DST settings change more often for various locales than you think, so assumptions about the offset between UTC and local time for the future as well as for historical dates can easily be wrong. The tz data used by most C libs has tables which account for many of the known offsets in the past; they can only guess about the future. The only usable time scale for historic and future date/time is UTC. The same is true if you're interested in date/time calculations in terms of absolute time. Now, for current time zones, the C lib is a good source of information, so I don't see why you wouldn't want to use it.
BTW, you should also watch out for broken mktime() implementations and whether the C lib supports leap seconds or not. That has bitten me a few times too.
Ditto.
-- Marc-Andre Lemburg
Now, for current time zones, the C lib is a good source of information, so I don't see why you wouldn't want to use it.
It seems you haven't been following this discussion. The issue is not how to get information about timezones. The issue is, given an API that implements an almost-but-not-quite-reversible function from local time to UTC, how to invert that function. Please go read the datetime Wiki before commenting further in this thread. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Now, for current time zones, the C lib is a good source of information, so I don't see why you wouldn't want to use it.
It seems you haven't been following this discussion. The issue is not how to get information about timezones. The issue is, given an API that implements an almost-but-not-quite-reversible function from local time to UTC, how to invert that function.
I have to admit I've had trouble following the discussion too, until you described it that way. Correct me if I'm wrong: you need the inverse of a mathematical function.

    utc_time = f(local_time)

I'll represent the inverse as "g":

    local_time = g(utc_time)

The graph of "f" looks like an upward slope, but it has little holes and overlaps at daylight savings boundaries. The inverse is almost as strange, with periodic small jumps up and down.

You'd like to use the time zone information provided by the C library in the computation of "f", but the C library doesn't quite provide all the information you need to compute "g" correctly. With those requirements, your only hope of computing "g" is to make some assumptions about "f".

That sounds perfectly reasonable, but may I suggest moving the assumption by changing the interface of the tzinfo class. The utcoffset() method leads one to naively assume that functions f and g can both depend reliably on utcoffset(). Instead, tzinfo might have two methods, to_local(utc_date) and to_utc(local_date). That way, the tzinfo object encapsulates the madness.

One downside is that then you can't expect normal programmers to write a correct tzinfo based on the C libraries. They'll never get it right. :-) It would have to be supplied with Python.

Shane
The issue is, given an API that implements an almost-but-not-quite-reversible function from local time to UTC, how to invert that function.
I have to admit I've had trouble following the discussion too, until you described it that way. Correct me if I'm wrong: you need the inverse of a mathematical function.
utc_time = f(local_time)
I'll represent the inverse as "g":
local_time = g(utc_time)
The graph of "f" looks like an upward slope, but it has little holes and overlaps at daylight savings boundaries. The inverse is almost as strange, with periodic small jumps up and down.
Yes, exactly. The bizarre thing is that g() is a true function, with a shape like this:

[ASCII-art graph of g(): a rising diagonal of '*' points that jumps up at A and drops back down at B, with p, q, t, u marked on the y-axis and A, B, C, D marked on the x-axis]

Here the x-axis is UTC, and the y-axis is local time. The 'o' points are a feeble attempt at drawing the end points of a half-open interval. Points A and B mark vertical lines at the DST switch points: DST is active between A and B (and again between C and D, etc.).

This makes f(), its inverse, not quite a true function in the mathematical sense: in [p, q) it has no value, and in [t, u) it is two-valued.

(To see f()'s graph, just transpose the above graph in your head. :-)

But our tzinfo implementation in fact implements f(), and makes it into a real function by assigning some output value to inputs in [p, q) and by picking one of the two possible outputs for inputs in the range [t, u).

Now when we want to translate from UTC to local time, we have to recover the parameters of g() by interpreting f()!

(There's more, but I don't want to spend all day writing this.)
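The gap and the overlap can be observed directly. The sketch below is only an illustration under stated assumptions: Eastern2002 is a hypothetical zone hard-wired to the 2002 switch dates, and f() is written as a plain helper rather than any method of the datetime module.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class Eastern2002(tzinfo):
    """Hypothetical Eastern zone for 2002; treats 1:MM am on Oct 27 as EDT."""

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = datetime(2002, 4, 7, 2) <= wall < datetime(2002, 10, 27, 2)
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


def f(local, tz):
    """Local wall-clock time -> UTC, the only direction tzinfo gives directly."""
    return local - tz.utcoffset(local.replace(tzinfo=tz))


tz = Eastern2002()

# Image under f() of every local minute from midnight to 4 am on the
# fall-back day: no local time lands on 6:30 UTC.
midnight = datetime(2002, 10, 27)
fall_image = {f(midnight + timedelta(minutes=m), tz) for m in range(4 * 60)}
unreachable = datetime(2002, 10, 27, 6, 30) not in fall_image

# On the spring-forward day 2:30 am does not exist on the wall clock, but
# f() still assigns it a value, which collides with that of a real time.
collision = f(datetime(2002, 4, 7, 2, 30), tz) == f(datetime(2002, 4, 7, 1, 30), tz)

print(unreachable, collision)
```

An inverse built purely by probing f() therefore has nothing to return for UTC times in the unreachable hour, which is the case astimezone() has to fudge.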
You'd like to use the time zone information provided by the C library in the computation of "f", but the C library doesn't quite provide all the information you need to compute "g" correctly. With those requirements, your only hope of computing "g" is to make some assumptions about "f".
Yes, except the C library doesn't enter into it.
That sounds perfectly reasonable, but may I suggest moving the assumption by changing the interface of the tzinfo class. The utcoffset() method leads one to naively assume that functions f and g can both depend reliably on utcoffset(). Instead, tzinfo might have two methods, to_local(utc_date) and to_utc(local_date). That way, the tzinfo object encapsulates the madness.
This is similar to Tim's suggestion of letting the tzinfo subclass implement fromutc(). We may have to do this.
One downside is that then you can't expect normal programmers to write a correct tzinfo based on the C libraries. They'll never get it right. :-) It would have to be supplied with Python.
This is one of the reasons against it; we can't possibly supply timezone implementations for every country, so we really need people to write their own. Maybe this is what Marc-Andre was hinting at: apparently mxDateTime knows how to access the C library's timezone tables. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Now when we want to translate from UTC to local time, we have to recover the parameters of g() by interpreting f()!
(There's more, but I don't want to spend all day writing this.)
That was a great ASCII art graph, though. It made me smile. :-)
Maybe this is what Marc-Andre was hinting at: apparently mxDateTime knows how to access the C library's timezone tables.
I was working on a possible solution when I stumbled across the fact that the current tzinfo documentation doesn't seem to specify whether the dst() method expects the "dt" argument to be in terms of UTC or local time. Sometimes when working on the problem I assumed dt was in UTC, making the conversion from UTC to local time easy, and at other times I assumed dt was in local time, making the conversion from local time to UTC easy. Which is it?

Once that's decided, it seems like the "hard" case (whichever is the hard one) could be solved by first computing the UTC offset at the time requested, then computing the UTC offset at a time adjusted by the offset. If the two computed offsets are different, you know you've straddled a daylight savings boundary, and maybe the second offset is the correct offset. That's just a guess.

feebly-trying-to-catch-up-to-tims-genius-ly y'rs,

Shane
I was working on a possible solution when I stumbled across the fact that the current tzinfo documentation doesn't seem to specify whether the dst() method expects the "dt" argument to be in terms of UTC or local time. Sometimes when working on the problem I assumed dt was in UTC, making the conversion from UTC to local time easy, and at other times I assumed dt was in local time, making the conversion from local time to UTC easy. Which is it?
Local time. This is the source of most problems!
Once that's decided, it seems like the "hard" case (whichever is the hard one) could be solved by first computing the UTC offset at the time requested, then computing the UTC offset at a time adjusted by the offset. If the two computed offsets are different, you know you've straddled a daylight savings boundary, and maybe the second offset is the correct offset. That's just a guess.
You're slowly rediscovering the guts of datetimetz.astimezone(). Have a look at the Python code in python/nondist/sandbox/datetime/datetime.py before you go any further. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
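Shane's two-offset guess can be tried out as written. The following is a sketch of that guess only, not of the actual astimezone() algorithm; Eastern2002, two_step_guess() and round_trip() are hypothetical names introduced for the illustration.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class Eastern2002(tzinfo):
    """Hypothetical Eastern zone for 2002; treats 1:MM am on Oct 27 as EDT."""

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = datetime(2002, 4, 7, 2) <= wall < datetime(2002, 10, 27, 2)
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


def two_step_guess(utc, tz):
    """UTC -> local wall clock: retry with the second offset if it differs."""
    first = tz.utcoffset(utc.replace(tzinfo=tz))
    candidate = utc + first
    second = tz.utcoffset(candidate.replace(tzinfo=tz))
    if second != first:
        candidate = utc + second
    return candidate


def round_trip(utc, tz):
    """Convert to local with the guess, then back to UTC via utcoffset()."""
    local = two_step_guess(utc, tz)
    return local - tz.utcoffset(local.replace(tzinfo=tz))


tz = Eastern2002()
for hour in (4, 5, 6):
    utc = datetime(2002, 10, 27, hour, 30)
    print(utc.time(), two_step_guess(utc, tz).time(), round_trip(utc, tz) == utc)
```

With this zone the guess repairs the 4:30 UTC case from the start of the thread, but for 6:30 UTC it lands on 2:30 am, which the zone reads as EST and maps back to 7:30 UTC. That is the "what correction to make?" problem again.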
[Shane Hathaway]
I was working on a possible solution when I stumbled across the fact that the current tzinfo documentation doesn't seem to specify whether the dst() method expects the "dt" argument to be in terms of UTC or local time.
I'm not keeping the plain-text docs up to date anymore, but the doc.txt under Zope3's src/datetime/ sez:

    An instance of (a concrete subclass of) tzinfo can be passed to the constructors for datetimetz and timetz objects. The latter objects view their fields as being in local time, and the tzinfo object supports ...

    ...

    These methods are called by a datetimetz or timetz object, in response to their methods of the same names. A datetimetz object passes itself as the argument, and a timetz object passes None as the argument. A tzinfo subclass's methods should therefore be prepared to accept a dt argument of None, or of class datetimetz.

    When None is passed, it's up to the class designer to decide the best response. For example, ...

    ...

    When a datetimetz object is passed in response to a datetimetz method, dt.tzinfo is the same object as self. tzinfo methods can rely on this, unless user code calls tzinfo methods directly. The intent is that the tzinfo methods interpret dt as being in local time, and not need to worry about objects in other timezones.

So I can't tell you what *your* dst() method should expect if you call it directly, but I can (and do) tell you that whenever the implementation calls a tzinfo method by magic, the argument will be None, or a datetimetz with a matching tzinfo member and is to be viewed as local time (hmm -- perhaps the distinction between self's notion of local time and your own notion of local time remains unclear).
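A small sketch of the calling convention quoted above, using today's datetime and time classes in place of datetimetz and timetz. EasternSketch is a hypothetical class, and returning a zero DST adjustment for None is just one possible design choice, not something the docs prescribe.

```python
from datetime import datetime, time, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


class EasternSketch(tzinfo):
    """Hypothetical zone showing both kinds of dt argument (2002 rules only)."""

    def dst(self, dt):
        if dt is None:
            # Called on behalf of a time object: no date is available,
            # so this sketch simply reports standard time.
            return ZERO
        # Called on behalf of a datetime object: it passes itself.
        assert dt.tzinfo is self
        wall = dt.replace(tzinfo=None)
        in_dst = datetime(2002, 4, 7, 2) <= wall < datetime(2002, 10, 27, 2)
        return HOUR if in_dst else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


tz = EasternSketch()
from_datetime = datetime(2002, 7, 1, 12, tzinfo=tz).utcoffset()  # dt is the datetime
from_time = time(12, 0, tzinfo=tz).utcoffset()                   # dt is None
print(from_datetime, from_time)
```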
... Once that's decided, it seems like the "hard" case (whichever is the hard one) could be solved by first computing the UTC offset at the time requested, then computing the UTC offset at a time adjusted by the offset. If the two computed offsets are different, you know you've straddled a daylight savings boundary, and maybe the second offset is the correct offset. That's just a guess.
For a formal proof <wink>, see the long comment at the end of Zope3's src/datetime/_datetime.py (which I keep in synch with the Python sandbox version Guido pointed you at).
feebly-trying-to-catch-up-to-tims-genius-ly y'rs, Shane
It's a lousy 3-segment step function. This isn't genius, it's just a stubborn refusal to give up <0.7 wink>.

if-you're-on-an-irritating-project-it's-energizing-to-attack-a-piece-of-it-you-hate-ly y'rs - tim
On Fri, Jan 03, 2003 at 05:04:18PM -0500, Guido van Rossum wrote:
| > One downside is that then you can't expect normal programmers to write a
| > correct tzinfo based on the C libraries. They'll never get it right.
| > :-) It would have to be supplied with Python.
|
| This is one of the reasons against it; we can't possibly supply
| timezone implementations for every country, so we really need people
| to write their own.
Especially when the daylight saving time is not always fixed, like in
Brazil, where last year the govt simply decided that it would start
on a different date to save something like 0.001% more energy :)
--
Sidnei da Silva (dreamcatcher)
[Shane Hathaway]
... That sounds perfectly reasonable, but may I suggest moving the assumption by changing the interface of the tzinfo class. The utcoffset() method leads one to naively assume that functions f and g can both depend reliably on utcoffset(). Instead, tzinfo might have two methods, to_local(utc_date) and to_utc(local_date). That way, the tzinfo object encapsulates the madness.
I think we may need from_utc() before this is over, but that most people won't have any need for it. In the other direction, it's already the tzinfo subclass author's responsibility to ensure that the current:

    d - d.utcoffset()

yields exactly the same date and time members as would the hypothesized:

    d.to_utc()
One downside is that then you can't expect normal programmers to write a correct tzinfo based on the C libraries. They'll never get it right. :-) It would have to be supplied with Python.
I doubt the latter will happen, and it certainly won't happen for 2.3.

The current scheme has actually become about as easy as it can become. From the next iteration of the docs, here's a full implementation of a class for DST-aware major US time zones (using the rules that have been in effect for more than a decade):

"""
from datetime import tzinfo, timedelta, datetime

ZERO = timedelta(0)
HOUR = timedelta(hours=1)

def first_sunday_on_or_after(dt):
    days_to_go = 6 - dt.weekday()
    if days_to_go:
        dt += timedelta(days_to_go)
    return dt

# In the US, DST starts at 2am (standard time) on the first Sunday in
# April.
DSTSTART = datetime(1, 4, 1, 2)
# and ends at 2am (DST time; 1am standard time) on the last Sunday
# of October, which is the first Sunday on or after Oct 25.
DSTEND = datetime(1, 10, 25, 2)

class USTimeZone(tzinfo):

    def __init__(self, hours, reprname, stdname, dstname):
        self.stdoffset = timedelta(hours=hours)
        self.reprname = reprname
        self.stdname = stdname
        self.dstname = dstname

    def __repr__(self):
        return self.reprname

    def tzname(self, dt):
        if self.dst(dt):
            return self.dstname
        else:
            return self.stdname

    def utcoffset(self, dt):
        return self.stdoffset + self.dst(dt)

    def dst(self, dt):
        if dt is None or dt.tzinfo is None:
            # An exception may be sensible here, in one or both cases.
            # It depends on how you want to treat them.  The astimezone()
            # implementation always passes a datetimetz with
            # dt.tzinfo == self.
            return ZERO
        assert dt.tzinfo is self

        # Find first Sunday in April & the last in October.
        start = first_sunday_on_or_after(DSTSTART.replace(year=dt.year))
        end = first_sunday_on_or_after(DSTEND.replace(year=dt.year))

        # Can't compare naive to aware objects, so strip the timezone
        # from dt first.
        if start <= dt.replace(tzinfo=None) < end:
            return HOUR
        else:
            return ZERO

Eastern  = USTimeZone(-5, "Eastern",  "EST", "EDT")
Central  = USTimeZone(-6, "Central",  "CST", "CDT")
Mountain = USTimeZone(-7, "Mountain", "MST", "MDT")
Pacific  = USTimeZone(-8, "Pacific",  "PST", "PDT")
"""

The test suite beats the snot out of this class, and .astimezone() behaves exactly as we've talked about here in all cases now, whether Eastern or Pacific (etc) are source zones or target zones or both. But the coding is really quite simple, doing nothing more nor less than implementing "the plain rules". (BTW, note that no use is made of the platform C time functions here)

A similar class for European rules can be found in EU.py in the Python datetime sandbox, and is just as straightforward (relative to the complexity inherent in those rules).

Because the only strong assumption astimezone() makes is that

    tz.utcoffset(d) - tz.dst(d)   # tz's "standard offset"

is invariant wrt d, it should work fine for tzinfo subclasses that want to use different switch points in different years, or have multiple DST periods in a year (including none at all in some years), etc. So long as a time zone's "standard offset" depends only on a location's longitude, astimezone() is very likely to do the right thing no matter how goofy the rest of the zone is.

So, at the moment, I don't have an actual use case in hand anymore that requires a from_utc() method. astimezone() could be written in terms of it, though:

    def astimezone(self, tz):
        self -= self.utcoffset()  # as UTC
        other = self.replace(tzinfo=tz)
        return other.from_utc()

and the tzinfo base class could supply a default from_utc() method capturing the current astimezone() implementation. Then we'd have a powerful hook tzinfo subclasses could override -- but I'm not sure anyone will find a need to!
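The invariance assumption on the "standard offset" can be probed empirically. The sketch below samples a year hour by hour; both zone classes and the helper standard_offsets() are hypothetical, and ShiftingZone is a made-up counterexample, not a real time zone.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


class Eastern2002(tzinfo):
    """Hypothetical Eastern zone for 2002: fixed standard offset plus DST."""

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = datetime(2002, 4, 7, 2) <= wall < datetime(2002, 10, 27, 2)
        return HOUR if in_dst else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


class ShiftingZone(tzinfo):
    """Made-up zone whose *standard* offset moves on July 1 with no DST flag."""

    def dst(self, dt):
        return ZERO

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        return timedelta(hours=-5 if wall < datetime(2002, 7, 1) else -4)


def standard_offsets(tz, year):
    """Collect utcoffset(d) - dst(d) over hourly wall-clock samples of a year."""
    seen = set()
    wall = datetime(year, 1, 1)
    while wall.year == year:
        d = wall.replace(tzinfo=tz)
        seen.add(tz.utcoffset(d) - tz.dst(d))
        wall += HOUR
    return seen


print(len(standard_offsets(Eastern2002(), 2002)))   # one value: invariant
print(len(standard_offsets(ShiftingZone(), 2002)))  # more than one: assumption broken
```

A zone of the second kind is where the assumption fails, and where a conversion hook supplied by the zone itself would be needed.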
So, at the moment, I don't have an actual use case in hand anymore that requires a from_utc() method. astimezone() could be written in terms of it, though:
    def astimezone(self, tz):
        self -= self.utcoffset()  # as UTC
        other = self.replace(tzinfo=tz)
        return other.from_utc()
and the tzinfo base class could supply a default from_utc() method capturing the current astimezone() implementation. Then we'd have a powerful hook tzinfo subclasses could override -- but I'm not sure anyone will find a need to!
That's a potentially powerful idea (but call it fromutc() since utcoffset() doesn't have an underscore either). I'd also then perhaps favor the idea of implementing utcoffset() in that base class as returning a fixed standard offset plus whatever dst() returns -- though that requires us to fix the name and type of the offset, and may require a special case for when dst() returns None, so maybe it's not worth it. --Guido van Rossum (home page: http://www.python.org/~guido/)
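For comparison, today's standard library does expose such a hook as tzinfo.fromutc(), which datetime.astimezone() calls on the target zone. The sketch below is a hedged illustration: Eastern2002 relies on the inherited default, while Eastern2002Exact is a hypothetical subclass that overrides the hook with switch points expressed in UTC. Both class names and the helper utc_to_local() are assumptions made for the example.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class Eastern2002(tzinfo):
    """Hypothetical Eastern zone for 2002 using the inherited fromutc()."""

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = datetime(2002, 4, 7, 2) <= wall < datetime(2002, 10, 27, 2)
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


class Eastern2002Exact(Eastern2002):
    """Same zone, but fromutc() decides DST from the UTC fields directly."""

    def fromutc(self, dt):
        # dt.tzinfo is self and dt's fields are to be read as UTC.
        utc = dt.replace(tzinfo=None)
        # 2am EST on Apr 7 is 7:00 UTC; 2am EDT on Oct 27 is 6:00 UTC.
        in_dst = datetime(2002, 4, 7, 7) <= utc < datetime(2002, 10, 27, 6)
        return dt + timedelta(hours=-4 if in_dst else -5)


def utc_to_local(utc, tz):
    """Attach tz to a naive UTC value and let the zone's hook convert it."""
    return tz.fromutc(utc.replace(tzinfo=tz))


default_result = utc_to_local(datetime(2002, 10, 27, 4, 30), Eastern2002())
exact = Eastern2002Exact()
print(default_result.time())
print(utc_to_local(datetime(2002, 10, 27, 5, 30), exact).time(),
      utc_to_local(datetime(2002, 10, 27, 6, 30), exact).time())
```

With the override, both 5:30 UTC and 6:30 UTC come out as 1:30 on the wall clock, which is the behaviour asked for at the start of the thread. The zone's utcoffset() still labels that wall-clock time as EDT, so the result cannot round-trip for 6:30 UTC.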
In case it's of interest: http://www.twinsun.com/tz/tz-link.htm David LeBlanc Seattle, WA USA
Guido van Rossum wrote:
Now, for current time zones, the C lib is a good source of information, so I don't see why you wouldn't want to use it.
It seems you haven't been following this discussion. The issue is not how to get information about timezones. The issue is, given an API that implements an almost-but-not-quite-reversible function from local time to UTC, how to invert that function.
I am not talking about how to get the current timezone; the C lib APIs provide functions to convert between local time and UTC -- that's what I was referring to. Local time is all about timezones and DST, which is why conversions between UTC and local time always have to deal with timezones and DST.
Please go read the datetime Wiki before commenting further in this thread.
Sorry to have bothered you,
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
[Guido]
A goal of the new datetime module is to avoid all dependency on the C library's time facilities -- we must support calculations outside the range that the C library can deal with.
[M.-A. Lemburg]
I don't see how that can be done for time zones and DST.
You may be missing that datetime supplies no time zone classes, not even a class for UTC. What it does provide is an abstract base class (tzinfo), and a protocol users can follow if they want to supply concrete time zone subclasses of their own. datetimetz.astimezone() is a pretty general tz conversion routine that works with the tzinfo protocol, but datetime supplies no objects astimezone can work *with* out of the box. The time zone rules a user can support are thus whatever can be expressed by arbitrary user-written Python code -- but they have to write that code themself (or talk someone else into writing it for them).
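For concreteness, here is a small sketch of what following the protocol looks like from the user's side: a subclass supplies utcoffset(), dst() and tzname(), and astimezone() does the rest. The UTC and FixedOffset classes are illustrative user-written code, not something the module ships in the form shown.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)


class UTC(tzinfo):
    """User-written UTC class."""

    def utcoffset(self, dt):
        return ZERO

    def dst(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"


class FixedOffset(tzinfo):
    """User-written zone a fixed number of minutes east of UTC, with no DST."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return ZERO

    def tzname(self, dt):
        return self._name


# 5:30 UTC viewed in a zone four hours west of UTC.
utc_time = datetime(2002, 10, 27, 5, 30, tzinfo=UTC())
local_time = utc_time.astimezone(FixedOffset(-4 * 60, "EDT"))
```

Anything fancier (DST rules, historical tables) is a matter of putting more logic into those three methods.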
Timezones and even more the DST settings change more often for various locales than you think, so assumptions about the offset between UTC and local time for the future as well as for historical dates can easily be wrong.
Since the datetime module supplies no concrete time zone objects, it makes no concrete time zone assumptions <wink> (whether about past, present, or future).
The tz data used by most C libs has tables which account for many of the known offsets in the past; they can only guess about the future.
A user who wants to use such tables will have to write Python code to read them up. If they want their code to search the web for updates and incorporate them on the fly, they can do that too.
The only usable time scale for historic and future date/time is UTC. The same is true if you're interested in date/time calculations in terms of absolute time.
Users who buy that can pay for it <wink>. Note that datetime doesn't support years outside the range 1-9999, so its appeal to astronomers and ancient history buffs is limited anyway.
Now, for current time zones, the C lib is a good source of information, so I don't see why you wouldn't want to use it.
As with all the rest here, users are free to, if that's what they want. datetime just supplies a framework for what are essentially pluggable time zone strategy objects.
[M.-A. Lemburg]
Why don't you take a look at how this is done in mxDateTime ? It has support for the C lib API timegm() (present in many C libs) and includes a work-around which works for most cases; even close to the DST switch time.
BTW, you should also watch out for broken mktime() implementations and whether the C lib support leap seconds or not. That has bitten me a few times too.
I think there's a relevant difference in datetime: it makes almost no use of timestamps. There is no datetime method that returns a timestamp, for example. All we've got are "backward compatibility" constructors that will build a datetime object from a timestamp, if you insist <wink>. Those use the platform localtime() and gmtime() functions, and inherit whatever limitations and problems the C libraries have. Eek -- that reminds me, I should add code to clamp out tm_sec values of 60 and 61. The broken-out year, month, etc, struct tm members aren't combined again internally either, as dates and times in this module are stored with distinct year, month, etc fields. It's not clear what you mean by "broken mktime() implementations", but the implementation of datetime never calls the platform mktime(). The test suite does, though.
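The tm_sec clamping mentioned above could look something like the following. This is only a sketch of the idea, not the module's code, and datetime_from_struct_time() is a hypothetical helper name.

```python
import time
from datetime import datetime


def datetime_from_struct_time(tm):
    """Build a naive datetime from a struct_time, clamping leap seconds."""
    # struct tm allows tm_sec values of 60 and 61; datetime accepts 0..59 only.
    second = min(tm.tm_sec, 59)
    return datetime(tm.tm_year, tm.tm_mon, tm.tm_mday,
                    tm.tm_hour, tm.tm_min, second)


# A broken-out time carrying a leap second, as a C library might report it.
leap = time.struct_time((2016, 12, 31, 23, 59, 60, 5, 366, 0))
clamped = datetime_from_struct_time(leap)
```

Clamping loses the distinction between 23:59:59 and 23:59:60, which is the price of not modelling leap seconds at all.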
[Guido, on astimezone() assumptions] Most of these assumptions aren't needed by the current implementation. The crucial assumption (used in every step) is that tz.utcoffset(d) - tz.dst(d) is invariant across all d with d.tzinfo == tz. Apart from that, only one other assumption is made, at the end:
... It can assume that the DST correction is >= 0 and probably less than 10 hours or so,
Those aren't needed (or used).
and that DST changes don't occur more frequently than twice a year (once on and once off),
Ditto -- although you'll grow another unspellable hour each time DST ends.
and that the DST correction is constant during the DST period,
A weaker form of that is needed at the end. I didn't get around to writing the end of the proof yet. At that point, we've got a guess z'. The missing part of the proof is that z' is UTC-equivalent to the input datetime if and only if
(z' + z'.dst()).dst() == z'.dst()
Intuitively, and because we know that z'.dst() != 0 at this point in the algorithm, it's saying the result is correct iff "moving a little farther into DST still leaves us in DST". For a class like Eastern, it fails to hold iff we start with the unspellable hour at the end of daylight time: 6:MM UTC maps to z' == 1:MM Eastern, which appears to be daylight time. z'.dst() returns 60 minutes then, but (z' + z'.dst()).dst() == (1:MM Eastern + 1 hour).dst() == (2:MM Eastern).dst() == 0 != 60 then.
Another way this *could* hypothetically fail is if .dst() returned different non-zero values at different times. But the assumption is only needed in that specific expression, so it should be OK if distinct daylight periods have distinct dst() offsets (although maintaining the utcoffset() - dst() is-a-constant invariant appears unlikely then).
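The closing test is easy to try out. The sketch below assumes a simplified Eastern class using the 2002 US rules (first Sunday in April to last Sunday in October) that reads 1:MM on the switch day as daylight time; the class and the guess_is_correct() helper are illustrations written for this note, not the sandbox code.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


def first_sunday_on_or_after(dt):
    days_to_go = 6 - dt.weekday()  # Monday is 0, Sunday is 6
    return dt + timedelta(days_to_go)


class Eastern(tzinfo):
    """Simplified US Eastern zone, 2002-era rules (illustration only)."""

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def dst(self, dt):
        # Daylight time runs from 2:00 on the first Sunday in April up to
        # 2:00 on the last Sunday in October, so 1:MM on that last day is
        # taken to be daylight time.
        start = first_sunday_on_or_after(datetime(dt.year, 4, 1, 2))
        end = first_sunday_on_or_after(datetime(dt.year, 10, 25, 2))
        return HOUR if start <= dt.replace(tzinfo=None) < end else ZERO


def guess_is_correct(z):
    """The closing test; z is a guess for which z.dst() is nonzero."""
    return (z + z.dst()).dst() == z.dst()


midsummer = datetime(2002, 7, 4, 12, 0, tzinfo=Eastern())
unspellable = datetime(2002, 10, 27, 1, 30, tzinfo=Eastern())
```

With these definitions guess_is_correct(midsummer) holds, while guess_is_correct(unspellable) does not: adding the hour lands on 2:30, where dst() is zero.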
and that the only variation in UTC offset is due to DST.
That's not used either -- although, again, the assumption that "the standard offset" (utcoffset-dst) is a constant is hard to maintain in a wacky time zone.
Tim & I spent some more time in front of a whiteboard today. We've found a solution that takes the ValueError away. It needs to make assumptions about what the tzinfo implementation does with the impossible and ambiguous times at the DST switch points. Tim thinks that this is the same solution that Aahz arrived at with handwaving.
--Guido van Rossum (home page: http://www.python.org/~guido/)