[Tim]
... Inside pytz you already know everything that can be known about transitions. You don't "poke and hope" to do that, you do a binary search, right? You find the zoneinfo record for the time of interest, and compare that to the transitions on either side to deduce whether there's a fold or gap in play.
I should try to flesh that out. I don't work with tzfiles (I'm on Windows - nobody gives a shit about history on Windows ;-) ). My understanding is that you have a sorted listed of transition times, in UTC. I'll call that list `tt` (transition times). Also a parallel list of (at least) total UTC offsets. I'll call that list `ofs`.
`fold` should be 1 if and only if the UTC time `u` is the later of times ambiguous in the local zone. Ambiguous times exist if and only if a transition decreases the total UTC offset. There are some before the transition ("at the end of its life"), and some after the transition ("at the start of its life"). fold is 1 only in the latter case (the later of ambiguous times).
So first do a binary seach on `tt` to find the largest transition time <= u. Call the index `i`. You must already have code to do that, yes?
Then:
if i == 0: # First entry in the file - there is no # transition _to_ this. return 0 # How much did the total UTC offset change? delta = ofs[i] - ofs[i-1] if delta >= 0: # The offset didn't change, or the # offset increased so this a gap (not # fold) case. return 0 # delta < 0, so the offset decreased. All # and only the times from tt[i] up to (but # not including) tt[i] - delta ("-" because # delta is negative) are the later of # ambiguous times. return int(u - tt[i] < -delta)
# or int(tt[i] - u > delta) to save a "-" # at the cost of making it incomprehensible
All of that may be wrong ;-) The _point_ is that the base ideas are so straightforward that if the code works in practice for some non-trivial case, it probably works for almost all cases. I don't know enough about tzfiles to be sure of anything. In particular, a `delta` of 0 _may_ mean there's some kind of more-or-less artificial (having nothing to do with offsets) entry in the file, and it's necessary to search back for the closest preceding offset that's _not_ the same as ofs[i]. For example, maybe just the name of the zone changed? If so, then runtime search could be avoided by pre-computing such stuff when loading the tzfile to begin with. For that matter, a parallel `delta` list could be precomputed once-&-for-all. This _could_ be made to run very fast. Getting the right answer is minor compared to running fast ;-)
Then compare those 6 lines of code to the incredible gyrations any implementation of mktime() goes through to get is_dst right (if you can find a mktime() implementation that always does get it right).