time and scheduling (was: bug report: [ #447945 ] time.time() is not non-decreasing)

zooko at zooko.com
Thu Aug 9 13:50:54 EDT 2001


 * Can I just ignore this thread and keep using `time.time()'?

Not if you want your events to execute in the right order.  The natural way to
schedule events in Python is to use the result of `time.time()' to key events
in a sorted list.  (For example, the original version of the Mojo Nation[1]
scheduler[2], and the recently announced "Selecting" scheduler[3], both do
this.)

Most of the time, this does what you think it does.  In some rare cases, or in
some very common cases on certain platforms, it does *not* do what you think it
does, and it causes your events to fire in the wrong order, which can do
arbitrarily bad things to your users.  The fact that people naturally assume
that this is the way to do it, and that when they do so it works *most* of the
time, makes it a particularly insidious problem.  
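To make the failure concrete, here is a minimal sketch of the "sorted list
keyed by timestamp" pattern.  It is not the Mojo Nation scheduler or the
"Selecting" scheduler; `run_naive_scheduler' and `make_fake_clock' are
hypothetical names, and the fake clock stands in for `time.time()' so that the
backward step is reproducible.

```python
import bisect


def make_fake_clock(readings):
    """Hypothetical stand-in for time.time() that replays fixed readings."""
    it = iter(readings)
    return lambda: next(it)


def run_naive_scheduler(clock, names):
    """Schedule each name "right now", keyed by the clock's answer, and
    return the names in the order the sorted list would fire them."""
    queue = []  # sorted list of (timestamp, name)
    for name in names:
        bisect.insort(queue, (clock(), name))
    return [name for _, name in queue]


# A well-behaved clock preserves the order in which events were scheduled...
good = run_naive_scheduler(make_fake_clock([10.0, 10.1, 10.2]), ["a", "b", "c"])

# ...but a single backward step makes "c" fire before "b".
bad = run_naive_scheduler(make_fake_clock([10.0, 10.1, 10.05]), ["a", "b", "c"])
```

With the second clock, "c" was scheduled last but sorts ahead of "b", which is
exactly the kind of reordering described above.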


 * What went wrong in the first place?

Some events in Mojo Nation were getting executed in the wrong order, causing
all kinds of weird errors.  At first I thought this must be due to a bug in the
code that scheduled the events, but that code checked out okay, so then 
I thought it must be a bug in Mojo Nation's event scheduler, but it got a clean
bill of health, too.  Then I wrote a test script which tested whether the
answers returned from Python's `time.time()' were non-decreasing, and 
I discovered that `time.time()' does actually go *backwards in time* pretty
often!
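A test along these lines can be sketched as follows.  This is not the original
script, only an illustration of the idea; the function name and the `samples'
parameter are assumptions.

```python
import time


def count_backward_steps(clock=time.time, samples=100000):
    """Call clock() repeatedly and count how often a reading is smaller
    than the reading just before it."""
    backward = 0
    previous = clock()
    for _ in range(samples):
        current = clock()
        if current < previous:
            backward += 1
        previous = current
    return backward


if __name__ == "__main__":
    print("backward steps:", count_backward_steps())
```

On many machines this will print zero, which is consistent with the reports
that the behavior depends on the hardware and platform.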

I reported this[4] to the Python people, who helped me figure out that
`time.time()' returns the answer from the system call `gettimeofday()', if it
is available, and that on my computer `gettimeofday()' can go backwards in
time!

Many people reported that this doesn't happen on their computer, and Johannes
Stezenbach helpfully pointed out [5] that the variable-speed CPU in my laptop
is probably the reason why I get this behavior frequently and other people
don't.  I also got a report of the same behavior from a user that does not have
a laptop but who does have two CPUs in his computer.

So I was about to report this as a bug in Linux's `gettimeofday()', when
suddenly I realized that *this is not a bug*.


 * What do you mean by "time", exactly?

There are two notions of "time" that you might want to use in a computer
program.  One is the "global, absolute time".  This could mean UTC,
astronomical time, Gregorian Calendar time, or whatever, but the important
thing is that "what time it is" in this sense is *defined* outside of your
application and indeed outside of your computer.  If you want to use this kind
of time in your code, you have to be prepared for your idea of "what time it
is" to change out from under you, and you have to trust remote computers and
your users to keep you informed of what time it "really" is.

We do not use this notion of time in Mojo Nation.  At least, not on purpose.  
I believe that using this notion of time is unreliable (for obvious reasons)
and insecure (since it exposes the behavior of your application to manipulation
by an external party who can influence your computer's idea of "the current
time").

The other notion of "time" is local, always increasing, and always moving
forward at about the same speed (depending on the quality of your hardware
clock).  This notion of time is useful *only* for *relative* time between two
events in your application.  For example, you may want to know how much time
passed between the beginning and the end of running a certain routine, and
you may want to schedule an event to be run "180 seconds from now".

Now `gettimeofday()' is defined to report the former notion of time.
Therefore, there is absolutely no guarantee about how the answers from
`gettimeofday()' behave with respect to the actual local time.  Not only the
strange behavior of Linux's timer loop on variable-speed and SMP systems, but
also skew correction, automatic remote synchronization a la NTP, manual
adjustment by the user, and Daylight Savings Time adjustments can cause the
time reported by `gettimeofday()' to be unpredictable.

(Tim Peters pointed this out more succinctly.[6])


 * How do we work around the problem?

The current solution that we've deployed in Mojo Nation is the addition of an
"IncreasingTimer" class [7], which guarantees that the answer will be
monotonically increasing.  In addition, it guarantees that the "delta" between
the answers that it returns and the underlying answer from `time.time()' is
monotonically non-decreasing.
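The two guarantees can be sketched roughly as below.  This is not the code
posted at [7], only one way to satisfy the description; the `clock' and
`epsilon' parameters are assumptions made for illustration.

```python
import time


class IncreasingTimer:
    """Sketch: answers are strictly increasing, and the delta between the
    answer and the underlying clock never shrinks."""

    def __init__(self, clock=time.time, epsilon=1e-6):
        self._clock = clock      # underlying clock, time.time() by default
        self._epsilon = epsilon  # assumed minimum step between two answers
        self._delta = 0.0        # amount added to every raw reading
        self._last = None        # last answer handed out

    def time(self):
        raw = self._clock()
        answer = raw + self._delta
        if self._last is not None and answer <= self._last:
            # The underlying clock stalled or went backwards.  Step just
            # past the last answer and grow the delta to match.
            answer = self._last + self._epsilon
            self._delta = answer - raw
        self._last = answer
        return answer
```

If the raw clock reads 100, 101, 50, 51, the answers are roughly 100, 101,
101.000001, 102.000001: after the backward jump the timer keeps advancing at
the raw clock's pace instead of waiting for it to catch up.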

Bengt Richter and Ken Seehof [8, 9] each suggested removing the delta
guarantee, which would make sense if you wanted IncreasingTimer to approximate
global time, but I think not if you want IncreasingTimer to approximate local
time.

This delta feature is important for the use of the Mojo Nation scheduler --
without it, if the underlying hardware clock were set back a day, then for the
next day "local time" would seem to slow to a crawl -- every time you called
`time()', you would get a number only slightly greater than the previous time.
Therefore, if you asked the Mojo Nation scheduler to execute a certain event
"10 seconds from now", it would never get around to it.
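The crawl is easy to reproduce with a simulated clock.  The sketch below uses
a hypothetical `NoDeltaTimer' that is strictly increasing but drops the delta
guarantee; the class, the `epsilon' step, and the one-reading-per-second
polling loop are all assumptions for illustration.

```python
class NoDeltaTimer:
    """Hypothetical variant: strictly increasing, but it simply follows the
    raw clock whenever the raw clock is ahead (no delta guarantee)."""

    def __init__(self, clock, epsilon=1e-6):
        self._clock = clock
        self._epsilon = epsilon
        self._last = None

    def time(self):
        raw = self._clock()
        if self._last is None or raw > self._last:
            self._last = raw
        else:
            # Raw clock is behind: creep forward by the smallest step.
            self._last += self._epsilon
        return self._last


DAY = 86400.0
raw_now = [1000000.0]              # simulated underlying clock reading
timer = NoDeltaTimer(lambda: raw_now[0])

start = timer.time()
deadline = start + 10.0            # "10 seconds from now"
raw_now[0] -= DAY                  # underlying clock is set back a day

# An hour of real time passes, polled once per second.
for _ in range(3600):
    raw_now[0] += 1.0
    now = timer.time()

progress = now - start             # only a few thousandths of a second
fired = now >= deadline            # the 10-second event is still pending
```

After a simulated hour the timer has advanced by only a few milliseconds, so
the event is still waiting.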

On the other hand, *with* the delta guarantee, if the underlying clock leaps
*backwards* a day and then leaps *forwards* a day, then the event scheduler
will immediately try to process all pending events that were scheduled to go
off until the next day.  This isn't great, but I don't see how to get around
it, and anyway even without the delta you still have the same problem if the
underlying clock just leaps forwards a day from a standing start.


 * What should the Python authors (and Linux kernel hackers) do about it?

At the very least Python's `time.time()' ought to be documented to warn people
against using it in this way.  But what should it suggest that they do instead?

Much nicer would be if the Python authors could somehow offer me access to a
"local time" clock which was guaranteed not to go backwards *nor* to leap
forwards dramatically.

Of course, we can't actually *prevent* the superuser from tinkering with such a
clock, but we *can* define it as a clock that shouldn't be tinkered with,
unlike the `gettimeofday()' clock, which is defined as needing to be tinkered
with in order to keep approximating universal time.  
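For readers on later Python versions: `time.monotonic()' was added in Python
3.3, well after this message, and is documented as a clock that cannot go
backwards and is not affected by system clock updates.  A minimal sketch of
using it for the two relative-time jobs mentioned earlier follows; the helper
names `elapsed' and `deadline_after' are hypothetical.

```python
import time


def elapsed(func, clock=time.monotonic):
    """Run func() and return (result, seconds taken) using a relative clock."""
    start = clock()
    result = func()
    return result, clock() - start


def deadline_after(seconds, clock=time.monotonic):
    """Return the clock reading at which an event "seconds from now" is due.
    The value is only meaningful when compared with the same clock."""
    return clock() + seconds
```

The readings from such a clock have no defined relationship to UTC, so they
are suitable for scheduling and for measuring intervals but not for
time-stamping.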


Regards,

Zooko

Journeyman Hacker and Consultant
http://zooko.com/intro.html

P.S.  I just realized that we *do* use the notion of absolute time in Mojo
Nation for one purpose: to time-stamp the logs in UTC format so that you can
visually compare the logs from two separate instances of Mojo Nation and,
provided that their clocks were properly synched with NTP, correlate messages
sent between them.

P.P.S.  I *love* this list.  I got lots of quick responses and friendly
cooperation, from running my test script to suggesting variations of
IncreasingTimer and showing how to do an IEEE 754 double hack [9, 10].  
I remain convinced that the friendliness and energy of the Python community is
one of the most important factors in Python's success.

[1] http://mojonation.net/
[2] http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/mojonation/evil/common/DoQ.py?content-type=text/plain
[3] http://mail.python.org/pipermail/python-dev/2001-July/016000.html
[4] http://mail.python.org/pipermail/python-list/2001-August/058296.html
[5] http://mail.python.org/pipermail/python-list/2001-August/058332.html
[6] http://mail.python.org/pipermail/python-list/2001-August/058381.html
[7] http://mail.python.org/pipermail/python-list/2001-August/058363.html
[8] http://mail.python.org/pipermail/python-list/2001-August/058394.html
[9] http://mail.python.org/pipermail/python-list/2001-August/058384.html
[10] http://mail.python.org/pipermail/python-list/2001-August/058383.html




