Mailman 3 [Twisted-Python] LoopingCall.withCount countCallable called with 0 - Twisted

[Twisted-Python] LoopingCall.withCount countCallable called with 0

older
[Twisted-Python] Weekly Bug Summary

Dario Vinella

24 Mar 2015 24 Mar '15

4:49 a.m.

Hi all, I found an odd behavior using a LoopingCall.withCount with a very small interval (0.02 sec): sometimes the countCallable is called with 0 as argument and it looks like the now - _realLastTime is smaller than the looping interval. Here you can find the very simple code I use to reproduce the "anomaly", with a main loopingCall and another one that prints its internals https://gist.github.com/dvinella/31d94f9dd0e47586658d and here an extract of the logs it produces https://gist.github.com/dvinella/8cfdce0b9f5fdda7b375 The doc for the withCount states that the argument should be normally 1, higher if the reactor blocks for some reason, but it doesn't say anything about 0. The beahavior is quite misleading since the loop seems to anticipate the interval timeout, saying "I've been called, but I should have not been called" I must add that the anomaly doesn't appear right at the start, but after tens of seconds, mostly when the system is under medium/heavy load. So is it only a documentation issue for an exceptional case, due to timers imprecision, that should be handled at higher levels, or is it a real issue? dario

Attachments:

attachment.htm (text/html — 1.4 KB)

Show replies by date

Glyph Lefkowitz

24 Mar 24 Mar

5:18 p.m.

New subject: [Twisted-Python] LoopingCall.withCount countCallable called with 0

...

On Mar 24, 2015, at 2:49 AM, Dario Vinella <d.vinella@gmail.com> wrote:

Hi all,

I found an odd behavior using a LoopingCall.withCount with a very small interval (0.02 sec): sometimes the countCallable is called with 0 as argument and it looks like the now - _realLastTime is smaller than the looping interval.

So, this can definitely happen if time goes backwards, but it has to go backwards between the time that the reactor starts executing timed calls and the time that LoopingCall looks at the timestamp to compute what the value passed to withCount should be. If it were smaller than the LoopingCall's time-to-run

...

Here you can find the very simple code I use to reproduce the "anomaly", with a main loopingCall and another one that prints its internals

https://gist.github.com/dvinella/31d94f9dd0e47586658d <https://gist.github.com/dvinella/31d94f9dd0e47586658d>

An even more straightforward way to figure out if it's a problem with your machine would be this: import time from itertools import count then = time.time() for ever in count(): now = time.time() print("{},{}".format(ever, (now - then) * 10e6)) then = now It would be good to look at how many of these numbers are negative.

...

and here an extract of the logs it produces

https://gist.github.com/dvinella/8cfdce0b9f5fdda7b375 <https://gist.github.com/dvinella/8cfdce0b9f5fdda7b375>

The doc for the withCount states that the argument should be normally 1, higher if the reactor blocks for some reason, but it doesn't say anything about 0. The beahavior is quite misleading since the loop seems to anticipate the interval timeout, saying "I've been called, but I should have not been called"

That seems right. I think there's a real problem here, since physical clocks do just fail, and sometimes NTP slew has to happen. LoopingCall.withCount has a clear opportunity to notice that time has gone backwards, and can just skip that call. This is assuming that time has gone backwards by a tiny amount for a brief period of time, though, and does not address the more complex issue of "the clock was set some significant distance into the past". LoopingCall still isn't going to be able to deal with large amounts of clock drift, we'd need a monotonic version for that :).

...

I must add that the anomaly doesn't appear right at the start, but after tens of seconds, mostly when the system is under medium/heavy load.

Even seriously broken clocks don't go backwards a lot. ;-)

...

So is it only a documentation issue for an exceptional case, due to timers imprecision, that should be handled at higher levels, or is it a real issue?

I think it's a real issue. The point of withCount is supposed to be that it keeps track of how far into a sequence you should advance, for example, into the frames of an animation. If your callback advances by N frames each time it's called, it's supposed to be up-to-date with "real time". Assuming that the reason time is going backwards is that your ntp daemon is correcting your clock, every time withCount's callback gets called with 0, _realLastTime gets updated, which means that the "frame" we should have advanced is simply lost, and it will slowly fall behind real time each time this happens. So while the calls with 0 themselves are a minor annoyance, they are a symptom of a problem that makes it impossible for LoopingCall to do its job properly in some cases, and should definitely be fixed. Thanks for reporting! Can you open a ticket? -glyph

Dario Vinella

26 Mar 26 Mar

4:06 a.m.

New subject: [Twisted-Python] LoopingCall.withCount countCallable called with 0

2015-03-24 23:18 GMT+01:00 Glyph Lefkowitz <glyph@twistedmatrix.com>:

...

It would be good to look at how many of these numbers are negative.

Not a single one, running the script for a couple of hours, under any condition

...

That seems right. I think there's a real problem here, since physical clocks do just fail, and sometimes NTP slew has to happen. LoopingCall.withCount has a clear opportunity to notice that time has gone backwards, and can just skip that call. This is assuming that time has gone backwards by a tiny amount for a brief period of time, though, and does not address the more complex issue of "the clock was set some significant distance into the past". LoopingCall still isn't going to be able to deal with large amounts of clock drift, we'd need a monotonic version for that :).

I must add that the anomaly doesn't appear right at the start, but after tens of seconds, mostly when the system is under medium/heavy load.

Even seriously broken clocks don't go backwards a *lot*. ;-)

I can reproduce the anomaly with my previous script on my linux machine with or without ntp (systemd-timesyncd) enabled and on a macbook (with kqreactor). The only difference is that since the mac is beefier than the linux box I need to stress it more to trigger the anomaly (a couple of vms + some heavy apps) I think the ntp cause can be excluded since the issue is reproducible without it and furthermore because once triggered the anomaly happens too frequently to think the ntpd applies so many (negative) fixes in such short time. Here it is a log where I print a dot when the calls arg is 1, the value if not 1, each dot is separated by 0.02s https://gist.github.com/dvinella/73e5e45a04c428f037bf

...

Thanks for reporting! Can you open a ticket?

Sure, I'll open it!

Glyph

9 May 9 May

3:04 p.m.

New subject: [Twisted-Python] LoopingCall.withCount countCallable called with 0

...

On Mar 26, 2015, at 2:06 AM, Dario Vinella <d.vinella@gmail.com> wrote:

Here it is a log where I print a dot when the calls arg is 1, the value if not 1, each dot is separated by 0.02s

https://gist.github.com/dvinella/73e5e45a04c428f037bf <https://gist.github.com/dvinella/73e5e45a04c428f037bf>

Thanks for reporting! Can you open a ticket?

Sure, I'll open it!

I just wanted to follow up with this - https://twistedmatrix.com/trac/ticket/7836 <https://twistedmatrix.com/trac/ticket/7836> is the ticket in question and it is now fixed on trunk, pending the next release. -glyph

Dario Vinella

10 May 10 May

2:38 a.m.

New subject: [Twisted-Python] LoopingCall.withCount countCallable called with 0

Thank you glyph and moshez for working on this! 2015-05-09 22:04 GMT+02:00 Glyph <glyph@twistedmatrix.com>:

...

On Mar 26, 2015, at 2:06 AM, Dario Vinella <d.vinella@gmail.com> wrote:

Here it is a log where I print a dot when the calls arg is 1, the value if not 1, each dot is separated by 0.02s

https://gist.github.com/dvinella/73e5e45a04c428f037bf

...
Thanks for reporting! Can you open a ticket?

Sure, I'll open it!

I just wanted to follow up with this - https://twistedmatrix.com/trac/ticket/7836 is the ticket in question and it is now fixed on trunk, pending the next release.

-glyph

_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

3470

Age (days ago)

3517

Last active (days ago)

List overview

Download

4 comments

3 participants

participants (3)

Dario Vinella
Glyph
Glyph Lefkowitz

[Twisted-Python] LoopingCall.withCount countCallable called with 0

Dario Vinella

Glyph Lefkowitz

Dario Vinella

Glyph

Dario Vinella

tags

participants (3)