[Twisted-Python] Scalability of timers
Hi,
I have a question regarding the scalability of timers in Twisted.
Say I have a massive number of periodic timers (let's say each with a period of 1s, but all slightly time-shifted relative to each other).
As far as I understand, timers are ultimately implemented by setting the timeout parameter when calling into the OS select/poll/epoll/kqueue.
If this is true, then the number of syscalls scales linearly with the number of timers. This can get limiting (the total number of syscalls a Linux box can sustain is a few hundred thousand per second). As more and more timers are set up, the timeout will essentially approach 0. On the upside, timers will fire precisely.
However, say I am fine with a precision of 1ms.
Is there a way to limit the syscall rate to 1000/s (given no FD activity happens) _independently_ of the number of timers set up?
Timers that fall into a given ms slice would all fire at roughly the same time (still ordered).
Is that possible?
Thanks, Tobias
There is only one select() call (or whatever) at any given time, regardless of how many timers there are. Syscalls are thus O(1). Timers are stored in sorted order. When the event loop wakes up, it removes the timers that have been reached, which is fast because they're sorted: as soon as you hit one that is still in the future, you can stop. So that's pretty scalable.
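The sorted-timer idea described above can be sketched in a few lines of plain Python. This is only an illustration, not Twisted's code: the class `TimerQueue` and its methods `call_at`, `next_timeout` and `run_expired` are hypothetical names made up for this sketch.

```python
import bisect


class TimerQueue:
    """Toy model of an event loop's timer list, kept sorted by deadline.

    Hypothetical illustration only; this is not Twisted's implementation.
    """

    def __init__(self):
        self._timers = []  # (deadline, seq, callback) tuples in sorted order
        self._seq = 0      # unique tie-breaker, so callbacks are never compared

    def call_at(self, deadline, callback):
        bisect.insort(self._timers, (deadline, self._seq, callback))
        self._seq += 1

    def next_timeout(self, now):
        """Timeout for the single select()-style call, or None when idle."""
        if not self._timers:
            return None
        return max(0.0, self._timers[0][0] - now)

    def run_expired(self, now):
        """Fire all due timers, stopping at the first one still in the future."""
        cut = 0
        while cut < len(self._timers) and self._timers[cut][0] <= now:
            cut += 1
        expired = self._timers[:cut]
        del self._timers[:cut]
        for _, _, callback in expired:
            callback()
        return len(expired)
```

However many timers are queued, each loop iteration needs only one `next_timeout()` value for its single blocking call, and `run_expired()` never looks past the first timer that is not yet due.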
There is only one select() call (or whatever) at any given time, regardless of how many timers.
Yes, I do understand this.
Syscalls are thus O(1). Timers are stored in sorted order. When the event loop wakes up, it removes the timers that have been reached, which is fast because they're sorted: as soon as you hit one that is still in the future, you can stop. So that's pretty scalable.
Syscalls are O(1), but the constant is non-zero. A syscall is still quite expensive. Try doing 1 million syscalls/sec on any x86 box (Linux, BSD, whatever). DEC Alphas and Itanium might be able to do more, but the context switching overhead of the x86 architecture is "huge".
But I feel I failed to formulate what I am asking. Could you please correct where my thinking below goes wrong?
Let's say I issue 1 million timers with expiry times t0, t0+1us, t0+2us, ..., t0+1s.
That is 1 million timers expiring at a 1us pitch within 1s.
That will mean 1 million select() syscalls done in 1s, each with the timeout set to 1us.
Since it's unlikely that a box supports a rate of 1 million syscalls/sec, a select() with a timeout of 1us won't return after 1us, but after more than 1us. Twisted will then process all timers that have expired in the meantime. But there is no way of letting 1 million timers fire at a 1us pitch within 1s using a syscall.
What I am after is to explicitly _control_ the maximum syscall rate to select(), not simply max out the box on syscall rate.
Like: limit the syscall rate to select() to 1000Hz, regardless of how many timers I issue per second.
In the above example, with the syscall maximum set to 1000Hz, 1000 timers would have expired each time select() returns. That's fine. That's what I want.
How can I do that?
Thanks! Tobias
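The arithmetic in this example can be checked with a small stdlib-only sketch. It assumes the simplest coalescing rule, rounding every deadline up to the next tick boundary, and takes t0 as 0 with deadlines in integer microseconds. The function name `coalesce` is made up for the illustration.

```python
from collections import Counter


def coalesce(deadlines_us, tick_us):
    """Round each deadline up to the next tick boundary.

    Returns a Counter mapping wakeup time (us) to the number of timers that
    would fire at that wakeup.
    """
    return Counter(-(-d // tick_us) * tick_us for d in deadlines_us)


# One million timers at a 1 us pitch: t0+1us ... t0+1s, with t0 taken as 0.
deadlines = range(1, 1_000_001)
per_wakeup = coalesce(deadlines, tick_us=1000)
print(len(per_wakeup), "wakeups, largest batch:", max(per_wakeup.values()))
# prints: 1000 wakeups, largest batch: 1000
```

With a 1 ms tick, the million deadlines collapse into 1000 wakeups of 1000 timers each, which is the trade of precision for wakeup rate being asked about.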
On 09:38 pm, tobias.oberstein@tavendo.de wrote:
What I am after is to explicitly _control_ the maximum syscall rate to select() - not simply max. out the box on syscall rate.
Like: limit syscall rate to select() at 1000Hz - regardless how many timers I issue per second.
In other words:
If you ask Twisted to wake up N times per second, is there an API to
make Twisted wake up M (M
Is that what you're looking for?
Yes, exactly. I want to trade precision (timers fire at less exact times) for higher efficiency (fewer context switches). /Tobias
On 08/10/2014 06:23 PM, Tobias Oberstein wrote:
I want to trade precision (timers fire at less exact times) for higher efficiency (fewer context switches).
It's easy enough to write one yourself. This might work:
    from twisted.internet.task import Clock, LoopingCall
    clock = Clock()
    LoopingCall(lambda: clock.advance(0.001)).start(0.001)
Now just do "clock.callLater" instead of "reactor.callLater".
Oh, cool. That makes me smile ;) It does what I want, and it is simple and portable. Great.
My only worry is http://twistedmatrix.com/trac/browser/tags/releases/twisted-14.0.0/twisted/i...
Why does it sort after each and every callLater?
And: http://twistedmatrix.com/trac/browser/tags/releases/twisted-14.0.0/twisted/i...
It also sorts after each firing of a delayed call. Presumably because that delayed call might schedule another call that might also fire in the same time period?
/Tobias
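For comparison, here is a minimal stdlib-only sketch of a coarse clock that keeps its pending calls in a heap instead of re-sorting a list. It is a hypothetical stand-in written for illustration, not Twisted's `Clock`. The names `CoarseClock`, `call_later` and `advance` are assumptions of this sketch, and cancellation is left out.

```python
import heapq
import itertools


class CoarseClock:
    """Hypothetical clock that only moves forward when advance() is called."""

    def __init__(self):
        self.now = 0.0
        self._heap = []                # (deadline, seq, func, args)
        self._seq = itertools.count()  # keeps FIFO order for equal deadlines

    def call_later(self, delay, func, *args):
        entry = (self.now + delay, next(self._seq), func, args)
        heapq.heappush(self._heap, entry)  # O(log N), no full sort

    def advance(self, amount):
        """Move time forward and run everything now due, in deadline order."""
        self.now += amount
        while self._heap and self._heap[0][0] <= self.now:
            _, _, func, args = heapq.heappop(self._heap)
            # func may call call_later(); anything it schedules that is
            # already due is picked up by this same loop.
            func(*args)
```

Driving `advance(tick)` from one periodic reactor timer, as in the `LoopingCall` suggestion above, would mean the reactor sees a single timer no matter how many calls are queued on the coarse clock.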
Hey Tobias,
Individual OSes have their own mechanisms for avoiding the kind of waste you're describing. For example, Linux quite aggressively rounds up the expiry of certain classes of timer at progressively less granular intervals the further in the future they're scheduled ("timer coalescing").
When Twisted wakes, there is no guarantee that only one timer has expired by then. In fact, under load you would expect the select loop to always be running (and thus timing out) late, so each iteration may process several timers at once.
Twisted will set the select() timeout to the timer due to expire the earliest. Finding this timer is a constant-time operation. There is only ever one select() (or select-equivalent) call active at a time.
The Twisted timer implementation internally uses a heap, so scheduling and expiry are quite efficient, O(log N). With 4 billion timers active, scheduling a new timer would in the worst case require about 32 array elements to be swapped.
On Sun, Aug 10, 2014 at 05:16:51AM -0700, Tobias Oberstein wrote:
Hi,
I have a question regarding scalability of timers in Twisted.
Say I have a massive number of periodic timers (let's say each with a period of 1s, but all slightly time-shifted relative to each other).
As far as I understand, timers are ultimately implemented by setting the timeout parameter when calling into the OS select/poll/epoll/kqueue.
If this is true, then the number of syscalls scales linearly with the number of timers. This can get limiting (the total number of syscalls a Linux box can sustain is a few hundred thousand per second). As more and more timers are set up, the timeout will essentially approach 0. On the upside, timers will fire precisely.
However, say I am fine with a precision of 1ms.
Is there a way to limit the syscall rate to 1000/s (given no FD activity happens) _independently_ of the number of timers set up?
Timers that fall into a certain ms slice would all fire roughly at the same time (still ordered).
Is that possible?
Thanks, Tobias
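As a rough check of the heap figures in the reply above, the worst-case number of swaps for a push onto a binary heap is the depth at which the new element lands. The helper name `max_swaps_on_push` below is made up for this sketch. The second half uses the standard `heapq` module to show that the earliest deadline is always at index 0.

```python
import heapq


def max_swaps_on_push(existing):
    """Worst-case sift-up swaps when pushing onto a heap of `existing` items."""
    # The new item lands at 0-based index `existing`, whose depth in the
    # implicit binary tree is floor(log2(existing + 1)).
    return (existing + 1).bit_length() - 1


print(max_swaps_on_push(2**32 - 1))      # 32
print(max_swaps_on_push(4_000_000_000))  # 31

# Peeking at the earliest deadline is O(1): it is always the first element.
timers = []
for deadline in (5.0, 1.0, 3.0):
    heapq.heappush(timers, deadline)
earliest = timers[0]
```

So "about 32" holds for roughly 4 billion timers. The exact bound is 31 or 32 depending on which side of 2**32 the count falls.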
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Individual OSes have their own mechanisms for avoiding the kind of waste you're describing. For example, Linux quite aggressively rounds up the expiry of certain classes of timer at progressively less granular intervals the further in the future they're scheduled ("timer coalescing").
As far as I understand, Twisted implements timers via the timeout parameter to select(), not using Linux explicit timers. But the coalescing you describe would also apply to those implicit timers (created from select(timeout = ..) within the Linux kernel)?
When Twisted wakes, there is no guarantee that only one timer has expired by then. In fact, under load you would expect the select loop to always be running (and thus timing out) late, so each iteration may process several timers at once.
Twisted will set the select() timeout to the timer due to expire the earliest. Finding this timer is a constant-time operation. There is only ever one select() (or select-equivalent) call active at a time.
The Twisted timer implementation internally uses a heap, so scheduling and expiry are quite efficient, O(log N). With 4 billion timers active, scheduling a new timer would in the worst case require about 32 array elements to be swapped.
This is all fine. But how do I _explicitly_ limit the rate at which select() is called to, say, 1000Hz (at the expense of timer precision)?
I don't want to let the box hit its syscall rate limit, because the box will then spend a fair amount of resources on context switching all the time with no real gain.
Thanks for your hints and patience, Tobias
On Sun, Aug 10, 2014 at 02:51:02PM -0700, Tobias Oberstein wrote:
But the coalescing you describe would also apply to those implicit timers (created from select(timeout = ..) within the Linux kernel)?
It applies to all kernel timers not created by realtime processes.
This is all fine. But how do I _explicitly_ limit the rate at which select() is called to say 1000Hz (at the expense of timer precision)?
I don't want to let the box hit its syscall rate limit, because the box will then spend a fair amount of resources on context switching all the time with no real gain.
From a reading of http://lwn.net/Articles/296578/, at time of writing the default select() implementation coalesces sub-second timeouts to 50us boundaries, and this can be adjusted via prctl(PR_SET_TIMERSLACK) (http://linux.die.net/man/2/prctl) on a per-process basis.
That article is from 2008, and though the relevant kernel code seems to match the article content, a huge number of power-efficiency-related changes have gone into the kernel since then. My assumption is that nowadays the kernel rounds more aggressively than the default of 50us documented by that article.
Short answer is yes, you can set the max Hz of select(), but in all likelihood you won't have to. As always, benchmarking and profiling real code might reveal this to be a non-issue.
David
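For anyone who wants to experiment with the prctl(PR_SET_TIMERSLACK) knob from Python, a hedged sketch using ctypes follows. It is Linux-only. The option numbers 29 and 30 are the values of PR_SET_TIMERSLACK and PR_GET_TIMERSLACK in <linux/prctl.h>, and the helper name `try_set_timer_slack` is made up for this example. According to the prctl man page the setting applies to the calling thread, and whether it measurably changes the wakeup rate of a given event loop is something to benchmark rather than assume.

```python
import ctypes
import sys

PR_SET_TIMERSLACK = 29  # option numbers from <linux/prctl.h>
PR_GET_TIMERSLACK = 30


def try_set_timer_slack(nanoseconds):
    """Set the calling thread's timer slack on Linux.

    Returns the slack value read back from the kernel, or None if the call
    is unavailable or fails (for example on a non-Linux platform).
    """
    if not sys.platform.startswith("linux"):
        return None
    try:
        prctl = ctypes.CDLL(None, use_errno=True).prctl
    except (OSError, AttributeError):
        return None
    zero = ctypes.c_ulong(0)
    if prctl(PR_SET_TIMERSLACK, ctypes.c_ulong(nanoseconds), zero, zero, zero) != 0:
        return None
    current = prctl(PR_GET_TIMERSLACK, zero, zero, zero, zero)
    return current if current >= 0 else None


if __name__ == "__main__":
    # Ask for 1 ms of slack, i.e. allow timeouts to be grouped within 1 ms.
    print(try_set_timer_slack(1_000_000))
```

Calling this once at startup, before the reactor runs, would be the natural place to try it.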
But the coalescing you describe would also apply to those implicit timers (created from select(timeout = ..) within the Linux kernel)?
It applies to all kernel timers not created by realtime processes.
Alright. I see.
This is all fine. But how do I _explicitly_ limit the rate at which select() is called to say 1000Hz (at the expense of timer precision)?
I don't want to let the box hit its syscall rate limit, because the box will then spend a fair amount of resources on context switching all the time with no real gain.
From a reading of http://lwn.net/Articles/296578/, at time of writing the default select() implementation coalesces sub-second timeouts to 50us boundaries, and this can be adjusted via prctl(PR_SET_TIMERSLACK) (http://linux.die.net/man/2/prctl) on a per-process basis.
Even more interesting! This might be what I'm looking for .. need to read through.
That article is from 2008, and though the relevant kernel code seems to match the article content, a huge number of power-efficiency-related changes have gone into the kernel since then. My assumption is that nowadays the kernel rounds more aggressively than the default of 50us documented by that article.
Ah, ok. 50us corresponds to 20k syscalls/sec .. which seems well below the syscall rate limit on modern boxes.
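The conversion used here is just the reciprocal of the coalescing granularity. A tiny sketch, with a made-up helper name, assuming every timeout expiry snaps to a multiple of the granularity and ignoring wakeups caused by FD activity:

```python
def max_timeout_wakeups_per_second(granularity_us):
    """Upper bound on timeout-driven wakeups at this expiry granularity."""
    return 1_000_000 // granularity_us


for granularity_us in (50, 1000):
    rate = max_timeout_wakeups_per_second(granularity_us)
    print(granularity_us, "us ->", rate, "wakeups/s")
```

A 50us granularity gives the 20k/s bound mentioned above, and 1ms gives the 1000/s target from the original question.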
Short answer is yes, you can set the max hz of select(), but in all likelihood you won't have to. As always, benchmarking and profiling real code might reveal this to be a non-issue.
Thanks a lot for those hints! I will read into this material. /Tobias
participants (4)
- dw+twisted-python@hmmz.org
- exarkun@twistedmatrix.com
- Itamar Turner-Trauring
- Tobias Oberstein