[Twisted-Python] Question about "starving" the reactor

Hello, I have a question about the "best practices" of making callbacks and making sure you don't hog the reactor. So you pass a Deferred to a client, who attaches a chain of callbacks that might do some CPU-intensive stuff. How should one guard against that? The obvious solution to me for the server part would be to do reactor.callLater(0, d.callback, arg) instead of d.callback(arg). What about the client part? What would be the best way to have a chain of callbacks executed in such a way that the reactor isn't blocked during that time? Thanks, Orestis

On 06/20/2011 10:04 AM, Orestis Markou wrote:
Hello,
I have a question about the "best practices" on making callbacks and making sure you don't hog the reactor.
So you pass a deferred to a client, who attaches a chain of callbacks
Who is "you" and who is "client"?
that might do some CPU-intensive stuff. How should one guard against that? The obvious solution to me for the server part would be to do
AIUI, callbacks/errbacks (or indeed any Twisted function) should not block, which means if they're doing any significant quantity of work they should deferToThread or use a task.Cooperator to break it into chunks. You can return a Deferred from a callback to pause processing, so this is very easy to implement:

    def my_callback(data):
        d = deferToThread(factor_some_prime, data['number'])
        d.addCallback(lambda x: 'prime factors are ' + repr(x))
        return d
reactor.callLater(0, d.callback, arg)
That can help in some cases. Specifically if you're receiving datagrams, you might want to service the read() loop as much as possible before packets start to get dropped. But if d.callback is going to do a lot of work, it doesn't solve the problem - just delays it. callbacks/errbacks should not do a lot of work.
instead of
d.callback(arg)
What about the client part? What would be the best way to have a
I don't really understand what you mean by client and server part. A deferred is just a deferred. They don't even have to be used in a network context.

reactor.callLater(0, d.callback, arg)
That can help in some cases. Specifically if you're receiving datagrams, you might want to service the read() loop as much as possible before packets start to get dropped. But if d.callback is going to do a lot of work, it doesn't solve the problem - just delays it.
callbacks/errbacks should not do a lot of work.
instead of
d.callback(arg)
What about the client part? What would be the best way to have a
I don't really understand what you mean by client and server part. A deferred is just a deferred. They don't even have to be used in a network context.
By "server" I mean my application's API, by "client" I mean someone else that will go and attach callbacks to deferreds I will return. I guess the question is, what is the best way to guard my application against callbacks that are doing a lot of work. But probably I'm confused by conflating the neglecting read() loop with starving the reactor. It might be there's no way to do guarantee something like that, so it might be just that everyone should be careful about this.

On 20/06/11 11:29, Orestis Markou wrote:
By "server" I mean my application's API, by "client" I mean someone else that will go and attach callbacks to deferreds I will return. I guess the question is, what is the best way to guard my application against callbacks that are doing a lot of work. But probably I'm confused by conflating the neglecting read() loop with starving the reactor.
It might be there's no way to do guarantee something like that, so it might be just that everyone should be careful about this.
I don't think you can stop callers of a function doing silly things with the return value.

On Jun 20, 2011, at 7:39 AM, Phil Mayers wrote:
On 20/06/11 11:29, Orestis Markou wrote:
By "server" I mean my application's API, by "client" I mean someone else that will go and attach callbacks to deferreds I will return. I guess the question is, what is the best way to guard my application against callbacks that are doing a lot of work. But probably I'm confused by conflating the neglecting read() loop with starving the reactor.
It might be there's no way to do guarantee something like that, so it might be just that everyone should be careful about this.
I don't think you can stop callers of a function doing silly things with the return value.
+1. The best way to deal with this is to make your APIs nice and simple, and their implementation straightforward. If you try to do weird tricks to make callbacks on your Deferreds cooperative, then you might break an otherwise reasonable strategy on the part of the client code to be well-behaved. For the code that is itself trying to be well-behaved, there are things like twisted.internet.task.cooperate.

On Mon, 2011-06-20 at 11:09 -0400, Glyph Lefkowitz wrote:
On Jun 20, 2011, at 7:39 AM, Phil Mayers wrote:
On 20/06/11 11:29, Orestis Markou wrote:
It might be that there's no way to guarantee something like that, so it might be that everyone simply has to be careful about this.
I don't think you can stop callers of a function doing silly things with the return value.
+1. The best way to deal with this is to make your APIs nice and simple, and their implementation straightforward. If you try to do weird tricks to make callbacks on your Deferreds cooperative, then you might break an otherwise reasonable strategy on the part of the client code to be well-behaved.
For the code that is itself trying to be well-behaved, there are things like twisted.internet.task.cooperate.
There might not be any way to stop consumers of your Deferreds executing blocking operations, but when trying to track down which pieces of code are the culprits, I found exarkun's BigTimesliceTimer very useful:

http://twistedmatrix.com/trac/browser/sandbox/exarkun/btt.py

It uses a reactor-independent timing mechanism (the setitimer syscall on Linux; contact me if you want the FreeBSD version) to set an alarm which the reactor then has to "race" to unset, or else the itimer handler gets pre-emptively executed. When the alarm fires, it prints a traceback of the current execution point in the "client" code. By setting the alarm frequency sufficiently low and by watching for sufficiently long, you can "sample" the code which is running while the reactor is blocked. Code paths which show up frequently are therefore *more likely* to be the culprits.

One day it would be nice to turn this into some kind of statistical tool for highlighting which code paths are the "hot-spots" in your code, so that you can optimise the "blockiest" bits first. Premature optimisation, etc.

Hope this helps; it helped me :-)

--
Best Regards, Luke Marsden
CTO, Hybrid Logic Ltd.
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting
Phone: +447791750420
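For readers without access to the sandbox file, the core trick can be sketched in a few lines with the stdlib signal module. This is my own illustration of the "race" described above, not the actual btt.py code; the BlockDetector name and its methods are invented for this sketch, and it is Unix-only (setitimer/SIGALRM):

```python
import signal
import traceback

class BlockDetector:
    """Sample the stack whenever the process fails to re-arm the
    timer within `threshold` seconds -- i.e. whenever the event loop
    loses the race against the alarm because something is blocking it.
    """
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.samples = []  # formatted stacks captured while blocked

    def _fired(self, signum, frame):
        # The loop lost the race: record where execution was stuck,
        # then re-arm so we keep sampling while it stays blocked.
        self.samples.append("".join(traceback.format_stack(frame)))
        signal.setitimer(signal.ITIMER_REAL, self.threshold)

    def start(self):
        signal.signal(signal.SIGALRM, self._fired)
        self.rearm()

    def rearm(self):
        # Call this periodically from the event loop (e.g. a
        # LoopingCall); while the loop is healthy the alarm never fires.
        signal.setitimer(signal.ITIMER_REAL, self.threshold)
```

In a real Twisted process you would call rearm() from a LoopingCall with an interval shorter than the threshold, so tracebacks only appear when the reactor genuinely falls behind.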

On Mon, Jun 20, 2011 at 11:23 AM, Luke Marsden < luke-lists@hybrid-logic.co.uk> wrote:
On Mon, 2011-06-20 at 11:09 -0400, Glyph Lefkowitz wrote:
On Jun 20, 2011, at 7:39 AM, Phil Mayers wrote:
On 20/06/11 11:29, Orestis Markou wrote:
It might be that there's no way to guarantee something like that, so it might be that everyone simply has to be careful about this.
I don't think you can stop callers of a function doing silly things with the return value.
+1. The best way to deal with this is to make your APIs nice and simple, and their implementation straightforward. If you try to do weird tricks to make callbacks on your Deferreds cooperative, then you might break an otherwise reasonable strategy on the part of the client code to be well-behaved.
For the code that is itself trying to be well-behaved, there are things like twisted.internet.task.cooperate.
There might not be any way to stop consumers of your Deferreds executing blocking operations, but when trying to track down which pieces of code are the culprits, I found exarkun's BigTimesliceTimer very useful:
http://twistedmatrix.com/trac/browser/sandbox/exarkun/btt.py
On that note, I made an updated version of the btt: http://pastebin.com/uBbSuDr6 I know I'm digging up ancient history here, but I thought it was relevant.
It uses a reactor-independent timing mechanism (the setitimer syscall on Linux; contact me if you want the FreeBSD version) to set an alarm which the reactor then has to "race" to unset, or else the itimer handler gets pre-emptively executed. When the alarm fires, it prints a traceback of the current execution point in the "client" code. By setting the alarm frequency sufficiently low and by watching for sufficiently long, you can "sample" the code which is running while the reactor is blocked. Code paths which show up frequently are therefore *more likely* to be the culprits.
One day it would be nice to turn this into some kind of statistical tool for highlighting which code paths are the "hot-spots" in your code, so that you can optimise the "blockiest" bits first. Premature optimisation, etc.
Hope this helps; it helped me :-)
-- Best Regards, Luke Marsden CTO, Hybrid Logic Ltd.
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting
Phone: +447791750420
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
participants (5)
- Croepha
- Glyph Lefkowitz
- Luke Marsden
- Orestis Markou
- Phil Mayers