Mailman 3 [Twisted-Python] Reentrant reactor iteration - Twisted

newer
[Twisted-Python] URL not Found Re:...

[Twisted-Python] Reentrant reactor iteration

older
[Twisted-Python] StandardIO flush'n

Marcel Keller

Feb. 27, 2009

8:26 a.m.

Hi, I am working on the VIFF project (viff.dk) which uses Twisted. I found out that our code is sometimes inefficient because we are generating many deferreds (maybe about 10000) in a callback. While doing that, no network communication is performed. Therefore, I investigated the possibility of adding a function to the reactor which is called every iteration and from which the iteration could be called safely. Then, we could generate all deferreds in that function and activate the reactor from to time. See the attached patch for details. Some of our code runs twice as fast when using that hack. Are there any chances something similar might be included in Twisted? Or does anyone have a better solution for the described problem? Furthermore, I have a question, which is probably related. The documentation of IReactorCore says about the iterate() function: "The reactor must have been started (via the run() method) prior to any invocations of this method. It must also be stopped manually after the last call to this method (via the stop() method). This method is not re-entrant: you must not call it recursively; in particular, you must not call it while the reactor is running." This looks to me as if the reactor needs to be running and not running at the same time so that iterate() can be called. Is there an error in my reasoning? Best regards, Marcel Keller --- /usr/lib/python2.5/site-packages/twisted/internet/base.py 2008-07-29 22:13:54.000000000 +0200 +++ internet/base.py 2009-02-20 12:27:42.000000000 +0100 @@ -1127,17 +1127,32 @@ self.startRunning(installSignalHandlers=installSignalHandlers) self.mainLoop() + + def setLoopCall(self, f, *args, **kw): + self.loopCall = (f, args, kw) + + + def myIteration(self, t): + # Advance simulation time in delayed event + # processors. + self.runUntilCurrent() + + if (t is None): + t2 = self.timeout() + t = self.running and t2 + + self.doIteration(t) + + if ("loopCall" in dir(self)): + f, args, kw = self.loopCall + f(*args, **kw) + def mainLoop(self): while self._started: try: while self._started: - # Advance simulation time in delayed event - # processors. - self.runUntilCurrent() - t2 = self.timeout() - t = self.running and t2 - self.doIteration(t) + self.myIteration(None) except: log.msg("Unexpected error in main loop.") log.err()

Show replies by date

Jean-Paul Calderone

February 2009

10:27 a.m.

On Fri, 27 Feb 2009 15:26:43 +0100, Marcel Keller <mkeller@cs.au.dk> wrote:

...

So you're doing a ton of work all at once now and you want to split up that ton of work into smaller pieces and do it a little at a time? If that's the case, then you don't need to modify the reactor, you just need to split up the work your code is going. There are a lot of techniques for doing this. coiterate and inlineCallbacks are two solutions which are closest to "cookie cutter" (ie, you have the least flexibility in deciding how to use them).

...

Give coiterate or inlineCallbacks a try. If you need help applying either of these to your problem, please ask. :) I can't make any specific suggestions now because I can only guess at how you're using the reactor modification you attached. You have a very long, steep, uphill battle to convince me that adding support for re-entrant iteration is a good idea.

...

It's very subtle and intentionally written to sound impossible to use (maybe that is a stupid idea, but that was my intent when I wrote it ;). What you're missing is that there is a way to make reactor.run() return without stopping the reactor. Jean-Paul

Martin Geisler

March 2009

12:38 p.m.

New subject: [Twisted-Python] Re: Reentrant reactor iteration

Jean-Paul Calderone <exarkun@divmod.com> writes: Hi, Thanks for the answer. I'm also with the VIFF project and I would like to explain a bit more about the background for the hack by Marcel.

...

On Fri, 27 Feb 2009 15:26:43 +0100, Marcel Keller <mkeller@cs.au.dk> wrote:

...
Hi,

I am working on the VIFF project (viff.dk) which uses Twisted. I found out that our code is sometimes inefficient because we are generating many deferreds (maybe about 10000) in a callback. While doing that, no network communication is performed. Therefore, I investigated the possibility of adding a function to the reactor which is called every iteration and from which the iteration could be called safely. Then, we could generate all deferreds in that function and activate the reactor from to time. See the attached patch for details.

So you're doing a ton of work all at once now and you want to split up that ton of work into smaller pieces and do it a little at a time?

Sort of. We have overloaded the arithmetic operators in our library, so people will expect to be able to write # xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += x * y Here the multiplications involves network traffic and return Deferreds. We would like the network traffic for the first multiplication to begin immediately, *before* the remaining multiplications are done. Doing all the multiplications up front makes the code block the reactor and uses an awful lot of RAM. If we let each multiplication trigger the sending of its data immediately, and if we process incoming messages along the way, memory can be reclaimed for the earlier multiplications and the above calculation should run in constant memory. Sending and processing data in a more even flow makes our benchmark results better and more consistent from one run to the next.

...

If that's the case, then you don't need to modify the reactor, you just need to split up the work your code is going. There are a lot of techniques for doing this. coiterate and inlineCallbacks are two solutions which are closest to "cookie cutter" (ie, you have the least flexibility in deciding how to use them).

Right -- we might be able to use these techniques. I haven't looked at coiterate yet. With inlineCallbacks I guess the code would look something like this: # xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += (yield x * y) which is not so bad, expect that it destroys the nice illusion that x and y behave like normal integers even though the multiplication involves network traffic.

...

You have a very long, steep, uphill battle to convince me that adding support for re-entrant iteration is a good idea.

One problem I can think of is the memory usage associated with a very deep recursion. Since there is no such thing as tail call optimization in Python, each level in the recursion will hold onto any local variables even though they might not be needed any more. Are there other general problems with having a re-entrant reactor? -- Martin Geisler VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Jean-Paul Calderone

2:04 p.m.

New subject: [Twisted-Python] Re: Reentrant reactor iteration

On Sat, 07 Mar 2009 19:38:46 +0100, Martin Geisler <mg@daimi.au.dk> wrote:

...

Jean-Paul Calderone <exarkun@divmod.com> writes:

Hi,

Thanks for the answer. I'm also with the VIFF project and I would like to explain a bit more about the background for the hack by Marcel.

...
[snip]

So you're doing a ton of work all at once now and you want to split up that ton of work into smaller pieces and do it a little at a time?

Sort of. We have overloaded the arithmetic operators in our library, so people will expect to be able to write

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += x * y

Here the multiplications involves network traffic and return Deferreds. We would like the network traffic for the first multiplication to begin immediately, *before* the remaining multiplications are done.

Doing all the multiplications up front makes the code block the reactor and uses an awful lot of RAM. If we let each multiplication trigger the sending of its data immediately, and if we process incoming messages along the way, memory can be reclaimed for the earlier multiplications and the above calculation should run in constant memory.

Hm. I would bet the constant would be pretty high, though. Things will only balance out once the network is keeping up with the local for loop. Actually, I'm not sure why it would be constant at all. Won't the local for loop always run much faster than any network operations can happen? In this case, memory usage will go towards K(local) * set size - K(remote) * set size, or just O(set size); that is to say, linear.

...

Sending and processing data in a more even flow makes our benchmark results better and more consistent from one run to the next.

It seems like what you'll really benefit from most is pipelining with a maximum pipeline depth that's not too big (so as to conserve memory) but not too small (so as to avoid wasting time on network round trip time).

...

...
If that's the case, then you don't need to modify the reactor, you just need to split up the work your code is going. There are a lot of techniques for doing this. coiterate and inlineCallbacks are two solutions which are closest to "cookie cutter" (ie, you have the least flexibility in deciding how to use them).

Right -- we might be able to use these techniques. I haven't looked at coiterate yet. With inlineCallbacks I guess the code would look something like this:

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += (yield x * y)

which is not so bad, expect that it destroys the nice illusion that x and y behave like normal integers even though the multiplication involves network traffic.

Perhaps with coiterate this might look something like... def op(xs, ys): dot_product = 0 for (x, y) in zip(xs, ys): dot_product += x * y yield yield dot_product d = coiterate(op(xs, ys)) This is flawed, but maybe it can be fixed. What's good about it is that it doesn't try to drive the reactor re-entrantly. :) It also avoids the yield in the += and * operations, which somewhat preserves your illusion of normalcy (I'm not saying I like that illusion, but I'll leave that aside for now). What's bad about it is that it still lets the local loop run arbitrarily far ahead of the results from the network, giving you unbounded memory usage. To fix that, it should yield a Deferred every once in a while. The reason I leave it flawed here is that it's a little tricky to figure out which Deferred to yield. What would be great would be to yield the Deferred for an operation which preceeds the most recently executed operation by some number of operations. The number of operations by which it preceeds the most recent will be your pipeline depth (roughly). The effect this has on coiterate is to cause your local loop to cease to execute until that Deferred has fired. By making it a Deferred for an operation some time /before/ the most recent operation, you keep the network busy and avoid wasting time on round trip times. Once the Deferred fires, your loop gets run a few more times which has the effect of putting more work into the pipeline - until you've got enough extra work to keep things from stalling again, and then coiterate puts you to sleep for a while again.

...

...
You have a very long, steep, uphill battle to convince me that adding support for re-entrant iteration is a good idea.

One problem I can think of is the memory usage associated with a very deep recursion. Since there is no such thing as tail call optimization in Python, each level in the recursion will hold onto any local variables even though they might not be needed any more.

Are there other general problems with having a re-entrant reactor?

One general problem is simply that the reactor is not written with this in mind. So with the current implementation, even including the patch Marcel contributed, it doesn't work, period. When I say "doesn't work", I mean that there are cases which will simply result in incorrect behavior, though there may be other cases where everything does work out as you expect. Going along with this, it's quite a bit harder to test that things work when re-entrancy is allowed or expected than if it isn't, so even if all of the current reactor implementations were made re-entrant, there would likely be a higher rate of defects related to this in the future, simply because it's harder. This isn't limited solely to the reactor, either. A re-entrant reactor almost certainly means that application code will be invoked re-entrantly. This is actually the case already, unfortunately, and it is a bit of an open question as to what should be done about it. A very common bug I find (and write! :() in Twisted applications is failure to handle various forms of re-entrancy correctly. This is an existing issue with Twisted, not related to this proposed change, but making this change would only make the problem worse and essentially destroy and hope for ever making it better. Actually, that general problem is so general that I can't really think of any others to discuss, so I'll leave it at that for now. :) If you want, I can probably share some specific examples of how re-entrancy leads to surprising bugs in unexpected places (probably from real programs, too :/). Jean-Paul

Martin Geisler

3:58 p.m.

New subject: [Twisted-Python] Re: Reentrant reactor iteration

Jean-Paul Calderone <exarkun@divmod.com> writes:

...

On Sat, 07 Mar 2009 19:38:46 +0100, Martin Geisler <mg@daimi.au.dk> wrote:

...
We have overloaded the arithmetic operators in our library, so people will expect to be able to write

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += x * y

Here the multiplications involves network traffic and return Deferreds. We would like the network traffic for the first multiplication to begin immediately, *before* the remaining multiplications are done.

Doing all the multiplications up front makes the code block the reactor and uses an awful lot of RAM. If we let each multiplication trigger the sending of its data immediately, and if we process incoming messages along the way, memory can be reclaimed for the earlier multiplications and the above calculation should run in constant memory.

Hm. I would bet the constant would be pretty high, though. Things will only balance out once the network is keeping up with the local for loop. Actually, I'm not sure why it would be constant at all. Won't the local for loop always run much faster than any network operations can happen?

Yeah, you're right. Our results are very biased by only testing on a local area network so far... Three parties execute the same code, and each multiplication is done when we have heard from the two others. With a fast network we can move data around as fast as we can do the local operations needed (some additions and multiplications of numbers with 65 bits or more).

...

In this case, memory usage will go towards K(local) * set size - K(remote) * set size, or just O(set size); that is to say, linear.

Right, and that hints that the real goal is to throttle the multiplications (like you point out below).

...

...
Sending and processing data in a more even flow makes our benchmark results better and more consistent from one run to the next.

It seems like what you'll really benefit from most is pipelining with a maximum pipeline depth that's not too big (so as to conserve memory) but not too small (so as to avoid wasting time on network round trip time).

Yes, that is exactly what we want!

...

...
Right -- we might be able to use these techniques. I haven't looked at coiterate yet. With inlineCallbacks I guess the code would look something like this:

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += (yield x * y)

which is not so bad, expect that it destroys the nice illusion that x and y behave like normal integers even though the multiplication involves network traffic.

Perhaps with coiterate this might look something like...

def op(xs, ys): dot_product = 0 for (x, y) in zip(xs, ys): dot_product += x * y yield

yield dot_product

d = coiterate(op(xs, ys))

Cool, thanks for the example! That really helps in understanding the alternatives...

...

This is flawed, but maybe it can be fixed. What's good about it is that it doesn't try to drive the reactor re-entrantly. :) It also avoids the yield in the += and * operations, which somewhat preserves your illusion of normalcy (I'm not saying I like that illusion, but I'll leave that aside for now).

Hehe :-) I'm also not sure what to think of the illusion, but it just came very naturally with Twisted and the Deferreds. On the other hand it breaks down all the time since one still needs to directly add callbacks to do certain things...

...

What's bad about it is that it still lets the local loop run arbitrarily far ahead of the results from the network, giving you unbounded memory usage. To fix that, it should yield a Deferred every once in a while. The reason I leave it flawed here is that it's a little tricky to figure out which Deferred to yield. What would be great would be to yield the Deferred for an operation which preceeds the most recently executed operation by some number of operations. The number of operations by which it preceeds the most recent will be your pipeline depth (roughly). The effect this has on coiterate is to cause your local loop to cease to execute until that Deferred has fired. By making it a Deferred for an operation some time /before/ the most recent operation, you keep the network busy and avoid wasting time on round trip times. Once the Deferred fires, your loop gets run a few more times which has the effect of putting more work into the pipeline - until you've got enough extra work to keep things from stalling again, and then coiterate puts you to sleep for a while again.

Thanks a lot for the info! I think I'll forward it to the VIFF mailing list and discuss a bit further there.

...

...
Are there other general problems with having a re-entrant reactor?

One general problem is simply that the reactor is not written with this in mind. So with the current implementation, even including the patch Marcel contributed, it doesn't work, period. [...]

Hmm, lots of good reasons... :-/ Marcel also mentioned that his hack made some unit tests fail, but I don't know yet if that was Twisted or VIFF tests. Not that it matters much -- this change should of course be invisible to existing code. -- Martin Geisler VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Jean-Paul Calderone

February 2009

10:27 a.m.

On Fri, 27 Feb 2009 15:26:43 +0100, Marcel Keller <mkeller@cs.au.dk> wrote:

...

Martin Geisler

March 2009

12:38 p.m.

New subject: [Twisted-Python] Re: Reentrant reactor iteration

Jean-Paul Calderone <exarkun@divmod.com> writes: Hi, Thanks for the answer. I'm also with the VIFF project and I would like to explain a bit more about the background for the hack by Marcel.

...

On Fri, 27 Feb 2009 15:26:43 +0100, Marcel Keller <mkeller@cs.au.dk> wrote:

...
Hi,

I am working on the VIFF project (viff.dk) which uses Twisted. I found out that our code is sometimes inefficient because we are generating many deferreds (maybe about 10000) in a callback. While doing that, no network communication is performed. Therefore, I investigated the possibility of adding a function to the reactor which is called every iteration and from which the iteration could be called safely. Then, we could generate all deferreds in that function and activate the reactor from to time. See the attached patch for details.

So you're doing a ton of work all at once now and you want to split up that ton of work into smaller pieces and do it a little at a time?

...

If that's the case, then you don't need to modify the reactor, you just need to split up the work your code is going. There are a lot of techniques for doing this. coiterate and inlineCallbacks are two solutions which are closest to "cookie cutter" (ie, you have the least flexibility in deciding how to use them).

...

You have a very long, steep, uphill battle to convince me that adding support for re-entrant iteration is a good idea.

Jean-Paul Calderone

2:04 p.m.

New subject: [Twisted-Python] Re: Reentrant reactor iteration

On Sat, 07 Mar 2009 19:38:46 +0100, Martin Geisler <mg@daimi.au.dk> wrote:

...

Jean-Paul Calderone <exarkun@divmod.com> writes:

Hi,

Thanks for the answer. I'm also with the VIFF project and I would like to explain a bit more about the background for the hack by Marcel.

...
[snip]

So you're doing a ton of work all at once now and you want to split up that ton of work into smaller pieces and do it a little at a time?

Sort of. We have overloaded the arithmetic operators in our library, so people will expect to be able to write

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += x * y

Here the multiplications involves network traffic and return Deferreds. We would like the network traffic for the first multiplication to begin immediately, *before* the remaining multiplications are done.

Doing all the multiplications up front makes the code block the reactor and uses an awful lot of RAM. If we let each multiplication trigger the sending of its data immediately, and if we process incoming messages along the way, memory can be reclaimed for the earlier multiplications and the above calculation should run in constant memory.

...

Sending and processing data in a more even flow makes our benchmark results better and more consistent from one run to the next.

...

...
If that's the case, then you don't need to modify the reactor, you just need to split up the work your code is going. There are a lot of techniques for doing this. coiterate and inlineCallbacks are two solutions which are closest to "cookie cutter" (ie, you have the least flexibility in deciding how to use them).

Right -- we might be able to use these techniques. I haven't looked at coiterate yet. With inlineCallbacks I guess the code would look something like this:

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += (yield x * y)

which is not so bad, expect that it destroys the nice illusion that x and y behave like normal integers even though the multiplication involves network traffic.

...

...
You have a very long, steep, uphill battle to convince me that adding support for re-entrant iteration is a good idea.

One problem I can think of is the memory usage associated with a very deep recursion. Since there is no such thing as tail call optimization in Python, each level in the recursion will hold onto any local variables even though they might not be needed any more.

Are there other general problems with having a re-entrant reactor?

Martin Geisler

3:58 p.m.

New subject: [Twisted-Python] Re: Reentrant reactor iteration

Jean-Paul Calderone <exarkun@divmod.com> writes:

...

On Sat, 07 Mar 2009 19:38:46 +0100, Martin Geisler <mg@daimi.au.dk> wrote:

...
We have overloaded the arithmetic operators in our library, so people will expect to be able to write

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += x * y

Here the multiplications involves network traffic and return Deferreds. We would like the network traffic for the first multiplication to begin immediately, *before* the remaining multiplications are done.

Doing all the multiplications up front makes the code block the reactor and uses an awful lot of RAM. If we let each multiplication trigger the sending of its data immediately, and if we process incoming messages along the way, memory can be reclaimed for the earlier multiplications and the above calculation should run in constant memory.

Hm. I would bet the constant would be pretty high, though. Things will only balance out once the network is keeping up with the local for loop. Actually, I'm not sure why it would be constant at all. Won't the local for loop always run much faster than any network operations can happen?

...

In this case, memory usage will go towards K(local) * set size - K(remote) * set size, or just O(set size); that is to say, linear.

Right, and that hints that the real goal is to throttle the multiplications (like you point out below).

...

...
Sending and processing data in a more even flow makes our benchmark results better and more consistent from one run to the next.

It seems like what you'll really benefit from most is pipelining with a maximum pipeline depth that's not too big (so as to conserve memory) but not too small (so as to avoid wasting time on network round trip time).

Yes, that is exactly what we want!

...

...
Right -- we might be able to use these techniques. I haven't looked at coiterate yet. With inlineCallbacks I guess the code would look something like this:

# xs and ys are big lists of our objects dot_product for (x, y) in zip(xs, ys): dot_product += (yield x * y)

which is not so bad, expect that it destroys the nice illusion that x and y behave like normal integers even though the multiplication involves network traffic.

Perhaps with coiterate this might look something like...

def op(xs, ys): dot_product = 0 for (x, y) in zip(xs, ys): dot_product += x * y yield

yield dot_product

d = coiterate(op(xs, ys))

Cool, thanks for the example! That really helps in understanding the alternatives...

...

This is flawed, but maybe it can be fixed. What's good about it is that it doesn't try to drive the reactor re-entrantly. :) It also avoids the yield in the += and * operations, which somewhat preserves your illusion of normalcy (I'm not saying I like that illusion, but I'll leave that aside for now).

...

What's bad about it is that it still lets the local loop run arbitrarily far ahead of the results from the network, giving you unbounded memory usage. To fix that, it should yield a Deferred every once in a while. The reason I leave it flawed here is that it's a little tricky to figure out which Deferred to yield. What would be great would be to yield the Deferred for an operation which preceeds the most recently executed operation by some number of operations. The number of operations by which it preceeds the most recent will be your pipeline depth (roughly). The effect this has on coiterate is to cause your local loop to cease to execute until that Deferred has fired. By making it a Deferred for an operation some time /before/ the most recent operation, you keep the network busy and avoid wasting time on round trip times. Once the Deferred fires, your loop gets run a few more times which has the effect of putting more work into the pipeline - until you've got enough extra work to keep things from stalling again, and then coiterate puts you to sleep for a while again.

Thanks a lot for the info! I think I'll forward it to the VIFF mailing list and discuss a bit further there.

...

...
Are there other general problems with having a re-entrant reactor?

One general problem is simply that the reactor is not written with this in mind. So with the current implementation, even including the patch Marcel contributed, it doesn't work, period. [...]

5824

Age (days ago)

5832

Last active (days ago)

List overview

Download

4 comments

3 participants

participants (3)

Jean-Paul Calderone
Marcel Keller
Martin Geisler

[Twisted-Python] Reentrant reactor iteration

Marcel Keller

Jean-Paul Calderone

Martin Geisler

Jean-Paul Calderone

Martin Geisler

Jean-Paul Calderone

Martin Geisler

Jean-Paul Calderone

Martin Geisler

tags

participants (3)