[Twisted-Python] CPU intensive threads
Is there anything Twisted can do about CPU-intensive threads? When I run a CPU-intensive method via threads.deferToThread it takes all the CPU and renders the Twisted process unresponsive; Perspective Broker, for example, will not accept any connections until the CPU-intensive methods finish. Is there a way to raise the priority of the main part of Twisted so that it can run CPU-intensive threads and still service connections? Thanks, Nate
Nathaniel Haggard <natester@gmail.com> writes:
Is there a way to set the priority of the main part of twisted so that it can run CPU intensive threads and still service connections.
Although you could try using OS-dependent methods for boosting the priority of the main thread (or lowering the background threads), if you're truly CPU-bound in pure Python code it probably won't help much: even if the main thread gets preferential control, the main Twisted loop calls out to I/O operations so often that it would just release the CPU again almost immediately.

While Python does use native threads, because of its GIL (global interpreter lock) a thread that is purely CPU-bound in Python code generally only lets other threads run at a periodic byte-code interval. (You can find many more discussions about the GIL and its implications in the comp.lang.python archives.)

As mentioned in another response, the simplest way to force context switches in a tight CPU thread is to perform some operation that releases the GIL (a simple one is time.sleep(0)), so if your routine has a tight loop or some repetitive code path, putting something like that in there might help quite a bit.

If not, you might try fiddling with sys.setcheckinterval(), which sets the number of byte-codes Python executes before offering a potential thread switch (explicitly releasing and re-acquiring the GIL). It was bumped higher in recent Python releases - I think it's 100 now - so you could try dropping it down to 10 or so. Doing so will probably make your process burn slightly more CPU time overall, but it should keep individual threads more responsive in the presence of CPU-bound threads.

If that still doesn't give you enough resilience, another option would be to offload the processing to a separate process, maintaining a simple link to that process and sending requests over to it for processing. This supposes that transmitting requests and results won't be a terribly high burden in time or space.
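The in-process tuning ideas above could be sketched like this (a minimal, hypothetical example: `crunch` and the chunk size are invented, and `sys.setcheckinterval` was removed in Python 3.9, hence the guard):

```python
import sys
import threading
import time

# Lower the check interval so the interpreter offers a thread switch
# more often (the call was removed in Python 3.9, hence the guard).
if hasattr(sys, "setcheckinterval"):
    sys.setcheckinterval(10)

results = []

def crunch(n):
    """CPU-bound loop that periodically releases the GIL."""
    total = 0
    for i in range(n):
        total += i * i
        if i % 10000 == 0:
            time.sleep(0)  # release the GIL so other threads may run
    results.append(total)

worker = threading.Thread(target=crunch, args=(100000,))
worker.start()
worker.join()  # a real reactor thread would keep servicing I/O instead
```
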
Since you're already using PB for other stuff, having an internal (same-host) PB server with appropriate processing objects would probably work really well. For your own interprocess communication you could also opt for a simpler RPC protocol that just pickles the information, to minimize the performance/transport impact. Such a setup would also give you (mostly for free) the flexibility to scale the processing out to multiple hosts should the need arise, as well as being friendlier to SMP systems (which Python's GIL also interferes with).

If you can't afford the time to transfer requests and/or results to a separate process over a normal channel, and are on a POSIX platform, you might also investigate POSH (http://poshmodule.sourceforge.net), which implements object sharing in shared memory between processes; that would eliminate the transport overhead but still let you separate the processing into distinct processes. It's an early-stage project that I've only experimented with, but if it fits your needs it might do well. (It comes with a simple producer/consumer example that could probably be used as a starting point.) -- David
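A length-prefixed pickle framing for such a minimal RPC link might look like this (a sketch; the message shape is invented, and in practice the stream would be a socket rather than an in-memory buffer):

```python
import io
import pickle
import struct

def dump_msg(obj):
    """Frame an arbitrary picklable object with a 4-byte length prefix."""
    data = pickle.dumps(obj)
    return struct.pack("!I", len(data)) + data

def load_msg(stream):
    """Read one length-prefixed pickled object from a byte stream."""
    (length,) = struct.unpack("!I", stream.read(4))
    return pickle.loads(stream.read(length))

# Round-trip over an in-memory stream; a real link would be a pipe/socket.
wire = io.BytesIO(dump_msg({"op": "search", "query": "twisted"}))
request = load_msg(wire)
```
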
On 27 Jul 2005 11:05:43 -0400, David Bolen <db3l@fitlinxx.com> wrote:
POSH doesn't eliminate the transport overhead. I've done some basic investigation, and it's incredibly slow. mmap() is probably a better solution in most cases, although I am not convinced multiple processes are called for here.

A solution which hasn't been suggested yet is to drop the native thread and use a cooperative Python thread. With this approach you can schedule it however you like, including /not/ scheduling it when you have other more important tasks to complete. Jp
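A cooperative "thread" of this sort is essentially a generator that hands control back at points of its own choosing, driven incrementally by the event loop. A standalone toy version (the round-robin scheduler here is an invented stand-in for the reactor):

```python
def crunch(n, chunk=1000):
    """CPU-bound work split into chunks; yields control between chunks."""
    total = 0
    for i in range(n):
        total += i * i
        if i % chunk == 0:
            yield None        # hand control back to the scheduler
    yield total               # final yield carries the result

def run_round_robin(tasks):
    """Toy round-robin scheduler standing in for the reactor."""
    results = {}
    while tasks:
        for name, gen in list(tasks.items()):
            try:
                value = next(gen)
                if value is not None:
                    results[name] = value
            except StopIteration:
                del tasks[name]
    return results

results = run_round_robin({"a": crunch(5000), "b": crunch(100)})
```

Because the scheduler decides when to call next(), it can simply stop driving a task while more important work is pending, which is exactly the control a native thread denies you.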
Jp Calderone <exarkun@divmod.com> writes:
Me neither, but it would be a logical way to progress if you couldn't resolve things in the single process. Good to know about POSH though.
Well, but you'd still have the problem of ensuring that it yields back at a reasonable frequency, wouldn't you? So it would be similar to sprinkling time.sleep(0) into a non-cooperative thread, and still subject to cases where it might not be that simple. -- David
These CPU-intensive threads are Python/C extensions generated with Pyrex. Instead of messing with the GIL I will use processes as David suggests. It would be nice to have a process pool implementation that was as easy to use as deferToThread. Evidently Quotient uses Perspective Broker for interprocess communication in quotient.searchup, and that may be a good starting place. Thanks, -Nate On 27 Jul 2005 17:10:42 -0400, David Bolen <db3l@fitlinxx.com> wrote:
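Pending a real process pool, a single "run this callable in a child process" helper is straightforward on POSIX (a sketch; the names are invented, errors are handled only crudely, and a true deferToThread analogue would return a Deferred rather than block):

```python
import os
import pickle
import struct

def _read_exact(fd, n):
    """Read exactly n bytes from a pipe file descriptor."""
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("worker died before replying")
        buf += chunk
    return buf

def call_in_process(func, *args):
    """Run func(*args) in a forked child and return its pickled result."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child: compute, reply, exit
        os.close(r)
        try:
            payload = pickle.dumps(("ok", func(*args)))
        except Exception as exc:
            payload = pickle.dumps(("err", repr(exc)))
        os.write(w, struct.pack("!I", len(payload)) + payload)
        os._exit(0)
    os.close(w)                        # parent: wait for the reply
    (size,) = struct.unpack("!I", _read_exact(r, 4))
    status, value = pickle.loads(_read_exact(r, size))
    os.close(r)
    os.waitpid(pid, 0)
    if status == "err":
        raise RuntimeError(value)
    return value

def heavy(n):
    return sum(i * i for i in range(n))

result = call_in_process(heavy, 1000)
```
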
participants (5)
- David Bolen
- Itamar Shtull-Trauring
- Jp Calderone
- Nathaniel Haggard
- William Waites