Re: [Python-Dev] Improved thread switching

On Wed, Mar 19, 2008 at 10:42 AM, Stefan Ring s.r@visotech.at wrote:
On Mar 19, 2008 05:24 PM, Adam Olsen rhamph@gmail.com wrote:
On Wed, Mar 19, 2008 at 10:09 AM, Stefan Ring s.r@visotech.at wrote:
Adam Olsen <rhamph <at> gmail.com> writes:
Can you try with a call to sched_yield(), rather than nanosleep()? It should have the same benefit but without as much performance hit.
If it works, but is still too much hit, try tuning the checkinterval to see if you can find an acceptable throughput/responsiveness balance.
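For reference, the check interval being discussed is CPython's knob for how often a thread holding the GIL offers to release it. A minimal sketch of tuning it (sys.setcheckinterval() was the 2.x-era API; Python 3.2+ replaced it with a time-based switch interval, which is what this sketch uses):

```python
import sys

# Python 3.2+: the GIL holder offers to drop the lock every `interval`
# seconds. Smaller values favor responsiveness (snappier thread switching),
# larger values favor throughput (fewer context switches).
sys.setswitchinterval(0.001)        # switch threads more eagerly
print(sys.getswitchinterval())      # 0.001

# The 2.x-era equivalent counted bytecode instructions instead:
#   sys.setcheckinterval(100)       # the default at the time
```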
I tried that, and it had no effect whatsoever. I suppose it would have an effect on a single CPU or an otherwise heavily loaded SMP system, but that's not the scenario we care about.
So you've got a lightly loaded SMP system? Multiple threads all blocked on the GIL, multiple CPUs to run them, but only one CPU is active? In that case I can imagine how sched_yield() might finish before the other CPUs wake up a thread.
A FIFO scheduler would be the right thing here, but it's only a short term solution. Care for a long term solution? ;)
I've already seen that, but it would not help us in our current situation. The performance penalty really is too heavy. Our system is slow enough already ;). And it would be very difficult, bordering on impossible, to parallelize. Plus, I can imagine that all extension modules (and our own code) would have to be adapted.
The FIFO scheduler is perfect for us because the load is typically quite low. It's mostly at those times when someone runs a lengthy calculation that all other users suffer greatly increased response times.
So you want responsiveness when idle but throughput when busy?
Are those calculations primarily python code, or does a C library do the grunt work? If it's a C library you shouldn't be affected by safethread's increased overhead.

Adam Olsen <rhamph <at> gmail.com> writes:
So you want responsiveness when idle but throughput when busy?
Exactly ;)
Are those calculations primarily python code, or does a C library do the grunt work? If it's a C library you shouldn't be affected by safethread's increased overhead.
It's Python code all the way. Frankly, it's a huge mess, but it would be very, very hard to come up with a scalable solution that would allow us to optimize certain hotspots and redo them in C or C++. There isn't even anything left to optimize in particular, because all the low-hanging fruit has already been taken care of. So it's just ~30 kloc of Python code over which the total time spent is quite uniformly distributed :(.

On Wed, Mar 19, 2008 at 11:25 AM, Stefan Ring s.r@visotech.at wrote:
I see. Well, at this point I think the most you can do is file a bug so the problem doesn't get forgotten. If nothing else, if my safethread stuff goes in it'll very likely include a --with-gil option, so I may put together a FIFO scheduler.

Hmmm, sorry if I'm missing something obvious, but, if the occasional background computations are sufficiently heavy -- why not fork, do said computations in the child process, and return the results via any of the various available IPC approaches? I've recently (at Pycon, mostly) been playing devil's advocate (i.e., being PRO-threads, for once) on the subject of utilizing multiple cores effectively -- but the classic approach (using multiple _processes_ instead) actually works quite well in many cases, and this application server would appear to be one. (The pyProcessing package appears to offer an easy way to migrate threaded code to multiple-process approaches, although I've only played around with it, not [yet] used it for production code.)
Alex
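Alex's fork-and-return-via-IPC suggestion can be sketched in a few lines on a POSIX system. run_in_child below is an illustrative helper name, and pickle over a pipe stands in for whichever IPC mechanism you prefer:

```python
import os
import pickle

def run_in_child(func, *args):
    """Fork, run func(*args) in the child process, and read the
    pickled result back through a pipe. A minimal POSIX-only sketch."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute, ship the result to the parent, and exit.
        os.close(read_fd)
        result = func(*args)
        with os.fdopen(write_fd, "wb") as f:
            pickle.dump(result, f)
        os._exit(0)
    # Parent: close the unused write end, read the result, reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as f:
        result = pickle.load(f)
    os.waitpid(pid, 0)
    return result

print(run_in_child(sum, range(10)))  # 45
```

The parent blocks here for simplicity; in an application server you would poll the pipe or hand it to your event loop so other requests keep being served.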
Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/aleaxit%40gmail.com

FYI: I shot an email to stdlib-sig saying that I am proposing the inclusion of the pyProcessing module into the stdlib. Comments and thoughts regarding that would be welcome. I've got a rough outline of the PEP, but I need to spend more time with the code examples.
-jesse

On Thu, Mar 20, 2008 at 09:58:46AM -0400, Jesse Noller wrote:
FYI: I shot an email to stdlib-sig about the fact I am proposing the inclusion of the pyProcessing module into the stdlib. Comments and thoughts regarding that would be welcome. I've got a rough outline of the PEP, but I need to spend more time with the code examples.
Since we officially encourage people to spawn processes instead of threads, I think that this would be a great idea. The processing module has a similar API to threading. It's easy to use, works well, and most importantly, gives us some place to point people to when they complain about the GIL.
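The threading-like API Andrew mentions looks roughly like this. The sketch below uses multiprocessing, the name under which pyProcessing was ultimately adopted into the stdlib; the shape mirrors threading.Thread almost exactly:

```python
from multiprocessing import Process, Queue

def worker(q):
    # CPU-bound work runs in a separate process, so it is not
    # serialized by the parent interpreter's GIL.
    q.put(sum(range(1000)))

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))  # same shape as threading.Thread
    p.start()
    print(q.get())  # 499500
    p.join()
```

Porting worker-thread code is often little more than swapping threading.Thread for Process and Queue.Queue for the multiprocessing Queue.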

2008/3/20, Andrew McNabb amcnabb@mcnabbs.org:
Since we officially encourage people to spawn processes instead of threads, I think that this would be a great idea. The processing module has a similar API to threading. It's easy to use, works well, and most importantly, gives us some place to point people to when they complain about the GIL.
I'm +1 to include the processing module in the stdlib.
Just to avoid confusion with these similarly named libraries: I mean this [1] module, the one that emulates the semantics of the threading module.
Does anybody have strong reasons for this module not to get included?
Regards,
[1] http://pypi.python.org/pypi/processing

Even I, as a strong advocate for its inclusion, think I should finish the PEP and outline all of the questions/issues that may come out of it.

Facundo Batista wrote:
Does anybody have strong reasons for this module not to get included?
Other than the pre-release version number and the fact that doing such a thing would require R. Oudkerk to actually make the offer rather than anyone else? There would also need to be the usual thing of at least a couple of people stepping up and being willing to maintain it.
I also wouldn't mind seeing some performance figures for an application that was limited to making good use of only one CPU when run with the threading module, but was able to exploit multiple processors to obtain a speed improvement when run with the processing module.
That said, I'm actually +1 on the general idea, since I always write my threaded Python code using worker threads that I communicate with via Queue objects. Pyprocessing would be a great way for me to scale to multiple processors if I was running CPU intensive tasks rather than potentially long-running hardware IO operations (I've been meaning to check it out for a long time, but have never actually needed to for either work or any home projects).
Cheers, Nick.
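The worker-threads-plus-Queue pattern Nick describes might look like this minimal sketch (names are illustrative, and it uses the Python 3 spellings of the module names):

```python
import queue
import threading

def worker(tasks, results):
    # Pull work items off the task queue until a None sentinel arrives.
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item * item)

tasks, results = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(tasks, results))
t.start()
for n in (1, 2, 3):
    tasks.put(n)
tasks.put(None)  # tell the worker to shut down
t.join()
print(sorted(results.get() for _ in range(3)))  # [1, 4, 9]
```

Because all cross-thread communication goes through the queues, this structure maps naturally onto processing's Process and Queue for CPU-bound work.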
participants (7)
- Adam Olsen
- Alex Martelli
- Andrew McNabb
- Facundo Batista
- Jesse Noller
- Nick Coghlan
- Stefan Ring