Parallel test execution on buildbot
Hi all,

Has anyone considered using regrtest's -j option in the buildbot configuration to speed up the test runs? Antoine Pitrou pointed out that even for single-CPU slaves, this could be a win because so many tests spend time sleeping or waiting on I/O. On slaves with multiple CPUs it would clearly be even better.

For example, I don't know what hardware is actually in the Solaris slave (bot loewis-sun), but if it has the full 4 UltraSPARCs that it could, then running with -j4 or -j5 there might bring its runtime down from nearly 100 minutes to 20 or 25, which is competitive with some of the more reasonable slaves.

Jean-Paul
Hi,
Has anyone considered using regrtest's -j option in the buildbot configuration to speed up the test runs?
Perhaps some buildbots are doing other useful tasks in addition to simply building Python, so this should probably be a case-by-case setting. I don't know how easy it is to add specific options to a buildslave.

One small issue is that, currently, when a buildbot hangs, you know which test it hung in because it's the last one displayed. This isn't true with -j (but perhaps we can improve the runner to fix this).

Regards,
Antoine.
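One way to picture the runner improvement mentioned above is to keep a record of which tests are currently in flight, so that a hang can be attributed to something even when several tests run at once. The following is only a minimal sketch of that idea, not how regrtest works: `InFlightTracker` and its methods are hypothetical names, and it uses threads for brevity, whereas regrtest's -j mode runs tests in separate processes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InFlightTracker:
    """Hypothetical helper: remembers which named tests are still running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    def run(self, name, func):
        """Run func() while recording name as in flight."""
        with self._lock:
            self._running.add(name)
        try:
            return func()
        finally:
            # Always clear the entry, even if the test raised.
            with self._lock:
                self._running.discard(name)

    def snapshot(self):
        """Return the sorted names of tests that have started but not finished."""
        with self._lock:
            return sorted(self._running)


if __name__ == "__main__":
    tracker = InFlightTracker()
    started = threading.Event()
    release = threading.Event()

    def slow_test():
        # Stand-in for a test that blocks; it waits until released.
        started.set()
        release.wait(5)

    with ThreadPoolExecutor(max_workers=2) as pool:
        future = pool.submit(tracker.run, "test_slow", slow_test)
        started.wait(5)
        print("still running:", tracker.snapshot())
        release.set()
        future.result(timeout=5)
    print("still running:", tracker.snapshot())
```

A watchdog that prints `snapshot()` when no test has finished for some time would give roughly the same information as "the last test displayed" does in a serial run.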
Martin v. Löwis wrote:
Has anyone considered using regrtest's -j option in the buildbot configuration to speed up the test runs?
Yes, I did. I turned it off again when the tests started failing because of it.
Yeah, a lot of our tests weren't written with parallel execution in mind (e.g. the existence of test_support.TESTFN, using specific ports for test servers).

While they've been getting better (e.g. increased use of randomly generated temporary directories over specific filenames, letting the OS assign a free port to servers), I believe there is still a fair bit of work to be done to make them all "parallel execution friendly" (of course, some tests, such as those that try to trigger MemoryError, are inherently parallel execution unfriendly).

So yes, we've thought about it, but there's still work to be done before that option can be used without having to worry about false alarms.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
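The two techniques named above (randomly generated temporary directories, and letting the OS assign a free port) can be sketched roughly as follows. This is an illustration under assumptions rather than code from the test suite: `bind_free_port` and `make_scratch_dir` are hypothetical helper names, and only the standard `socket` and `tempfile` modules are used.

```python
import shutil
import socket
import tempfile


def bind_free_port(host="127.0.0.1"):
    """Bind a listening socket to an OS-chosen port; return (sock, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS for any currently free port, so two tests
    # running at the same time never fight over a hard-coded number.
    sock.bind((host, 0))
    sock.listen(1)
    return sock, sock.getsockname()[1]


def make_scratch_dir(prefix="partest_"):
    """Create a uniquely named temporary directory for one test."""
    # mkdtemp picks a fresh name on each call, unlike a fixed shared filename.
    return tempfile.mkdtemp(prefix=prefix)


if __name__ == "__main__":
    sock_a, port_a = bind_free_port()
    sock_b, port_b = bind_free_port()
    dir_a = make_scratch_dir()
    dir_b = make_scratch_dir()
    try:
        print("distinct ports:", port_a != port_b)
        print("distinct dirs:", dir_a != dir_b)
    finally:
        sock_a.close()
        sock_b.close()
        shutil.rmtree(dir_a)
        shutil.rmtree(dir_b)
```

Because both sockets stay open at the same time, the OS has to hand out two different ports, which is the property a parallel test run relies on.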
On 09/05/2010 03:32, Nick Coghlan wrote:
Martin v. Löwis wrote:
Has anyone considered using regrtest's -j option in the buildbot configuration to speed up the test runs?
Yes, I did. I turned it off again when the tests started failing because of it.
Yeah, a lot of our tests weren't written with parallel execution in mind (e.g. the existence of test_support.TESTFN, using specific ports for test servers).
While they've been getting better (e.g. increased use of randomly generated temporary directories over specific filenames, letting the OS assign a free port to servers), I believe there is still a fair bit of work to be done to make them all "parallel execution friendly" (of course, some tests, such as those that try to trigger MemoryError, are inherently parallel execution unfriendly).
So yes, we've thought about it, but there's still work to be done before that option can be used without having to worry about false alarms.
FWIW I *usually* run the test suite with parallelization (it is just so much quicker) and these days *rarely* see spurious failures as a result. This is on Mac OS X, YMMV. Michael Foord
On Sunday 09 May 2010 11:08:29, Michael Foord wrote:
FWIW I *usually* run the test suite with parallelization (it is just so much quicker) and these days *rarely* see spurious failures as a result. This is on Mac OS X, YMMV.
I use regrtest.py -j 4 on an Intel quad-core machine on Linux: 4 tests really are running at the same time, and I get only one spurious failure: test_ioctl. It should be easy to fix this test. Alternatively, we could write a blacklist of tests that are not compatible with multiprocessing mode (or the opposite, a whitelist of compatible tests).

--
Victor Stinner http://www.haypocalc.com/
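The blacklist idea could look something like the sketch below: split the test names into a group that is safe to run in parallel and a group that must run serially. The names `PARALLEL_UNSAFE` and `split_for_parallel_run` are hypothetical and not part of regrtest; `test_ioctl` is the only entry taken from the message above.

```python
# Hypothetical blacklist of tests that misbehave when run in parallel.
# Only test_ioctl comes from the thread; a real list would need auditing.
PARALLEL_UNSAFE = frozenset({"test_ioctl"})


def split_for_parallel_run(test_names, unsafe=PARALLEL_UNSAFE):
    """Return (parallel, serial) lists, preserving the input order."""
    parallel = [name for name in test_names if name not in unsafe]
    serial = [name for name in test_names if name in unsafe]
    return parallel, serial


if __name__ == "__main__":
    tests = ["test_os", "test_ioctl", "test_sys"]
    parallel, serial = split_for_parallel_run(tests)
    print("run with -j:", parallel)
    print("run serially:", serial)
```

A whitelist would simply invert the membership test; a blacklist has the advantage that newly added tests are run in parallel by default.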
FWIW I *usually* run the test suite with parallelization (it is just so much quicker) and these days *rarely* see spurious failures as a result. This is on Mac OS X, YMMV.
I may misremember the details, but IIRC, the multiprocessing tests would fail to terminate on Solaris. This made it unsuitable for buildbot usage. Regards, Martin
participants (6)
- "Martin v. Löwis"
- Antoine Pitrou
- exarkun@twistedmatrix.com
- Michael Foord
- Nick Coghlan
- Victor Stinner