Is numpy.test() supposed to be multithreaded?
Heya, I'm not a numbers guy, but I maintain servers for the scientists and researchers who are. Someone pointed out that our numpy installation on a particular server was only using one core. I don't know who installed the previous versions of numpy/OpenBLAS or how, so I installed them from scratch and confirmed that the user's test code now runs on multiple cores as expected, drastically improving performance.

Now the user is writing back to say, "my test code is fast now, but numpy.test() is still about three times slower than <some other server we don't manage>". When I watch htop as numpy.test() executes, sure enough, it's using one core, and I'm not sure whether that's the expected behavior.

Questions:

* If numpy.test() is supposed to be using multiple cores, why isn't it, when we've established with other test code that it now uses multiple cores?
* If numpy.test() is not supposed to be using multiple cores, what could make it drastically slower here than on another server with a comparable CPU, when the user's test code performs comparably on both?

For what it's worth, the user's "test" code, which does run on multiple cores, is as simple as:

```python
import numpy as np

size = 4000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))
x = np.dot(a, b)
```

Whereas this uses only one core:

```python
numpy.test()
```

---------------------------

OpenBLAS 0.2.18 was basically just compiled with "make", nothing special to it. Numpy 1.11.0 was installed from source (python setup.py install), using a site.cfg file to point numpy at the new OpenBLAS.

Thanks, Mike
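For reference, a minimal sketch for sanity-checking a build like this (assuming the freshly built numpy is the one on the import path; the 4000x4000 size just mirrors the snippet above):

```python
import time
import numpy as np

# Report the BLAS/LAPACK libraries this numpy build was linked against;
# the OpenBLAS paths from site.cfg should show up here if the build picked them up.
np.show_config()

# Time a BLAS-bound matrix multiply. With a working multithreaded OpenBLAS this
# should load several cores (visible in htop) and finish far faster than with
# a single-threaded reference BLAS.
size = 4000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))

start = time.time()
np.dot(a, b)
print("np.dot on %dx%d matrices: %.2f s" % (size, size, time.time() - start))
```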
On Tue, Jun 28, 2016 at 10:36 PM, Michael Ward <mward@cims.nyu.edu> wrote:
Some numpy.linalg functions (like np.dot) will be using multiple cores, but np.linalg.test() takes only ~1% of the time of the full test suite. Everything else will be running single core. So your observations are not surprising. Cheers, Ralf
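As a rough illustration of that split, the two runners can be timed separately (just a sketch; each runner prints its own test summary as a side effect, and absolute times vary by machine):

```python
import time
import numpy as np

# Only the linear-algebra tests -- the small slice that exercises the BLAS.
start = time.time()
np.linalg.test()
print("np.linalg.test(): %.1f s" % (time.time() - start))

# The full suite, dominated by single-threaded Python/C-level work.
start = time.time()
np.test()
print("np.test(): %.1f s" % (time.time() - start))
```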
> * if numpy.test() is supposed to be using multiple cores, why isn't it, when we've established with other test code that it's now using multiple cores?

> Some numpy.linalg functions (like np.dot) will be using multiple cores, but np.linalg.test() takes only ~1% of the time of the full test suite. Everything else will be running single core. So your observations are not surprising.

Though why it would run slower on one box than another comparable box is a mystery...

-CHB
As a general rule I wouldn't worry too much about test speed. Speed is extremely dependent on exact workloads, and this is doubly so for test suites -- production workloads tend to do a small number of normal things over and over, while a good test suite never does the same thing twice and spends most of its time exercising weird edge conditions. So unless your actual workload is running the numpy test suite :-), it's probably not worth trying to track down.

And yeah, numpy does not in general do automatic multithreading -- the only automatic multithreading you should see is when using linear algebra functions (matrix multiply, eigenvalue calculations, etc.) that dispatch to the BLAS.

-n

On Wed, Jun 29, 2016 at 12:07 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
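To make that distinction visible, a rough sketch along these lines can help (OPENBLAS_NUM_THREADS is read by OpenBLAS when it initializes, so it has to be set before numpy is imported; the cap of 1 is only for the demonstration):

```python
import os

# Cap OpenBLAS at a single thread. This must happen before importing numpy,
# since OpenBLAS sizes its thread pool when the library is loaded.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np

size = 4000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))

# With the cap, this BLAS-dispatched call runs on one core; without it,
# the same call fans out across the available cores.
x = np.dot(a, b)

# Elementwise operations like these are not dispatched to the BLAS and stay
# on a single core regardless of the thread setting.
y = np.sin(a) + b
```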
-- Nathaniel J. Smith -- https://vorpus.org
On Wed, 2016-06-29 at 02:03 -0700, Nathaniel Smith wrote:
Agreed, the test suite, and likely also the few tests which take the most time, could be arbitrarily weird and skewed. I could for example imagine IO speed being a big factor. Also, depending on system configuration (or numpy version), a different number of tests may sometimes be run.

What might make somewhat more sense would be to compare some of the benchmarks with `python runtests.py --bench`, if you have airspeed velocity installed. While not extensive, a lot of those at least test more typical use cases. Though in any case, I think the user should probably just test something else.

- Sebastian
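If the user wants a quick apples-to-apples comparison of the two servers without setting up airspeed velocity, a small hand-rolled benchmark of a representative workload (the operations below are only illustrative, not the user's actual workload) already says more than the test-suite wall time:

```python
import timeit
import numpy as np

size = 2000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))

# Time a BLAS-bound operation and a non-BLAS operation separately, so any
# difference between the servers can be attributed to the right component.
blas_time = timeit.timeit(lambda: np.dot(a, b), number=10)
elementwise_time = timeit.timeit(lambda: np.sin(a) + b, number=10)

print("np.dot (BLAS, multithreaded):  %.2f s for 10 runs" % blas_time)
print("elementwise (single-threaded): %.2f s for 10 runs" % elementwise_time)
```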
participants (5)
- Chris Barker - NOAA Federal
- Michael Ward
- Nathaniel Smith
- Ralf Gommers
- Sebastian Berg