[SciPy-User] Speeding things up - how to use more than one computer core
Troels Emtekær Linnet
tlinnet at gmail.com
Sun Apr 7 08:11:07 EDT 2013
Thanks for pointing that out.
I had not understood the tuple way of calling the function.
But now I get these results:
Why is joblib so slow?
And should I go for threading or processes?
Method was normal
Done :0:00:00.040000
[9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0,
9999.0] <type 'numpy.float64'>
Method was multi Pool
Done :0:00:00.422000
[9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0,
9999.0] <type 'numpy.float64'>
Method was joblib delayed
Done :0:00:02.569000
[9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0,
9999.0] <type 'numpy.float64'>
Method was handythread
Done :0:00:00.582000
[9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0,
9999.0] <type 'numpy.float64'>
import numpy as np
import multiprocessing
from multiprocessing import Pool
from datetime import datetime
from joblib import Parallel, delayed
from handythread import foreach

def getsqrt(n):
    res = np.sqrt(n**2)
    return res

def main():
    # 'sprint' was not defined in the posted script; assumed to be a print switch.
    sprint = True
    # Leave one core free, but never ask for fewer than one worker.
    jobs = max(1, multiprocessing.cpu_count()-1)
    a = range(10000)
    for method in ['normal','multi Pool','joblib delayed','handythread']:
        startTime = datetime.now()
        if method=='normal':
            res = []
            for i in a:
                b = getsqrt(i)
                res.append(b)
        elif method=='multi Pool':
            pool = Pool(processes=jobs)
            res = pool.map(getsqrt, a)
        elif method=='joblib delayed':
            res = Parallel(n_jobs=jobs)(delayed(getsqrt)(i) for i in a)
        elif method=='handythread':
            res = foreach(getsqrt,a,threads=jobs,return_=True)
        if sprint:
            print("Method was %s" % method)
            print("Done :%s" % (datetime.now()-startTime))
            print("%s %s" % (res[-10:], type(res[-1])))
    return res

if __name__ == "__main__":
    res = main()
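
One possible reading of the timings above is that, with a call as cheap as a single sqrt, most of the time goes into handing items to workers rather than into the computation. The following is only a sketch of that idea, not part of the posted script: it assumes Python 3 and the standard-library concurrent.futures module, swaps np.sqrt for math.sqrt to stay dependency-free, and the helper names sqrt_chunk, run_per_item and run_chunked are made up for illustration. It compares one task per item against one task per slice of the input.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def getsqrt(n):
    # Same toy workload as the script above, with math.sqrt instead of np.sqrt.
    return math.sqrt(n ** 2)


def sqrt_chunk(chunk):
    # One task handles a whole slice, so dispatch cost is paid once per chunk.
    return [getsqrt(n) for n in chunk]


def run_per_item(values, workers):
    # Hypothetical helper: submit every item as its own task.
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(getsqrt, values))


def run_chunked(values, workers, chunk_size=1000):
    # Hypothetical helper: submit slices of chunk_size items instead.
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        # Flatten the per-chunk lists back into one list, in input order.
        return [x for part in ex.map(sqrt_chunk, chunks) for x in part]


if __name__ == "__main__":
    a = list(range(10000))
    for name, func in [("per item", run_per_item), ("chunked", run_chunked)]:
        start = time.perf_counter()
        res = func(a, 3)
        print("Method was %s" % name)
        print("Done :%.6f s" % (time.perf_counter() - start))
        print(res[-10:])
```

Threads are used here only because they make the dispatch cost easy to see; for pure-Python work like this they do not run in parallel because of the interpreter lock, so neither variant should be expected to beat the plain loop. The same grouping idea applies to a process pool, where each item also has to be pickled and sent to another process.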
x at normalesup.org>
On Sun, Apr 07, 2013 at 12:17:59AM +0200, Troels Emtekær Linnet wrote:
> Method was joblib delayed
> Done :0:00:00
Hum, this is fishy, isn't it?
> elif method=='joblib delayed':
> Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' for all cores=-1
> func,res = delayed(getsqrt), a
I have a hard time reading your code, but it seems to me that you haven't
computed anything here, just instantiated the Parallel object.
You need to do:
res = Parallel(n_jobs=-2)(delayed(getsqrt)(i) for i in a)
I would expect joblib to be on the same order of magnitude speed-wise as
multiprocessing (hell, it's just a wrapper on multiprocessing). It's just
going to be more robust code than manually instantiating a Pool (it deals
better with errors, and optionally dispatches computation on demand).