I'm about to build numpy using Intel's MKL 9.1 beta and want to compare it with the version I built using MKL 8.1. Is the LINPACK benchmark the most appropriate?

Thanks,

-rex
--
Pollytheism: n., the belief that there are many gods, all of them parrots.
As soon as you do, I'd like to compare them with the benchmarks I posted
here a few days ago (compiled with gcc):
http://lbolla.wordpress.com/2007/04/11/numerical-computing-matlab-vs-pythonn...
lorenzo.
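[Editor's note: a quick way to compare two MKL builds before reaching for a full LINPACK run is to time a dense matrix product, which goes straight to the linked BLAS. A minimal sketch, assuming the usual 2*n**3 flop count for an n-by-n matmul; run the same script under each build and compare the figures.]

```python
import time
import numpy as np

# Time one large dense matrix product; np.dot dispatches to the linked
# BLAS (MKL in the builds discussed here), so this isolates BLAS speed.
n = 1000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.time()
c = np.dot(a, b)
dt = time.time() - t0

# A dense n x n matrix product costs about 2*n**3 floating-point operations.
gflops = 2.0 * n**3 / dt / 1e9
print("%.3f s, %.2f GFLOP/s" % (dt, gflops))
```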
http://lbolla.wordpress.com/2007/04/11/numerical-computing-matlab-vs-pythonn...

Thanks for the link.

I haven't built numpy with MKL 9.1 yet, but here are some results running laplace.py using MKL 8.1. The CPU is a Core 2 Duo (currently) overclocked to 2.94 GHz (it will run at 3.52 GHz).

Using Python2.5 compiled with icc 9.1, numpy built with MKL 8.1:

Doing 100 iterations on a 500x500 grid
numeric took 1.53 seconds
slow (100 iterations) took 130.02 seconds
slow with Psyco (100 iterations) took 107.91 seconds

Python compiled with icc takes 85 times longer to run this benchmark than Python/NumPy does.

Using Python2.5 compiled with gcc, numpy built with MKL 8.1:

Doing 100 iterations on a 500x500 grid
numeric took 1.57 seconds
slow (100 iterations) took 154.29 seconds
slow with Psyco (100 iterations) took 119.88 seconds

Python compiled with gcc takes 101 times longer to run this benchmark than Python/NumPy/icc does.

The C++ version compiled with gcc 4.1.2 runs in 0.19 seconds.

-rex
--
I liked Occam's razor so much I bought the company.
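[Editor's note: for readers without laplace.py at hand, the kernel being timed is roughly the Jacobi-style update below, in its vectorized ("numeric") and explicitly looped ("slow") forms. This is a sketch after the scipy PerformancePython page, not the exact script from the thread.]

```python
import numpy as np

def numeric_step(u, dx2, dy2):
    """Vectorized interior update via array slicing -- the 'numeric' variant."""
    v = u.copy()
    v[1:-1, 1:-1] = ((u[2:, 1:-1] + u[:-2, 1:-1]) * dy2 +
                     (u[1:-1, 2:] + u[1:-1, :-2]) * dx2) / (2.0 * (dx2 + dy2))
    return v

def slow_step(u, dx2, dy2):
    """The same update with explicit Python loops -- the 'slow' variant."""
    v = u.copy()
    nx, ny = u.shape
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            v[i, j] = ((u[i+1, j] + u[i-1, j]) * dy2 +
                       (u[i, j+1] + u[i, j-1]) * dx2) / (2.0 * (dx2 + dy2))
    return v
```

The ~100x ratios quoted above come entirely from the per-element Python interpreter overhead in the looped variant; both functions compute the same result.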
Using MKL 9.1_beta made no difference in the prior benchmark, but it does improve speed in an earlier benchmark I posted. From: http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025673.htm...

================================================================================
'''
A program that uses Monte Carlo to estimate how often the number of rare
events with a Poisson distribution will differ by a given amount.
'''
import numpy as n
from numpy.random import poisson
from time import time

lam = 4.0           # mu & var for Poisson distributed rands (they are equal in Poisson)
N = 10              # number of times to run the program
maxNumEvents = 20   # events larger than this are ignored
numPois = 100000    # number of pairs of outcomes to generate
freqA = 2           # number of times event A occurred
freqB = 6           # number of times event B occurred

print "#rands fraction [freqA,freqB] fraction [lam,lam] largest% total[mean,mean]"
t0 = time()
for g in range(1):
    for h in range(N):
        suma = n.zeros((maxNumEvents+1, maxNumEvents+1), int)  # possible outcomes array
        count = poisson(lam, size=(numPois, 2))  # generate array of pairs of Poissons
        for i in range(numPois):
            #if count[i,0] > maxNumEvents: continue
            #if count[i,1] > maxNumEvents: continue
            suma[count[i,0], count[i,1]] += 1
        d = n.sum(suma)
        print d, float(suma[freqA,freqB])/d, float(suma[lam,lam])/d, suma.max(), suma[lam,lam]
print 'time', time()-t0

Using the SUSE NumPy rpm: python relative_risk.py
#rands fraction [2,6] fraction [lam,lam] largest% total[mean,mean]
100000 0.01539 0.03869 3869 3869
100000 0.01534 0.03766 3907 3766
100000 0.01553 0.03841 3859 3841
100000 0.01496 0.03943 3943 3943
100000 0.01513 0.03829 3856 3829
100000 0.01485 0.03825 3993 3825
100000 0.01545 0.03716 3859 3716
100000 0.01526 0.03909 3919 3909
100000 0.01491 0.03826 3913 3826
100000 0.01478 0.03771 3782 3771
time 2.38847184181

Using the MKL [8.1] NumPy: python relative_risk.py
#rands fraction [2,6] fraction [lam,lam] largest% total[mean,mean]
100000 0.01502 0.03764 3895 3764
100000 0.01513 0.03841 3841 3841
100000 0.01511 0.03753 3810 3753
100000 0.01577 0.03766 3873 3766
100000 0.01541 0.0373 3963 3730
100000 0.01586 0.03862 3912 3862
100000 0.01552 0.03785 3870 3785
100000 0.01502 0.03854 3896 3854
100000 0.015 0.03803 3880 3803
100000 0.01515 0.03749 3855 3749
time 2.0455300808

So the rpm version only takes ~17% longer to run this program. I'm surprised that there isn't a larger difference. Perhaps there will be in a different type of program. BTW, the cpu is an Intel e6600 Core 2 Duo overclocked to 3.06 GHz (it will run reliably at 3.24 GHz).
============================================================================

With NumPy built using MKL 9.1_beta the program runs in 1.66 seconds. Correcting for the slightly lower CPU speed used (2.93 GHz), this corresponds to 1.59 seconds. The 8.1 version takes 29% longer (this may be partially/all due to different compiler flags being used), and the 8.1 version used with Python compiled with gcc instead of icc takes 50% longer to run.

The icc flags used were: -fast (enables -xP -O3 -ipo -no-prec-div -static) -funroll-loops -fno-alias -parallel

I don't know if there are obviously better choices for the Core 2 Duo. I'd like to run a more comprehensive benchmark, say Scimark translated from C to Python/NumPy. http://math.nist.gov/scimark2/download_c.html

-rex
--
Those who forget the pasta are condemned to reheat it.
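[Editor's note: the per-pair Python loop in relative_risk.py is the bottleneck. In modern NumPy the same tally can be done in one vectorized call; a sketch, with names following the original and the bounds check the original commented out restored via a mask. np.add.at is used because plain fancy-index += would drop repeated index pairs.]

```python
import numpy as np

# Vectorized version of the tally loop from relative_risk.py above.
lam, max_num_events, num_pois = 4.0, 20, 100000
freq_a, freq_b = 2, 6

rng = np.random.default_rng(0)
count = rng.poisson(lam, size=(num_pois, 2))  # pairs of Poisson outcomes

suma = np.zeros((max_num_events + 1, max_num_events + 1), dtype=int)
# Keep only pairs that fit in the table (the original commented this check out).
ok = (count[:, 0] <= max_num_events) & (count[:, 1] <= max_num_events)
# np.add.at accumulates correctly even when an index pair occurs many times.
np.add.at(suma, (count[ok, 0], count[ok, 1]), 1)

d = suma.sum()
print(d, suma[freq_a, freq_b] / d, suma[4, 4] / d, suma.max())
```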
The amazing performance of the C++ code does not surprise me: a tenfold
improvement over the simple Python/NumPy code can be achieved with
weave.inline or Pyrex.
Hence your benchmarks seem to confirm that "weaved" or "pyrexed" code runs
as fast as compiled C++.
Moreover, from your numbers, I can tell that compiling numpy with gcc or icc
makes no big difference.
Am I correct?
If yes, let me know if I can add this info to the scipy wiki: I'm preparing
an extension to this page http://www.scipy.org/PerformancePython.
cheers,
lorenzo
rex
I'm about to build numpy using Intel's MKL 9.1 beta and want to compare it with the version I built using MKL 8.1. Is the LINPACK benchmark the most appropriate?
I'm buried in responses. Not. A well-known benchmark (Scimark?) coded using NumPy/SciPy might help people realize that they don't have to use a compiled language for their problem. Alas, I can't find much in the way of benchmarks coded using NumPy/SciPy. All I've found is LINPACK, coded using Numarray.

import numarray, time
import numarray.random_array as naRA
import numarray.linear_algebra as naLA

n = 1000
a = naRA.random([n, n])
b = naRA.random([n, 1])
t = -time.time()
x = naLA.solve_linear_equations(a, b)
t += time.time()
r = numarray.dot(a, x) - b
r_n = numarray.maximum.reduce(abs(r))
print t, 2.0e-9 / 3.0 * n**3 / t
print r_n, r_n / (n * 1e-16)

Scimark is a broader test, but AFAIK it's only available in Java and C. FWIW, one of my PCs was the first to break a gigaflop using Scimark. Its score is 1043, which is 44% higher than the 2nd place score. http://math.nist.gov/cgi-bin/ScimarkSummary

-rex
--
Neutrinos have bad breadth.
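[Editor's note: the numarray snippet translates almost line-for-line to modern NumPy; a sketch, assuming numpy.linalg.solve stands in for solve_linear_equations. The second print reports time and approximate GFLOP/s using the n**3/3 flop count for LU factorization that the original uses.]

```python
import time
import numpy as np

# NumPy translation of the numarray LINPACK snippet above.
n = 1000
a = np.random.rand(n, n)
b = np.random.rand(n, 1)

t = -time.time()
x = np.linalg.solve(a, b)          # plays the role of solve_linear_equations
t += time.time()

r = np.dot(a, x) - b               # residual of the solve
r_n = np.maximum.reduce(np.abs(r)) # max-norm of the residual, shape (1,)
print(t, 2.0e-9 / 3.0 * n**3 / t)  # seconds and approximate GFLOP/s
print(r_n, r_n / (n * 1e-16))      # residual and normalized residual
```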
participants (2): lorenzo bolla, rex