[Numpy-discussion] numexpr efficiency depends on the size of the computing kernel

Francesc Altet faltet at carabos.com
Wed Mar 14 17:05:51 EDT 2007


Hi,

Now that I have my old AMD Duron machine at hand, I've run some
benchmarks to try to prove that numexpr performance is not influenced by
the size of the CPU cache, but I failed miserably (and Tim was right:
numexpr efficiency does depend on the CPU cache size).

Given that the computing kernel of the pytables version of numexpr
is considerably larger than the original (it supports more datatypes),
comparing the performance of both versions is a good way to check
the influence of the CPU cache on computing efficiency.

The attached benchmark is a small modification of the timing.py that
comes with the numexpr package (the modification was needed to allow the
numexpr version of pytables to run all the cases). Basically, the
expressions tested operate on arrays of 1 million elements, with
a mix of contiguous and strided arrays (no unaligned arrays are present
here). See the attached benchmark code for the details.
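
For readers of the attached output: each "Speed-up" figure is the numpy
time divided by the numexpr time, and each run's "Average" is the plain
arithmetic mean of the seven per-expression speed-ups. The sketch below
reproduces that arithmetic from the timings of the first run in the first
attached output. It is only an illustration; the function and variable
names are made up here and are not taken from timing.py.

```python
# Timings in seconds, copied from the first run of the first attached output.
# Each tuple is (expression, numpy_time, numexpr_time).
timings = [
    ("b*c+d*e", 0.284756803513, 0.267185997963),
    ("2*a+3*b", 0.228031897545, 0.190967607498),
    ("2*a + (cos(3)+5)*sinh(cos(b))", 0.875679397583, 0.729962491989),
    ("2*a + arctan2(a, b)", 0.530754685402, 0.440991616249),
    ("a**2 + (b+1)**-2.5", 0.830808615685, 0.408902907372),
    ("(a+1)**50", 0.486846494675, 0.394672584534),
    ("sqrt(a**2 + b**2)", 0.387914180756, 0.292760682106),
]


def speedups(rows):
    """Speed-up of numexpr over numpy for each expression (hypothetical helper)."""
    return [t_numpy / t_numexpr for _, t_numpy, t_numexpr in rows]


def average(values):
    """Arithmetic mean, as used for the per-run 'Average' line."""
    return sum(values) / len(values)


run_speedups = speedups(timings)
run_average = average(run_speedups)

for (expr, _, _), s in zip(timings, run_speedups):
    print("%-32s %.2f" % (expr, s))
print("Average = %.2f" % run_average)
```

Note that a plain mean is dominated by the expressions with the largest
speed-ups (the power expressions here), so the averages are best read as
a rough summary rather than a per-expression guarantee.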

The speed-ups of numexpr over plain numpy on an AMD Duron machine (64 +
64 KB L1 cache, 64 KB L2 cache) are:

For the original numexpr package:

2.14, 2.21, 2.21  (these represent averages for 3 complete runs)

For the modified pytables version (enlarged computing kernel):

1.32, 1.34, 1.37

So, on a CPU with a very small cache, the original numexpr kernel is
about 1.6x faster than the pytables one.

However, on an AMD Opteron, which has a much bigger L2 cache (64 + 64
KB L1 cache, 1 MB L2 cache), the speed-ups are quite similar:

For the original numexpr package:

3.10, 3.35, 3.35

For the modified pytables version (enlarged computing kernel):

3.37, 3.50, 3.45

So, there is indeed a dependency on the CPU cache size. It would be
nice to run the benchmark on other CPUs with an L2 cache in the range
between 64 KB and 1 MB so as to find the point where the performance of
both versions starts to be similar (this should give a good estimate of
the size of the computing kernel).
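
The comparison between the two kernels can be checked with a few lines of
Python. The sketch below takes the per-run averages quoted above for each
machine and divides the mean speed-up of the original kernel by that of
the pytables kernel. Since both speed-ups are relative to numpy on the
same machine, this ratio is only an approximation of how much faster one
kernel is than the other; the names used are illustrative.

```python
# Per-run average speed-ups over numpy, as quoted in the message.
duron = {"original": [2.14, 2.21, 2.21], "pytables": [1.32, 1.34, 1.37]}
opteron = {"original": [3.10, 3.35, 3.35], "pytables": [3.37, 3.50, 3.45]}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def kernel_ratio(machine):
    """Mean speed-up of the original kernel divided by the pytables one.

    A value above 1 means the original (smaller) kernel is faster.
    """
    return mean(machine["original"]) / mean(machine["pytables"])


duron_ratio = kernel_ratio(duron)
opteron_ratio = kernel_ratio(opteron)

print("Duron   (64 KB L2): %.2f" % duron_ratio)
print("Opteron (1 MB L2):  %.2f" % opteron_ratio)
```

On the Duron the ratio comes out at roughly 1.6, while on the Opteron it
is slightly below 1, i.e. the two kernels perform about the same there.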

Meanwhile, the lesson learned is that Tim's worries were justified: one
should be very careful about adding more opcodes (at least while CPUs
with a very small L2 cache are still in use).  With this, perhaps we
will have to reduce the opcodes in the numexpr version for pytables to a
bare minimum :-/

Cheers,

-- 
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth
-------------- next part --------------
Expression: b*c+d*e
numpy: 0.284756803513
Skipping weave timing
numexpr: 0.267185997963
Speed-up of numexpr over numpy: 1.06576244894

Expression: 2*a+3*b
numpy: 0.228031897545
Skipping weave timing
numexpr: 0.190967607498
Speed-up of numexpr over numpy: 1.19408679059

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.875679397583
Skipping weave timing
numexpr: 0.729962491989
Speed-up of numexpr over numpy: 1.19962245621

Expression: 2*a + arctan2(a, b)
numpy: 0.530754685402
Skipping weave timing
numexpr: 0.440991616249
Speed-up of numexpr over numpy: 1.20354824411

Expression: a**2 + (b+1)**-2.5
numpy: 0.830808615685
Skipping weave timing
numexpr: 0.408902907372
Speed-up of numexpr over numpy: 2.03179923817

Expression: (a+1)**50
numpy: 0.486846494675
Skipping weave timing
numexpr: 0.394672584534
Speed-up of numexpr over numpy: 1.23354525689

Expression: sqrt(a**2 + b**2)
numpy: 0.387914180756
Skipping weave timing
numexpr: 0.292760682106
Speed-up of numexpr over numpy: 1.3250214406

Average = 1.32191226793
Expression: b*c+d*e
numpy: 0.279518294334
Skipping weave timing
numexpr: 0.225658392906
Speed-up of numexpr over numpy: 1.23867891965

Expression: 2*a+3*b
numpy: 0.227924203873
Skipping weave timing
numexpr: 0.190263104439
Speed-up of numexpr over numpy: 1.19794221031

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.865833806992
Skipping weave timing
numexpr: 0.736699199677
Speed-up of numexpr over numpy: 1.17528810588

Expression: 2*a + arctan2(a, b)
numpy: 0.536459088326
Skipping weave timing
numexpr: 0.465694189072
Speed-up of numexpr over numpy: 1.15195572742

Expression: a**2 + (b+1)**-2.5
numpy: 0.803207492828
Skipping weave timing
numexpr: 0.402952003479
Speed-up of numexpr over numpy: 1.99330810095

Expression: (a+1)**50
numpy: 0.506087398529
Skipping weave timing
numexpr: 0.390724515915
Speed-up of numexpr over numpy: 1.29525376043

Expression: sqrt(a**2 + b**2)
numpy: 0.390014004707
Skipping weave timing
numexpr: 0.292934322357
Speed-up of numexpr over numpy: 1.33140426007

Average = 1.34054729781
Expression: b*c+d*e
numpy: 0.282696795464
Skipping weave timing
numexpr: 0.227395987511
Speed-up of numexpr over numpy: 1.2431916612

Expression: 2*a+3*b
numpy: 0.247914505005
Skipping weave timing
numexpr: 0.206929206848
Speed-up of numexpr over numpy: 1.19806434665

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.87483150959
Skipping weave timing
numexpr: 0.722416090965
Speed-up of numexpr over numpy: 1.21098009932

Expression: 2*a + arctan2(a, b)
numpy: 0.546046590805
Skipping weave timing
numexpr: 0.440475416183
Speed-up of numexpr over numpy: 1.23967552046

Expression: a**2 + (b+1)**-2.5
numpy: 0.841809201241
Skipping weave timing
numexpr: 0.40777721405
Speed-up of numexpr over numpy: 2.06438509126

Expression: (a+1)**50
numpy: 0.484260010719
Skipping weave timing
numexpr: 0.37349460125
Speed-up of numexpr over numpy: 1.29656495462

Expression: sqrt(a**2 + b**2)
numpy: 0.428371477127
Skipping weave timing
numexpr: 0.316362810135
Speed-up of numexpr over numpy: 1.35405130883

Average = 1.37241614033
Averages: 1.32, 1.34, 1.37
-------------- next part --------------
Expression: b*c+d*e
numpy: 0.290255403519
Skipping weave timing
numexpr: 0.190418314934
Speed-up of numexpr over numpy: 1.52430402306

Expression: 2*a+3*b
numpy: 0.226468586922
Skipping weave timing
numexpr: 0.127545499802
Speed-up of numexpr over numpy: 1.77559057179

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.87546172142
Skipping weave timing
numexpr: 0.621131896973
Speed-up of numexpr over numpy: 1.4094618642

Expression: 2*a + arctan2(a, b)
numpy: 0.528830099106
Skipping weave timing
numexpr: 0.346895003319
Speed-up of numexpr over numpy: 1.52446732886

Expression: a**2 + (b+1)**-2.5
numpy: 0.792816495895
Skipping weave timing
numexpr: 0.218543100357
Speed-up of numexpr over numpy: 3.62773519091

Expression: (a+1)**50
numpy: 0.482146501541
Skipping weave timing
numexpr: 0.186633110046
Speed-up of numexpr over numpy: 2.58339209705

Expression: sqrt(a**2 + b**2)
numpy: 0.388063216209
Skipping weave timing
numexpr: 0.151627588272
Speed-up of numexpr over numpy: 2.55931800164

Average = 2.14346701107
Expression: b*c+d*e
numpy: 0.283156108856
Skipping weave timing
numexpr: 0.181364917755
Speed-up of numexpr over numpy: 1.56125072236

Expression: 2*a+3*b
numpy: 0.226498603821
Skipping weave timing
numexpr: 0.124421000481
Speed-up of numexpr over numpy: 1.8204210137

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.868006300926
Skipping weave timing
numexpr: 0.623650097847
Speed-up of numexpr over numpy: 1.39181618655

Expression: 2*a + arctan2(a, b)
numpy: 0.517928004265
Skipping weave timing
numexpr: 0.348434090614
Speed-up of numexpr over numpy: 1.48644469131

Expression: a**2 + (b+1)**-2.5
numpy: 0.799534797668
Skipping weave timing
numexpr: 0.216258502007
Speed-up of numexpr over numpy: 3.69712538582

Expression: (a+1)**50
numpy: 0.487076807022
Skipping weave timing
numexpr: 0.164514088631
Speed-up of numexpr over numpy: 2.96069966455

Expression: sqrt(a**2 + b**2)
numpy: 0.387224507332
Skipping weave timing
numexpr: 0.153417181969
Speed-up of numexpr over numpy: 2.52399700192

Average = 2.20596495232
Expression: b*c+d*e
numpy: 0.278421878815
Skipping weave timing
numexpr: 0.18240711689
Speed-up of numexpr over numpy: 1.52637618291

Expression: 2*a+3*b
numpy: 0.234265589714
Skipping weave timing
numexpr: 0.124828195572
Speed-up of numexpr over numpy: 1.87670412635

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.852713894844
Skipping weave timing
numexpr: 0.606571722031
Speed-up of numexpr over numpy: 1.40579236366

Expression: 2*a + arctan2(a, b)
numpy: 0.5161703825
Skipping weave timing
numexpr: 0.348170495033
Speed-up of numexpr over numpy: 1.48252189621

Expression: a**2 + (b+1)**-2.5
numpy: 0.794040799141
Skipping weave timing
numexpr: 0.215844082832
Speed-up of numexpr over numpy: 3.67877028975

Expression: (a+1)**50
numpy: 0.481977200508
Skipping weave timing
numexpr: 0.164862012863
Speed-up of numexpr over numpy: 2.92351883941

Expression: sqrt(a**2 + b**2)
numpy: 0.386767506599
Skipping weave timing
numexpr: 0.14988219738
Speed-up of numexpr over numpy: 2.58047662338

Average = 2.21059433167
Averages: 2.14, 2.21, 2.21
-------------- next part --------------
A non-text attachment was scrubbed...
Name: numexpr-timing.py
Type: text/x-python
Size: 3269 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070314/321ec5f8/attachment.py>
-------------- next part --------------
Expression: b*c+d*e
numpy: 0.0430510997772
Skipping weave timing
numexpr: 0.0235065937042
Speed-up of numexpr over numpy: 1.83144781923

Expression: 2*a+3*b
numpy: 0.0429566144943
Skipping weave timing
numexpr: 0.0219662904739
Speed-up of numexpr over numpy: 1.95556981026

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.286458492279
Skipping weave timing
numexpr: 0.250001215935
Speed-up of numexpr over numpy: 1.14582839611

Expression: 2*a + arctan2(a, b)
numpy: 0.139817690849
Skipping weave timing
numexpr: 0.121367192268
Speed-up of numexpr over numpy: 1.15202212588

Expression: a**2 + (b+1)**-2.5
numpy: 0.369387292862
Skipping weave timing
numexpr: 0.0481228113174
Speed-up of numexpr over numpy: 7.67592920591

Expression: (a+1)**50
numpy: 0.283995580673
Skipping weave timing
numexpr: 0.0360183000565
Speed-up of numexpr over numpy: 7.88475803211

Expression: sqrt(a**2 + b**2)
numpy: 0.0699777126312
Skipping weave timing
numexpr: 0.03638920784
Speed-up of numexpr over numpy: 1.9230347893

Average = 3.36694145411
Expression: b*c+d*e
numpy: 0.0497439146042
Skipping weave timing
numexpr: 0.0267603874207
Speed-up of numexpr over numpy: 1.85886376838

Expression: 2*a+3*b
numpy: 0.0438626050949
Skipping weave timing
numexpr: 0.0191017150879
Speed-up of numexpr over numpy: 2.29626527739

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.277396702766
Skipping weave timing
numexpr: 0.269183421135
Speed-up of numexpr over numpy: 1.03051184058

Expression: 2*a + arctan2(a, b)
numpy: 0.159837794304
Skipping weave timing
numexpr: 0.137581419945
Speed-up of numexpr over numpy: 1.16176875023

Expression: a**2 + (b+1)**-2.5
numpy: 0.375256705284
Skipping weave timing
numexpr: 0.0533778905869
Speed-up of numexpr over numpy: 7.03018986248

Expression: (a+1)**50
numpy: 0.317774915695
Skipping weave timing
numexpr: 0.0351259946823
Speed-up of numexpr over numpy: 9.04671650068

Expression: sqrt(a**2 + b**2)
numpy: 0.0805351018906
Skipping weave timing
numexpr: 0.039293885231
Speed-up of numexpr over numpy: 2.04955812888

Average = 3.49626773266
Expression: b*c+d*e
numpy: 0.0495269060135
Skipping weave timing
numexpr: 0.0265894889832
Speed-up of numexpr over numpy: 1.86264978785

Expression: 2*a+3*b
numpy: 0.0449105024338
Skipping weave timing
numexpr: 0.0221442937851
Speed-up of numexpr over numpy: 2.02808465556

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.312991595268
Skipping weave timing
numexpr: 0.283522415161
Speed-up of numexpr over numpy: 1.10393950718

Expression: 2*a + arctan2(a, b)
numpy: 0.159363889694
Skipping weave timing
numexpr: 0.13733689785
Speed-up of numexpr over numpy: 1.16038655444

Expression: a**2 + (b+1)**-2.5
numpy: 0.368414521217
Skipping weave timing
numexpr: 0.0534101009369
Speed-up of numexpr over numpy: 6.89784356807

Expression: (a+1)**50
numpy: 0.312214398384
Skipping weave timing
numexpr: 0.0343459129333
Speed-up of numexpr over numpy: 9.09029260599

Expression: sqrt(a**2 + b**2)
numpy: 0.077935218811
Skipping weave timing
numexpr: 0.0383999109268
Speed-up of numexpr over numpy: 2.02956769768

Average = 3.45325205383
Averages: 3.37, 3.50, 3.45
-------------- next part --------------
Expression: b*c+d*e
numpy: 0.0426661014557
Skipping weave timing
numexpr: 0.0238104820251
Speed-up of numexpr over numpy: 1.79190414586

Expression: 2*a+3*b
numpy: 0.0391938924789
Skipping weave timing
numexpr: 0.0195196151733
Speed-up of numexpr over numpy: 2.00792342118

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.274058103561
Skipping weave timing
numexpr: 0.248371481895
Speed-up of numexpr over numpy: 1.10342017316

Expression: 2*a + arctan2(a, b)
numpy: 0.139664411545
Skipping weave timing
numexpr: 0.121141600609
Speed-up of numexpr over numpy: 1.15290214792

Expression: a**2 + (b+1)**-2.5
numpy: 0.331180119514
Skipping weave timing
numexpr: 0.0499030828476
Speed-up of numexpr over numpy: 6.63646613829

Expression: (a+1)**50
numpy: 0.282083797455
Skipping weave timing
numexpr: 0.0398853063583
Speed-up of numexpr over numpy: 7.07237384416

Expression: sqrt(a**2 + b**2)
numpy: 0.0711817026138
Skipping weave timing
numexpr: 0.0363766908646
Speed-up of numexpr over numpy: 1.95679433511

Average = 3.10311202938
Expression: b*c+d*e
numpy: 0.0431445121765
Skipping weave timing
numexpr: 0.0230684041977
Speed-up of numexpr over numpy: 1.87028594639

Expression: 2*a+3*b
numpy: 0.0386809110641
Skipping weave timing
numexpr: 0.0188805103302
Speed-up of numexpr over numpy: 2.04872169172

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.275234413147
Skipping weave timing
numexpr: 0.247427392006
Speed-up of numexpr over numpy: 1.11238457034

Expression: 2*a + arctan2(a, b)
numpy: 0.138790893555
Skipping weave timing
numexpr: 0.120497584343
Speed-up of numexpr over numpy: 1.15181473813

Expression: a**2 + (b+1)**-2.5
numpy: 0.330480790138
Skipping weave timing
numexpr: 0.0492552995682
Speed-up of numexpr over numpy: 6.70954786664

Expression: (a+1)**50
numpy: 0.282364106178
Skipping weave timing
numexpr: 0.0327146053314
Speed-up of numexpr over numpy: 8.63113289363

Expression: sqrt(a**2 + b**2)
numpy: 0.0695419073105
Skipping weave timing
numexpr: 0.0363955020905
Speed-up of numexpr over numpy: 1.91072806573

Average = 3.34780225322
Expression: b*c+d*e
numpy: 0.04261469841
Skipping weave timing
numexpr: 0.0229945898056
Speed-up of numexpr over numpy: 1.85324890639

Expression: 2*a+3*b
numpy: 0.0387926101685
Skipping weave timing
numexpr: 0.0188351154327
Speed-up of numexpr over numpy: 2.05958972256

Expression: 2*a + (cos(3)+5)*sinh(cos(b))
numpy: 0.275676703453
Skipping weave timing
numexpr: 0.24797129631
Speed-up of numexpr over numpy: 1.11172828289

Expression: 2*a + arctan2(a, b)
numpy: 0.139141917229
Skipping weave timing
numexpr: 0.121482086182
Speed-up of numexpr over numpy: 1.14536983684

Expression: a**2 + (b+1)**-2.5
numpy: 0.330592417717
Skipping weave timing
numexpr: 0.04945499897
Speed-up of numexpr over numpy: 6.68471185122

Expression: (a+1)**50
numpy: 0.281901097298
Skipping weave timing
numexpr: 0.0324407100677
Speed-up of numexpr over numpy: 8.68973264484

Expression: sqrt(a**2 + b**2)
numpy: 0.0694071054459
Skipping weave timing
numexpr: 0.0360074043274
Speed-up of numexpr over numpy: 1.92757869506

Average = 3.35313713426
Averages: 3.10, 3.35, 3.35

