[Python-Dev] Joys of Optimization
Delaney, Timothy C (Timothy)
tdelaney at avaya.com
Sun Mar 14 18:07:11 EST 2004
I did a bit of benchmarking of various versions on my FreeBSD box ... I may have got a bit carried away ;)
A few interesting things to note:
1. [i for i in xrange(1000)] in Python 2.4 is approaching the speed of the same construct under Python 2.3 with psyco - 1.46 msec/loop compared to 1.03 msec/loop. This is very impressive.
2. Python 2.4 gains no benefit from psyco for this construct - presumably because psyco does not yet recognise the LIST_APPEND opcode.
3. However, even including the above, psyco improves 2.4 more than it improves 2.3 (on pystone, roughly 4.07x versus 3.84x) - so once the above has been fixed, the improvement should be even greater.
4. 2.4 gives the best pystone results.
5. 2.4 gives the best parrotbench results (I forgot to send the parrotbench results to work, so they are not included here ...).
6. There were certain tests in parrotbench that were slightly slower under 2.4 than 2.3 - this should probably be investigated. However, b3.py (IIRC) had a significant performance improvement - 2.3 was ~17 seconds, 2.4 was ~12 seconds.
Hardware/OS:
Pentium II 266MHz (128K cache I think - might be 256K).
FreeBSD sasami.mshome.net 5.2.1-RELEASE FreeBSD 5.2.1-RELEASE #0: Mon Feb 23 20:45:55 GMT 2004 root@wv1u.btc.adaptec.com:/usr/obj/usr/src/sys/GENERIC i386
Python versions:
2.0.1 (#2, Dec 5 2003, 03:07:29)
[GCC 3.3.3 [FreeBSD] 20031106]
2.1.3 (#1, Dec 5 2003, 03:03:53)
[GCC 3.3.3 [FreeBSD] 20031106]
2.2.3 (#1, Dec 5 2003, 03:06:39)
[GCC 3.3.3 [FreeBSD] 20031106]
2.3.3 (#2, Mar 14 2004, 09:28:06)
[GCC 3.3.3 [FreeBSD] 20031106]
2.4.a0.20040311 (#2, Mar 13 2004, 19:38:05)
[GCC 3.3.3 [FreeBSD] 20031106]
2.3 and 2.4 were built from source; the others were installed as binary packages.
# Basic list comprehension test
/usr/home/Tim> python2.0 /usr/local/lib/python2.4/timeit.py -n 1000 "[i for i in xrange(1000)]"
1000 loops, best of 3: 5.74 msec per loop
/usr/home/Tim> python2.1 /usr/local/lib/python2.4/timeit.py -n 1000 "[i for i in xrange(1000)]"
1000 loops, best of 3: 7.16 msec per loop
/usr/home/Tim> python2.2 /usr/local/lib/python2.4/timeit.py -n 1000 "[i for i in xrange(1000)]"
1000 loops, best of 3: 3.75 msec per loop
/usr/home/Tim> python2.3 /usr/local/lib/python2.4/timeit.py -n 1000 "[i for i in xrange(1000)]"
1000 loops, best of 3: 2.41 msec per loop
/usr/home/Tim> python2.4 /usr/local/lib/python2.4/timeit.py -n 1000 "[i for i in xrange(1000)]"
1000 loops, best of 3: 1.44 msec per loop
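For anyone wanting to repeat this from inside Python rather than from the shell, here is a minimal sketch of what the command line above amounts to. It assumes a current Python 3, so range() stands in for xrange(); the helper name best_msec_per_loop is made up for illustration and is not part of timeit.

```python
import timeit

# Assumption: Python 3, where range() replaces the xrange() used in the post.
STMT = "[i for i in range(1000)]"


def best_msec_per_loop(stmt, setup="pass", number=1000, repeat=3):
    """Return the best-of-`repeat` time per loop, in milliseconds.

    Mirrors `timeit.py -n NUMBER` with its "best of 3" reporting:
    each repeat runs `stmt` `number` times and the fastest repeat wins.
    """
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(totals) / number * 1000.0


if __name__ == "__main__":
    msec = best_msec_per_loop(STMT)
    print("%d loops, best of %d: %.3g msec per loop" % (1000, 3, msec))
```

The absolute numbers will of course be nothing like the ones above on modern hardware; only the method carries over.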
# List comprehension test where listcomp is in a function.
/usr/home/Tim> python2.0 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" "main()"
1000 loops, best of 3: 5.57 msec per loop
/usr/home/Tim> python2.1 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" "main()"
1000 loops, best of 3: 6.84 msec per loop
/usr/home/Tim> python2.2 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" "main()"
1000 loops, best of 3: 3.88 msec per loop
/usr/home/Tim> python2.3 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" "main()"
1000 loops, best of 3: 2.35 msec per loop
/usr/home/Tim> python2.4 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" "main()"
1000 loops, best of 3: 1.46 msec per loop
# Listcomp in function + psyco.bind.
/usr/home/Tim> python2.2 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" -s "import psyco;psyco.bind(main)" "main()"
1000 loops, best of 3: 1.05 msec per loop
/usr/home/Tim> python2.3 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" -s "import psyco;psyco.bind(main)" "main()"
1000 loops, best of 3: 1.03 msec per loop
/usr/home/Tim> python2.4 /usr/local/lib/python2.4/timeit.py -n 1000 -s "def main():[i for i in xrange(1000)]" -s "import psyco;psyco.bind(main)" "main()"
1000 loops, best of 3: 1.49 msec per loop
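To put the psyco numbers next to the plain ones, the ratios can be worked out directly from the timings reported above. This is just arithmetic on the posted figures (listcomp inside main()); the dictionary and function names are illustrative.

```python
# msec per loop, copied from the timeit runs above (listcomp inside main()).
plain = {"2.2": 3.88, "2.3": 2.35, "2.4": 1.46}
with_psyco = {"2.2": 1.05, "2.3": 1.03, "2.4": 1.49}


def speedup(before, after):
    """How many times faster `after` is than `before` (below 1.0 means slower)."""
    return before / after


for version in sorted(with_psyco):
    ratio = speedup(plain[version], with_psyco[version])
    print("Python %s: psyco.bind speedup %.2fx" % (version, ratio))

print("2.3 -> 2.4 without psyco: %.2fx" % speedup(plain["2.3"], plain["2.4"]))
```

A ratio just under 1.0 for 2.4 is what "gains no benefit from psyco" looks like in these terms.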
# Proof that psyco was working with 2.4 ...
/usr/home/Tim> python2.4 /usr/local/lib/python2.4/timeit.py -n 10 -s "a='1'" "for i in xrange(1000):a+='1'"
10 loops, best of 3: 14.4 msec per loop
/usr/home/Tim> python2.4 /usr/local/lib/python2.4/timeit.py -n 10 -s "import psyco;psyco.full()" "a='1'" "for i in xrange(1000):a+='1'"
10 loops, best of 3: 413 usec per loop
# Pystone
/usr/home/Tim> python2.1 -OO /usr/local/lib/python2.4/test/pystone.py
Pystone(1.1) time for 50000 passes = 17.6797
This machine benchmarks at 2828.1 pystones/second
/usr/home/Tim> python2.2 -OO /usr/local/lib/python2.4/test/pystone.py
Pystone(1.1) time for 50000 passes = 18.7422
This machine benchmarks at 2667.78 pystones/second
/usr/home/Tim> python2.3 -OO /usr/local/lib/python2.4/test/pystone.py
Pystone(1.1) time for 50000 passes = 13.2656
This machine benchmarks at 3769.14 pystones/second
/usr/home/Tim> python2.4 -OO /usr/local/lib/python2.4/test/pystone.py
Pystone(1.1) time for 50000 passes = 12.6875
This machine benchmarks at 3940.89 pystones/second
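The pystones/second figure is simply the number of passes divided by the elapsed time, which makes the runs above easy to cross-check. A small sketch using the posted timings (the names are illustrative):

```python
PASSES = 50000

# Seconds for 50000 passes, copied from the pystone runs above.
elapsed = {"2.1": 17.6797, "2.2": 18.7422, "2.3": 13.2656, "2.4": 12.6875}


def pystones_per_second(passes, seconds):
    """Benchmark rate as pystone reports it: passes divided by elapsed time."""
    return passes / seconds


for version in sorted(elapsed):
    rate = pystones_per_second(PASSES, elapsed[version])
    print("Python %s: %.6g pystones/second" % (version, rate))

# Relative gain of 2.4 over 2.3 on this benchmark.
gain = elapsed["2.3"] / elapsed["2.4"]
print("2.4 vs 2.3: %.1f%% faster" % ((gain - 1.0) * 100.0))
```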
# Pystone - regular vs. psyco.
/usr/home/Tim> python2.1 -OO /home/Tim/psyco-1.1.1/test/pystone.py
Pystone(1.1)                     time        loops per second
regular Python for 20000 passes  7.16406     2791.71
Psyco for 10000 passes           0.867188    11531.5
Psyco for 10000 more passes      0.9375      10666.7
Total for 20000 passes           1.80469     11082.3
Separated compilation/execution timings for 20000 passes
Compilation (i.e. start-up)      -0.0703125  -14.2222
Machine code execution           1.875       10666.7
Relative execution frequencies (iterations per second)
iterations  Psyco       Python   Psyco is ... times faster
1           -14.2412    2791.71  -0.01
10          -144.144    2791.71  -0.05
100         -1641.03    2791.71  -0.59
1000        42666.7     2791.71  15.28
10000       11531.5     2791.71  4.13
100000      10747.3     2791.71  3.85
1000000     10674.7     2791.71  3.82
10000000    10667.5     2791.71  3.82
Cut-off point: -265.9 iterations
/usr/home/Tim> python2.2 -OO /home/Tim/psyco-1.1.1/test/pystone.py
Pystone(1.1)                     time        loops per second
regular Python for 20000 passes  7.51562     2661.12
Psyco for 10000 passes           0.78125     12800
Psyco for 10000 more passes      0.8125      12307.7
Total for 20000 passes           1.59375     12549
Separated compilation/execution timings for 20000 passes
Compilation (i.e. start-up)      -0.03125    -32
Machine code execution           1.625       12307.7
Relative execution frequencies (iterations per second)
iterations  Psyco       Python   Psyco is ... times faster
1           -32.0834    2661.12  -0.01
10          -328.542    2661.12  -0.12
100         -4324.32    2661.12  -1.62
1000        20000       2661.12  7.52
10000       12800       2661.12  4.81
100000      12355.2     2661.12  4.64
1000000     12312.4     2661.12  4.63
10000000    12308.2     2661.12  4.63
Cut-off point: -106.1 iterations
/usr/home/Tim> python2.3 -OO /home/Tim/psyco-1.1.1/test/pystone.py
Pystone(1.1)                     time        loops per second
regular Python for 20000 passes  5.33594     3748.17
Psyco for 10000 passes           0.679688    14712.6
Psyco for 10000 more passes      0.695312    14382
Total for 20000 passes           1.375       14545.5
Separated compilation/execution timings for 20000 passes
Compilation (i.e. start-up)      -0.015625   -64
Machine code execution           1.39062     14382
Relative execution frequencies (iterations per second)
iterations  Psyco       Python   Psyco is ... times faster
1           -64.2861    3748.17  -0.02
10          -669.806    3748.17  -0.18
100         -11531.5    3748.17  -3.08
1000        18550.7     3748.17  4.95
10000       14712.6     3748.17  3.93
100000      14414.4     3748.17  3.85
1000000     14385.3     3748.17  3.84
10000000    14382.3     3748.17  3.84
Cut-off point: -79.2 iterations
/usr/home/Tim> python2.4 -OO /home/Tim/psyco-1.1.1/test/pystone.py
Pystone(1.1)                     time        loops per second
regular Python for 20000 passes  5.08594     3932.41
Psyco for 10000 passes           0.609375    16410.3
Psyco for 10000 more passes      0.625       16000
Total for 20000 passes           1.23438     16202.5
Separated compilation/execution timings for 20000 passes
Compilation (i.e. start-up)      -0.015625   -64
Machine code execution           1.25        16000
Relative execution frequencies (iterations per second)
iterations  Psyco       Python   Psyco is ... times faster
1           -64.257     3932.41  -0.02
10          -666.667    3932.41  -0.17
100         -10666.7    3932.41  -2.71
1000        21333.3     3932.41  5.42
10000       16410.3     3932.41  4.17
100000      16040.1     3932.41  4.08
1000000     16004       3932.41  4.07
10000000    16000.4     3932.41  4.07
Cut-off point: -81.5 iterations
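The derived rows in these tables can be reproduced from the three raw timings alone. The formulas in the sketch below are inferred from the printed figures, not taken from psyco's pystone.py, so treat them as an assumption that happens to match the output; the function and variable names are invented for illustration.

```python
def psyco_breakdown(first_half, second_half, python_total, passes=20000):
    """Split the Psyco timings into start-up cost and steady-state speed.

    first_half / second_half: seconds for the first and second `passes // 2`
    Psyco passes; python_total: seconds for `passes` regular Python passes.
    """
    half = passes // 2
    compile_time = first_half - second_half      # "Compilation (i.e. start-up)"
    per_pass_psyco = second_half / half          # steady-state seconds per pass
    per_pass_python = python_total / passes
    exec_time = per_pass_psyco * passes          # "Machine code execution"

    def psyco_rate(n):
        # Passes per second for a run of n passes, start-up cost included.
        return n / (compile_time + n * per_pass_psyco)

    # Run length at which Psyco and regular Python take the same total time.
    cutoff = compile_time / (per_pass_python - per_pass_psyco)
    return compile_time, exec_time, psyco_rate, cutoff


# Raw timings from the python2.4 run above.
compile_time, exec_time, psyco_rate, cutoff = psyco_breakdown(
    0.609375, 0.625, 5.08594)
python_rate = 20000 / 5.08594

for n in (1, 1000, 10000000):
    ratio = psyco_rate(n) / python_rate
    print("%-10d %-10.6g %-10.6g %.2f" % (n, psyco_rate(n), python_rate, ratio))
print("Cut-off point: %.1f iterations" % cutoff)
```

Under this reading, the negative compilation times and cut-off points in all four runs come from the second 10000 passes taking slightly longer than the first, so the small-iteration rows are probably best ignored; the large-iteration rows are the meaningful ones.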
Tim Delaney