Re: [Python-Dev] [ANN] VPython 0.1
Feedback is, of course, very welcome and it'd be great to have some pybench results from different machines.
My results are very similar to Jakob's.

Gentoo Linux, 32-bit x86, Athlon 6400+ underclocked to 3.0 GHz.

make test: 282 tests OK. 5 tests failed:
    test_doctest test_hotshot test_inspect test_subprocess test_trace

-------------------------------------------------------------------------------
PYBENCH 2.0
-------------------------------------------------------------------------------
* using Python 2.5.2 (r252:60911, Oct 22 2008, 13:47:58) [GCC 4.1.2 20070214 ( (gdc 0.24, using dmd 1.020)) (Gentoo 4.1.2 p1.0.2)]
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.time

Calibrating tests. Please wait... done.

Running 10 round(s) of the suite at warp factor 10:

* Round 1 done in 8.474 seconds.
* Round 2 done in 8.389 seconds.
* Round 3 done in 8.438 seconds.
* Round 4 done in 8.411 seconds.
* Round 5 done in 8.484 seconds.
* Round 6 done in 8.471 seconds.
* Round 7 done in 8.492 seconds.
* Round 8 done in 8.549 seconds.
* Round 9 done in 8.429 seconds.
* Round 10 done in 8.542 seconds.

-------------------------------------------------------------------------------
Benchmark: 2008-10-22 20:45:22
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Linux-2.6.26-gentoo-r1-i686-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_6400+-with-glibc2.3
       Processor:    AMD Athlon(tm) 64 X2 Dual Core Processor 6400+

    Python:
       Implementation: n/a
       Executable:   /var/tmp/VPython-2.5.2/python
       Version:      2.5.2
       Compiler:     GCC 4.1.2 20070214 ( (gdc 0.24, using dmd 1.020)) (Gentoo 4.1.2 p1.0.2)
       Bits:         32bit
       Build:        Oct 22 2008 13:47:58 (#r252:60911)
       Unicode:      UCS2

-------------------------------------------------------------------------------
Comparing with: /tmp/vanilla252.pybench
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Linux-2.6.26-gentoo-r1-i686-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_6400+-with-glibc2.3
       Processor:    AMD Athlon(tm) 64 X2 Dual Core Processor 6400+

    Python:
       Implementation: n/a
       Executable:   /usr/local/bin/python2.5
       Version:      2.5.2
       Compiler:     GCC 4.1.2 20070214 ( (gdc 0.24, using dmd 1.020)) (Gentoo 4.1.2 p1.0.2)
       Bits:         32bit
       Build:        Oct 22 2008 20:39:10 (#r252:60911)
       Unicode:      UCS2

Test                          minimum run-time        average run-time
                               this   other    diff    this   other    diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls:         144ms   205ms  -30.1%   162ms   240ms  -32.5%
BuiltinMethodLookup:          164ms   222ms  -26.2%   167ms   236ms  -29.2%
CompareFloats:                 90ms   211ms  -57.5%   103ms   222ms  -53.7%
CompareFloatsIntegers:         88ms   182ms  -51.4%   107ms   200ms  -46.6%
CompareIntegers:               63ms   258ms  -75.5%    84ms   272ms  -69.1%
CompareInternedStrings:        93ms   252ms  -63.0%   103ms   261ms  -60.5%
CompareLongs:                  65ms   180ms  -63.9%    87ms   203ms  -57.1%
CompareStrings:               113ms   211ms  -46.5%   120ms   218ms  -44.9%
CompareUnicode:               187ms   273ms  -31.7%   228ms   290ms  -21.4%
ComplexPythonFunctionCalls:   261ms   330ms  -20.9%   277ms   336ms  -17.5%
ConcatStrings:                204ms   255ms  -20.2%   209ms   297ms  -29.7%
ConcatUnicode:                143ms   118ms  +20.3%   159ms   228ms  -30.0%
CreateInstances:              172ms   112ms  +53.0%   187ms   211ms  -11.5%
CreateNewInstances:           165ms   100ms  +65.0%   171ms   196ms  -12.6%
CreateStringsWithConcat:      141ms   133ms   +5.8%   160ms   256ms  -37.3%
CreateUnicodeWithConcat:      145ms   126ms  +14.8%   167ms   242ms  -30.9%
DictCreation:                 129ms    98ms  +31.6%   131ms   184ms  -28.8%
DictWithFloatKeys:            185ms   143ms  +29.6%   216ms   268ms  -19.6%
DictWithIntegerKeys:          122ms   115ms   +6.0%   126ms   227ms  -44.4%
DictWithStringKeys:            92ms   112ms  -17.6%   104ms   216ms  -51.8%
ForLoops:                      98ms   224ms  -56.2%   117ms   243ms  -52.0%
IfThenElse:                    89ms   221ms  -59.9%    97ms   237ms  -59.1%
ListSlicing:                  123ms   111ms  +10.8%   131ms   141ms   -6.8%
NestedForLoops:               138ms   234ms  -41.1%   153ms   262ms  -41.6%
NormalClassAttribute:         131ms   225ms  -41.5%   139ms   243ms  -42.9%
NormalInstanceAttribute:      121ms   191ms  -36.9%   121ms   210ms  -42.5%
PythonFunctionCalls:          134ms   200ms  -32.6%   144ms   219ms  -34.2%
PythonMethodCalls:            173ms   228ms  -23.9%   185ms   251ms  -26.5%
Recursion:                    177ms   298ms  -40.5%   187ms   316ms  -40.8%
SecondImport:                 135ms   133ms   +1.5%   160ms   147ms   +8.9%
SecondPackageImport:          148ms   141ms   +5.0%   166ms   162ms   +2.7%
SecondSubmoduleImport:        209ms   188ms  +11.4%   221ms   203ms   +8.6%
SimpleComplexArithmetic:      131ms   219ms  -40.0%   139ms   239ms  -41.7%
SimpleDictManipulation:       105ms   210ms  -49.9%   123ms   233ms  -47.1%
SimpleFloatArithmetic:         93ms   224ms  -58.6%   109ms   246ms  -55.8%
SimpleIntFloatArithmetic:      84ms   190ms  -56.0%    89ms   213ms  -58.4%
SimpleIntegerArithmetic:       82ms   191ms  -57.1%    84ms   218ms  -61.5%
SimpleListManipulation:        85ms   188ms  -54.6%    90ms   207ms  -56.7%
SimpleLongArithmetic:         111ms   198ms  -44.0%   134ms   215ms  -37.6%
SmallLists:                   126ms   182ms  -30.7%   143ms   202ms  -28.9%
SmallTuples:                  132ms   193ms  -31.3%   143ms   210ms  -31.7%
SpecialClassAttribute:        110ms   221ms  -50.4%   144ms   241ms  -40.1%
SpecialInstanceAttribute:     146ms   236ms  -38.2%   165ms   258ms  -36.1%
StringMappings:               177ms   209ms  -15.2%   186ms   218ms  -14.5%
StringPredicates:             169ms   219ms  -22.9%   178ms   238ms  -25.0%
StringSlicing:                130ms   206ms  -37.0%   151ms   223ms  -32.4%
TryExcept:                     92ms   230ms  -59.9%    94ms   258ms  -63.5%
TryFinally:                   139ms   183ms  -23.6%   160ms   204ms  -21.8%
TryRaiseExcept:               139ms   147ms   -5.0%   151ms   162ms   -6.7%
TupleSlicing:                 135ms   174ms  -22.0%   151ms   190ms  -20.7%
UnicodeMappings:              222ms   244ms   -8.9%   241ms   257ms   -6.3%
UnicodePredicates:            170ms   214ms  -20.6%   179ms   227ms  -21.2%
UnicodeProperties:            136ms   159ms  -14.9%   154ms   206ms  -25.3%
UnicodeSlicing:               142ms   215ms  -34.1%   171ms   248ms  -31.3%
WithFinally:                  208ms   260ms  -20.1%   212ms   271ms  -21.9%
WithRaiseExcept:              175ms   193ms   -9.0%   186ms   209ms  -11.0%
-------------------------------------------------------------------------------
Totals:                      7682ms 10935ms  -29.8%  8468ms 12832ms  -34.0%

(this=2008-10-22 20:45:22, other=/tmp/vanilla252.pybench)

--
David Ripton dripton@ripton.net
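A note on reading the diff columns: the totals in this report are consistent with diff = this/other - 1, i.e. the run-time of "this" relative to "other". Below is a minimal sketch of that arithmetic. The formula is inferred from the reported numbers rather than taken from the pybench source, and the helper name pybench_diff is made up for illustration.

```python
def pybench_diff(this_ms, other_ms):
    """Run-time of `this` relative to `other`, in percent.

    Assumption: (this / other - 1) * 100 is inferred from the totals in
    the report above; it is not copied from the pybench source.
    """
    return (float(this_ms) / float(other_ms) - 1.0) * 100.0


# Totals (minimum run-time) from the report above.
vpython_ms = 7682    # "this":  /var/tmp/VPython-2.5.2/python
vanilla_ms = 10935   # "other": /usr/local/bin/python2.5

# VPython measured against vanilla: negative means VPython needs less time.
time_saved = pybench_diff(vpython_ms, vanilla_ms)

# The same two runs with the roles swapped: vanilla measured against VPython.
extra_time = pybench_diff(vanilla_ms, vpython_ms)

print("VPython relative to vanilla: %+.1f%%" % time_saved)
print("vanilla relative to VPython: %+.1f%%" % extra_time)
```

From the rounded totals this gives about -29.7% one way (the report shows -29.8%, presumably computed from unrounded timings) and about +42.3% the other way. So the same pair of runs reads as "30% less time" or "42% more time" depending on which build is "this", which is worth checking before comparing percentages from different reports.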
David Ripton wrote:
Feedback is, of course, very welcome and it'd be great to have some pybench results from different machines.
My results are very similar to Jakob's.
From looking through the vmgen manual, there are two things it does that CPython does not:

1. gcc-specific threaded code; the manual claims this doubles speed.
2. caching the top of the stack in a register; the manual claims this increases speed by 0-40%, depending on the system.
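To make the second point concrete, here is a toy sketch in Python. It is not vmgen's generated C and not CPython's eval loop; the opcodes PUSH/ADD/MUL and both function names are invented for illustration. It only shows the bookkeeping: the second interpreter keeps the top of the stack in a local variable and touches the in-memory stack only for the items below it. In a C interpreter that local could live in a machine register, which is where the claimed gain would come from. The first point depends on GCC's labels-as-values extension, which has no direct Python analogue, so it is not sketched.

```python
# Toy bytecode: a list of (opcode, argument) pairs. Names are hypothetical.
PUSH, ADD, MUL = range(3)


def run_plain(code):
    """Every operand goes through the in-memory stack."""
    stack = []
    for op, arg in code:
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
    return stack.pop()


def run_tos_cached(code):
    """Same semantics, but the top of stack is cached in the local `tos`."""
    stack = []   # holds only the items *below* the top of stack
    tos = None   # cached top-of-stack value
    depth = 0    # logical stack depth, including the cached item
    for op, arg in code:
        if op == PUSH:
            if depth:
                stack.append(tos)   # spill the old top to memory
            tos = arg
            depth += 1
        elif op == ADD:
            tos = stack.pop() + tos   # one memory pop instead of two
            depth -= 1
        elif op == MUL:
            tos = stack.pop() * tos
            depth -= 1
    return tos


# (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
print(run_plain(program), run_tos_cached(program))
```

Both interpreters compute the same value for the sample program. The cached version does one stack pop per binary operation instead of two pops and a push, which is the kind of memory traffic the technique is meant to avoid.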
Here's another data point. My results are similar to Skip's (unsurprising since I'm also using a mac). My wild guess is that the 30% vs 10% improvement is an AMD vs. Intel thing? It's not 32-bit vs. 64-bit since both David and Jakob got a 30% speedup, but David had a 32-bit build while Jakob had a 64-bit build.

There's also a crashing bug on:

    def f():
        a+=1
    f()

I have a fix by changing the load_fast opcode to adjust the stack on error, but it requires removing all the superinstructions involving load_fast, which costs me 1-2% in performance. The fix is not included in these numbers.

-------------------------------------------------------------------------------
PYBENCH 2.0
-------------------------------------------------------------------------------
* using CPython 2.6+ (unknown, Nov 19 2008, 09:14:51) [GCC 4.0.1 (Apple Inc. build 5484)]
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.time

-------------------------------------------------------------------------------
Benchmark: /Users/jyasskin/src/python/bzr/2.6_cxx/build_c4.0/pybench.out
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Darwin-9.5.0-i386-32bit
       Processor:    i386

    Python:
       Implementation: CPython
       Executable:   /Users/jyasskin/src/python/bzr/2.6_cxx/build_c4.0/python.exe
       Version:      2.6.0
       Compiler:     GCC 4.0.1 (Apple Inc. build 5484)
       Bits:         32bit
       Build:        Nov 19 2008 09:14:51 (#unknown)
       Unicode:      UCS2

-------------------------------------------------------------------------------
Comparing with: /Users/jyasskin/src/python/bzr/2.6_vmgen/build/pybench.out
-------------------------------------------------------------------------------

    Rounds: 10
    Warp:   10
    Timer:  time.time

    Machine Details:
       Platform ID:  Darwin-9.5.0-i386-32bit
       Processor:    i386

    Python:
       Implementation: CPython
       Executable:   /Users/jyasskin/src/python/bzr/2.6_vmgen/build/python.exe
       Version:      2.6.0
       Compiler:     GCC 4.0.1 (Apple Inc. build 5488)
       Bits:         32bit
       Build:        Nov 24 2008 20:20:04 (#unknown)
       Unicode:      UCS2

Test                          minimum run-time        average run-time
                               this   other    diff    this   other    diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls:         131ms   118ms  +10.9%   134ms   120ms  +11.3%
BuiltinMethodLookup:          109ms    90ms  +20.9%   111ms    96ms  +15.7%
CompareFloats:                 91ms    65ms  +40.4%    92ms    66ms  +39.2%
CompareFloatsIntegers:         99ms    85ms  +16.5%    99ms    85ms  +16.4%
CompareIntegers:               83ms    49ms  +67.3%    83ms    50ms  +67.2%
CompareInternedStrings:        93ms    72ms  +30.3%    95ms    73ms  +29.3%
CompareLongs:                  84ms    62ms  +36.6%    86ms    63ms  +37.3%
CompareStrings:                82ms    68ms  +20.2%    84ms    71ms  +17.7%
CompareUnicode:               104ms    89ms  +17.5%   109ms    94ms  +15.1%
ComplexPythonFunctionCalls:   139ms   126ms  +11.1%   142ms   127ms  +11.4%
ConcatStrings:                149ms   138ms   +8.0%   154ms   148ms   +3.8%
ConcatUnicode:                 88ms    84ms   +4.7%    90ms    85ms   +5.8%
CreateInstances:              142ms   130ms   +9.5%   143ms   131ms   +9.0%
CreateNewInstances:           106ms    99ms   +7.4%   107ms    99ms   +7.6%
CreateStringsWithConcat:      116ms    94ms  +23.3%   118ms    95ms  +25.0%
CreateUnicodeWithConcat:       91ms    83ms  +10.3%    92ms    84ms   +9.6%
DictCreation:                  92ms    80ms  +14.8%    93ms    81ms  +14.8%
DictWithFloatKeys:             95ms    90ms   +5.2%    98ms    91ms   +6.7%
DictWithIntegerKeys:           99ms    91ms   +9.1%   104ms    92ms  +13.8%
DictWithStringKeys:            83ms    73ms  +13.8%    87ms    76ms  +14.9%
ForLoops:                      77ms    62ms  +23.2%    79ms    63ms  +24.5%
IfThenElse:                    78ms    55ms  +41.6%    79ms    56ms  +42.7%
ListSlicing:                  115ms   185ms  -37.7%   120ms   187ms  -36.1%
NestedForLoops:               135ms   100ms  +35.0%   136ms   102ms  +33.8%
NormalClassAttribute:         105ms    98ms   +6.9%   106ms    99ms   +6.8%
NormalInstanceAttribute:       93ms    84ms  +11.2%    94ms    85ms  +10.8%
PythonFunctionCalls:          102ms    90ms  +13.5%   105ms    93ms  +13.4%
PythonMethodCalls:            147ms   133ms  +10.5%   148ms   135ms   +9.7%
Recursion:                    142ms   118ms  +20.2%   147ms   119ms  +22.9%
SecondImport:                  99ms    98ms   +1.3%   100ms   100ms   +0.1%
SecondPackageImport:          102ms   101ms   +1.2%   104ms   102ms   +1.8%
SecondSubmoduleImport:        133ms   133ms   +0.4%   135ms   134ms   +1.0%
SimpleComplexArithmetic:      100ms    93ms   +7.3%   101ms    94ms   +7.8%
SimpleDictManipulation:       110ms    93ms  +18.3%   111ms    94ms  +18.3%
SimpleFloatArithmetic:         92ms    76ms  +19.9%    94ms    82ms  +15.5%
SimpleIntFloatArithmetic:      73ms    62ms  +16.8%    73ms    63ms  +16.4%
SimpleIntegerArithmetic:       73ms    64ms  +13.5%    74ms    65ms  +13.0%
SimpleListManipulation:        79ms    67ms  +18.6%    80ms    69ms  +15.6%
SimpleLongArithmetic:         111ms    98ms  +13.3%   112ms    99ms  +13.3%
SmallLists:                   126ms   112ms  +12.9%   129ms   114ms  +12.7%
SmallTuples:                  123ms   104ms  +18.5%   125ms   105ms  +18.7%
SpecialClassAttribute:        101ms    95ms   +6.5%   102ms    97ms   +5.2%
SpecialInstanceAttribute:     173ms   154ms  +12.8%   175ms   158ms  +10.7%
StringMappings:               165ms   163ms   +1.1%   166ms   164ms   +1.3%
StringPredicates:             126ms   121ms   +4.3%   130ms   124ms   +5.0%
StringSlicing:                125ms   107ms  +17.1%   130ms   111ms  +16.4%
TryExcept:                     83ms    57ms  +44.6%    84ms    58ms  +45.3%
TryFinally:                   102ms   104ms   -1.8%   107ms   105ms   +2.2%
TryRaiseExcept:                98ms    95ms   +2.9%    99ms    97ms   +2.7%
TupleSlicing:                 124ms   141ms  -12.5%   138ms   144ms   -4.4%
UnicodeMappings:              142ms   142ms   -0.2%   143ms   143ms   +0.1%
UnicodePredicates:            107ms   100ms   +7.4%   108ms   101ms   +7.2%
UnicodeProperties:            109ms   101ms   +7.8%   111ms   102ms   +8.3%
UnicodeSlicing:               107ms    84ms  +27.8%   111ms    89ms  +24.4%
WithFinally:                  156ms   151ms   +3.4%   157ms   151ms   +3.9%
WithRaiseExcept:              124ms   120ms   +3.1%   125ms   121ms   +3.0%
-------------------------------------------------------------------------------
Totals:                      6137ms  5548ms  +10.6%  6258ms  5653ms  +10.7%
(this=/Users/jyasskin/src/python/bzr/2.6_cxx/build_c4.0/pybench.out, other=/Users/jyasskin/src/python/bzr/2.6_vmgen/build/pybench.out)
-- Namasté, Jeffrey Yasskin http://jeffrey.yasskin.info/
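For reference, the snippet from the crash report above is expected to raise UnboundLocalError rather than crash: a is assigned inside f, so the compiler treats it as a local variable, and the augmented assignment reads it before it has been bound. A minimal check of that expected behaviour on a stock interpreter follows; the variable name outcome is only there for illustration.

```python
def f():
    # 'a' is assigned in this function, so it is compiled as a local;
    # the read half of 'a += 1' then finds it unbound.
    a += 1


try:
    f()
except UnboundLocalError as exc:
    outcome = "UnboundLocalError"
    print("raised as expected: %s" % exc)
else:
    outcome = "no error"
    print("unexpected: f() returned normally")
```

A build with the fix described above would presumably behave the same way on this input, though that was not verified here.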
participants (3): David Ripton, Jeffrey Yasskin, Terry Reedy