On 05/28/2015 02:17 AM, Parasa, Srinivas Vamsi wrote:
Hi All,
This is Vamsi from the Server Scripting Languages Optimization team at Intel Corporation.
We would like to submit a request to enable computed-goto-based dispatch in Python 2.x (it is already enabled by default in Python 3, given its performance benefits on a wide range of workloads). We discussed this patch with Guido, and he encouraged us to submit a request on Python-dev (email conversation with Guido shown at the bottom of this email).
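For readers unfamiliar with the technique, here is a minimal sketch of switch-based versus computed-goto dispatch. It is not taken from the patch: the three toy opcodes and the function names run_switch and run_goto are hypothetical, and the second function relies on the GCC/Clang "labels as values" extension, so it is not portable to compilers that lack it.

```c
/* Hypothetical toy opcodes for illustration; not CPython's real bytecode. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Baseline: a central switch. Every opcode funnels back through the
   same indirect branch at the top of the loop. */
int run_switch(const unsigned char *code)
{
    int acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;   /* unknown opcode */
        }
    }
}

/* Computed goto: each handler ends with its own indirect jump, which
   gives the branch predictor one jump site per opcode.
   Requires the GCC/Clang extension (&&label and goto *ptr).
   This sketch does no bounds check: opcodes must be valid. */
int run_goto(const unsigned char *code)
{
    /* Jump table indexed by opcode value. */
    static void *const targets[] = { &&do_inc, &&do_double, &&do_halt };
    int acc = 0;

#define DISPATCH() goto *targets[*code++]
    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
```

Both functions compute the same result for a valid program terminated by OP_HALT; only the way control moves from one opcode handler to the next differs.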
Attached is the computed goto patch (along with instructions to run) for Python 2.7.10 (based on the patch submitted by Jeffrey Yasskin at http://bugs.python.org/issue4753). We built and tested this patch for Python 2.7.10 on a Linux machine (Ubuntu 14.04 LTS server, Intel Xeon - Haswell EP CPU with 18 cores, hyper-threading off, turbo off).
Below is a summary of the performance we saw on the "grand unified python benchmarks" suite (available at https://hg.python.org/benchmarks/). We made 3 rigorous runs of the following benchmarks. In each rigorous run, a benchmark is run 100 times with and without the computed goto patch. Below we show the average performance boost for the 3 rigorous runs.
Python 2.7.10 (original) vs Computed Goto performance Benchmark
-1

As Gregory pointed out, there are other options for building the interpreter, and we are missing data on how these compare with your patch. I assume you tested with the Intel compiler, so it would be good to see results for other compilers as well (GCC, clang). Could you please provide the data for LTO and profile-guided optimized builds (and perhaps the two combined)? I'm happy to work with you on setting up these builds, but I currently don't have the machine resources to do so myself.

If the benefits show up for these configurations too, then I'm +/-0 on this patch.

Matthias