[Python-Dev] Profile Guided Optimization active by-default

Brett Cannon brett at python.org
Sat Aug 22 20:00:08 CEST 2015


I just realized I didn't see anyone say it, but please upload the patches
to bugs.python.org for easier tracking and reviewing.

On Sat, Aug 22, 2015, 08:01 Patrascu, Alecsandru <
alecsandru.patrascu at intel.com> wrote:

> Hi All,
>
> This is Alecsandru from the Server Scripting Languages Optimization team
> at Intel Corporation.
>
> I would like to submit a request to turn on Profile Guided Optimization
> (PGO) as the default build option for Python (both 2.7 and 3.6), given its
> performance benefits on a wide variety of workloads and hardware.  For
> instance, as the attached sample results from the Grand Unified Python
> Benchmark show, we observed speedups of more than 20% on some benchmarks.
> In addition, we are seeing a 2-9% performance boost on OpenStack/Swift,
> where more than 60% of the code is in Python 2.7.  Our analysis indicates
> that the gain comes mainly from fewer icache misses and CPU front-end
> stalls.
>
> Attached are the Makefile patches, which modify the "all" build target and
> add a new target called "disable-profile-opt".  We built and tested the
> patches for Python 2.7 and 3.6 on our Linux machines (CentOS 7/Ubuntu
> Server 14.04, Intel Xeon Haswell/Broadwell with 18/8 cores).  We use the
> "regrtest" suite for training, as it provides the best performance
> improvement.  Some of the test programs in the suite may fail, which makes
> the build fail.  One solution is to disable the specific failing test using
> the "-x" flag (as shown in the patch).
>
> Steps to apply the patch:
> 1.  hg clone https://hg.python.org/cpython cpython
> 2.  cd cpython
> 3.  hg update 2.7 (needed for 2.7 only)
> 4.  Copy *.patch to the current directory
> 5.  patch < python2.7-pgo.patch (or patch < python3.6-pgo.patch)
> 6.  ./configure
> 7.  make
>
> To disable PGO:
> 7b. make disable-profile-opt
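
The numbered steps above can be sketched as a small Python driver. This is only an illustration: the function names `build_commands` and `run`, and the `patch_dir` parameter (where the downloaded *.patch files are assumed to live), are hypothetical and not part of the patches.

```python
import shlex
import subprocess


def build_commands(version="3.6", use_pgo=True, patch_dir=".."):
    """Return the shell commands from the numbered steps, in order.

    patch_dir is a hypothetical parameter: the directory (relative to the
    fresh clone) that holds the downloaded *.patch files.
    """
    if version not in ("2.7", "3.6"):
        raise ValueError("the patches target 2.7 and 3.6 only")
    cmds = [
        "hg clone https://hg.python.org/cpython cpython",  # step 1
        "cd cpython",                                      # step 2
    ]
    if version == "2.7":
        cmds.append("hg update 2.7")                       # step 3 (2.7 only)
    cmds.append("cp {}/*.patch .".format(shlex.quote(patch_dir)))  # step 4
    cmds.append("patch < python{}-pgo.patch".format(version))      # step 5
    cmds.append("./configure")                                     # step 6
    # step 7 builds with PGO; step 7b is the opt-out target from the patch
    cmds.append("make" if use_pgo else "make disable-profile-opt")
    return cmds


def run(cmds):
    """Run all commands in one shell so that 'cd cpython' stays in effect."""
    subprocess.run(" && ".join(cmds), shell=True, check=True)
```

Calling `run(build_commands("2.7"))` would perform the full 2.7 build, while `build_commands("3.6", use_pgo=False)` ends with `make disable-profile-opt` instead of `make`.
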
>
> Below are our sample performance results from our latest Xeon machine, a
> Xeon Broadwell EP.
> Hardware (HW):      Intel XEON (Broadwell) 8 Cores
>
> BIOS settings:      Intel Turbo Boost Technology: false
>                     Hyper-Threading: false
>
> Operating System:   Ubuntu 14.04.3 LTS trusty
>
> OS configuration:   CPU frequency fixed at 2.6 GHz by
>                         echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
>                         echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
>                     Address Space Layout Randomization (ASLR) disabled (to reduce run-to-run variation) by
>                         echo 0 > /proc/sys/kernel/randomize_va_space
>
> GCC version:        gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)
>
> Benchmark:          Grand Unified Python Benchmark (GUPB)
>                     GUPB Source: https://hg.python.org/benchmarks/
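
The frequency-pinning commands in the OS configuration above use a `cpu*` wildcard as shorthand for every core. Typed literally, a single redirect to a pattern that matches several files is rejected by bash as an ambiguous redirect, so each file has to be written separately. A minimal Python sketch of that loop follows. The function name `pin_cpu_frequency` and the `sysfs_root` parameter are hypothetical, and the latter exists only so the code can be tried against a scratch directory. On a real machine it needs root.

```python
import glob
import os


def pin_cpu_frequency(khz, sysfs_root="/sys/devices/system/cpu"):
    """Write khz to scaling_min_freq and scaling_max_freq of every CPU.

    Returns the list of files written, min-frequency files first, in the
    same order as the two echo commands in the email.
    """
    written = []
    for knob in ("scaling_min_freq", "scaling_max_freq"):
        pattern = os.path.join(sysfs_root, "cpu*", "cpufreq", knob)
        for path in sorted(glob.glob(pattern)):
            with open(path, "w") as f:
                f.write("{}\n".format(khz))  # echo also appends a newline
            written.append(path)
    return written
```

With the values from the email, the call would be `pin_cpu_frequency(2600000)`.
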
>
> Python2.7 results:
>     Python source: hg clone https://hg.python.org/cpython cpython
>     Python source: hg update 2.7
>     hg id: 0511b1165bb6 (2.7)
>     hg id -r 'ancestors(.) and tag()': 15c95b7d81dc (2.7) v2.7.10
>     hg --debug id -i: 0511b1165bb6cf40ada0768a7efc7ba89316f6a5
>
>         Benchmark           Speedup(%)
>         simple_logging      20
>         raytrace            20
>         silent_logging      19
>         richards            19
>         chaos               16
>         formatted_logging   16
>         json_dump           15
>         hexiom2             13
>         pidigits            12
>         slowunpickle        12
>         django_v2           12
>         unpack_sequence     11
>         float               11
>         mako                11
>         slowpickle          11
>         fastpickle          11
>         django              11
>         go                  10
>         json_dump_v2        10
>         pathlib             10
>         regex_compile       10
>         pybench             9.9
>         etree_process       9
>         regex_v8            8
>         bzr_startup         8
>         2to3                8
>         slowspitfire        8
>         telco               8
>         pickle_list         8
>         fannkuch            8
>         etree_iterparse     8
>         nqueens             8
>         mako_v2             8
>         etree_generate      8
>         call_method_slots   7
>         html5lib_warmup     7
>         html5lib            7
>         nbody               7
>         spectral_norm       7
>         spambayes           7
>         fastunpickle        6
>         meteor_contest      6
>         chameleon           6
>         rietveld            6
>         tornado_http        5
>         unpickle_list       5
>         pickle_dict         4
>         regex_effbot        3
>         normal_startup      3
>         startup_nosite      3
>         etree_parse         2
>         call_method_unknown 2
>         call_simple         1
>         json_load           1
>         call_method         1
>
> Python3.6 results:
>     Python source: hg clone https://hg.python.org/cpython cpython
>     hg id: 96d016f78726 tip
>     hg id -r 'ancestors(.) and tag()': 1a58b1227501 (3.5) v3.5.0rc1
>     hg --debug id -i: 96d016f78726afbf66d396f084b291ea43792af1
>
>
>         Benchmark           Speedup(%)
>         fastunpickle        22.94
>         fastpickle          21.67
>         json_load           17.64
>         simple_logging      17.49
>         meteor_contest      16.67
>         formatted_logging   15.33
>         etree_process       14.61
>         raytrace            13.57
>         etree_generate      13.56
>         chaos               12.09
>         hexiom2             12
>         nbody               11.88
>         json_dump_v2        11.24
>         richards            11.02
>         nqueens             10.96
>         fannkuch            10.79
>         go                  10.77
>         float               10.26
>         regex_compile       9.8
>         silent_logging      9.63
>         pidigits            9.58
>         etree_iterparse     9.48
>         2to3                8.44
>         regex_v8            8.09
>         regex_effbot        7.88
>         call_simple         7.63
>         tornado_http        7.38
>         etree_parse         4.92
>         spectral_norm       4.72
>         normal_startup      4.39
>         telco               3.88
>         startup_nosite      3.7
>         call_method         3.63
>         unpack_sequence     3.6
>         call_method_slots   2.91
>         call_method_unknown 2.59
>         iterative_count     0.45
>         threaded_count      -2.79
>
>
> Thank you,
> Alecsandru