[Python-Dev] Profile Guided Optimization active by-default
Brett Cannon
brett at python.org
Sat Aug 22 19:50:18 CEST 2015
On Sat, Aug 22, 2015, 09:58 Patrascu, Alecsandru <
alecsandru.patrascu at intel.com> wrote:
This target replaces the existing one in the CPython Makefile, which now
uses a quick run of pybench and the obtained binary does not perform well
on general Python loads. I don't think it is a good idea to add a by-default
target that does PGO on dedicated workloads, like Django, because then it
will perform better on that particular load and poorly on others.
Sorry for not being clearer, but I was not suggesting that the default be
for Django, just asking whether it is worth making the Makefile easier to
work with when generating a PGO build for a custom workload. If we already
have a rule that uses pybench then it should definitely be changed to use
regrtest (and honestly pybench should not be used for benchmarking anything
since it doesn't reflect real world usage in any way; it's just for quick
checks while doing development on the core of Python and otherwise shouldn't
be used to measure anything substantial).
Of course, if any user has a dedicated workload for which he or she wants to
get the best benefit from PGO, that user will have to run that training
separately from the proposed one. Our proposal targets the broader audience
that uses Python in various scenarios, and they will see an overall
improvement after compiling Python from sources.
Right, but my question was whether there was any benefit to making the
Makefile rules generic to make building PGO binaries easier for people who
do want to do a custom profile, and it sounds like it isn't worth the effort.
So I'm with Guido where I'm happy to see the build rules added/updated to
use regrtest for a PGO build but have it be an opt-in flag and not on by
default (at least for now).
-Brett
Alecsandru
From: Brett Cannon [mailto:brett at python.org]
Sent: Saturday, August 22, 2015 7:25 PM
To: guido at python.org; Patrascu, Alecsandru
Cc: python-dev at python.org
Subject: Re: [Python-Dev] Profile Guided Optimization active by-default
On Sat, Aug 22, 2015, 09:17 Guido van Rossum <guido at python.org> wrote:
How about we first add a new Makefile target that enables PGO, without
turning it on by default? Then later we can enable it by default.
I agree. Updating the Makefile so it's easier to use PGO is great, but we
should do a release with it as opt-in and go from there.
Also, I have my doubts about regrtest. How sure are we that it represents a
typical Python load? Tests are often using a different mix of operations
than production code.
That was also my question. You said that "it provides the best performance
improvement", but compared to what; what else was tried? And what
difference does it make to e.g. a Django app that is trained on their own
simulated workload compared to using regrtest? IOW, is regrtest displaying
the best across-the-board performance because it stresses the largest swath
of Python and thus catches generic patterns in the code, while individuals
could still get better performance with a simulated workload?
-Brett
On Sat, Aug 22, 2015 at 7:46 AM, Patrascu, Alecsandru <
alecsandru.patrascu at intel.com> wrote:
Hi All,
This is Alecsandru from the Server Scripting Languages Optimization team at
Intel Corporation.
I would like to submit a request to turn on Profile Guided Optimization
(PGO) as the default build option for Python (both 2.7 and 3.6), given its
performance benefits on a wide variety of workloads and hardware. For
instance, as shown in the attached sample performance results from the Grand
Unified Python Benchmark, a >20% speedup was observed. In addition, we are
seeing a 2-9% performance boost on OpenStack/Swift, where more than 60% of
the code is in Python 2.7. Our analysis indicates the performance gain
was mainly due to a reduction in icache misses and CPU front-end stalls.
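For readers unfamiliar with the mechanics, a PGO build has three phases:
compile with instrumentation, run a training workload to collect a profile,
then recompile using that profile. The following is only a minimal sketch that
assembles the commands for a generic single-file C program using GCC's
-fprofile-generate and -fprofile-use options. It is not the attached patch,
and the names app.c, app and --train are hypothetical placeholders.

```python
import shlex


def pgo_build_plan(source, binary, training_args):
    """Return the three shell commands of a generic GCC PGO build.

    All arguments are placeholders chosen by the caller; nothing here is
    taken from the CPython Makefile.
    """
    # Phase 1: build a binary instrumented to record execution counts.
    instrumented = ["gcc", "-O2", "-fprofile-generate", source, "-o", binary]
    # Phase 2: run the training workload; the instrumented binary writes
    # profile data files as it exits.
    training = ["./" + binary] + list(training_args)
    # Phase 3: rebuild, letting the compiler use the recorded profile.
    optimized = ["gcc", "-O2", "-fprofile-use", source, "-o", binary]
    return [
        " ".join(shlex.quote(arg) for arg in cmd)
        for cmd in (instrumented, training, optimized)
    ]


if __name__ == "__main__":
    for line in pgo_build_plan("app.c", "app", ["--train"]):
        print(line)
```

The choice of training workload in phase 2 is the part under discussion in
this thread: the profile only reflects the code paths that workload exercises.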
Attached are the Makefile patches that modify the "all" build target and add
a new one called "disable-profile-opt". We built and tested this patch for
Python 2.7 and 3.6 on our Linux machines (CentOS 7/Ubuntu Server 14.04,
Intel Xeon Haswell/Broadwell with 18/8 cores). We use the "regrtest" suite
for training as it provides the best performance improvement. Some of the
test programs in the suite may fail, which causes the build to fail. One
solution is to disable the specific failing tests using the "-x" flag (as
shown in the patch).
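As an illustration of excluding failing tests from the training run, here is
a hedged sketch that assembles a regrtest command line using "-x". It assumes
the freshly built interpreter is ./python and that regrtest is started with
"-m test.regrtest". The excluded test names are hypothetical placeholders and
are not the ones used in the patch.

```python
def regrtest_training_command(python="./python", excluded=()):
    """Build an argument list that runs regrtest as a training workload.

    `python` and `excluded` are assumptions supplied by the caller.
    """
    cmd = [python, "-m", "test.regrtest"]
    if excluded:
        # With -x, the test names that follow are skipped instead of run.
        cmd.append("-x")
        cmd.extend(excluded)
    return cmd


if __name__ == "__main__":
    # Placeholder names standing in for whichever tests fail locally.
    print(" ".join(regrtest_training_command(
        excluded=["test_example_a", "test_example_b"])))
```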
Steps to apply the patch:
1. hg clone https://hg.python.org/cpython cpython
2. cd cpython
3. hg update 2.7 (needed for 2.7 only)
4. Copy *.patch to the current directory
5. patch < python2.7-pgo.patch (or patch < python3.6-pgo.patch)
6. ./configure
7. make
To disable PGO
7b. make disable-profile-opt
In the following, please find our sample performance results from our latest
Xeon machine, a Xeon Broadwell EP.
Hardware (HW): Intel XEON (Broadwell) 8 Cores
BIOS settings: Intel Turbo Boost Technology: false
Hyper-Threading: false
Operating System: Ubuntu 14.04.3 LTS trusty
OS configuration: CPU freq set at fixed: 2.6GHz by
    echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
    echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
Address Space Layout Randomization (ASLR) disabled (to reduce run to run
variation) by
    echo 0 > /proc/sys/kernel/randomize_va_space
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)
Benchmark: Grand Unified Python Benchmark (GUPB)
GUPB Source: https://hg.python.org/benchmarks/
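Anyone reproducing these numbers may want to confirm the OS settings listed
above before each run. The sketch below assumes a Linux system exposing the
cpufreq sysfs files named above, with frequencies in kHz (so 2600000
corresponds to 2.6GHz). The function names and the `root` parameter are
illustrative only.

```python
from pathlib import Path


def read_int(path):
    """Read an integer from a sysfs/procfs file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check_benchmark_environment(target_khz=2600000, root="/"):
    """Return a list of settings that differ from the described setup."""
    root = Path(root)
    problems = []
    aslr = read_int(root / "proc/sys/kernel/randomize_va_space")
    if aslr != 0:
        problems.append("ASLR not disabled (randomize_va_space=%s)" % aslr)
    # cpu[0-9]* matches per-core directories such as cpu0, cpu1, ...
    for cpufreq in sorted(root.glob("sys/devices/system/cpu/cpu[0-9]*/cpufreq")):
        for name in ("scaling_min_freq", "scaling_max_freq"):
            value = read_int(cpufreq / name)
            if value != target_khz:
                problems.append("%s: %s=%s, expected %d"
                                % (cpufreq.parent.name, name, value, target_khz))
    return problems


if __name__ == "__main__":
    for problem in check_benchmark_environment():
        print(problem)
```

An empty list means the checked settings match the configuration described
here; it says nothing about the BIOS settings, which are not visible this way.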
Python2.7 results:
Python source: hg clone https://hg.python.org/cpython cpython
Python Source: hg update 2.7
hg id: 0511b1165bb6 (2.7)
hg id -r 'ancestors(.) and tag()': 15c95b7d81dc (2.7) v2.7.10
hg --debug id -i: 0511b1165bb6cf40ada0768a7efc7ba89316f6a5
Benchmarks Speedup(%)
simple_logging 20
raytrace 20
silent_logging 19
richards 19
chaos 16
formatted_logging 16
json_dump 15
hexiom2 13
pidigits 12
slowunpickle 12
django_v2 12
unpack_sequence 11
float 11
mako 11
slowpickle 11
fastpickle 11
django 11
go 10
json_dump_v2 10
pathlib 10
regex_compile 10
pybench 9.9
etree_process 9
regex_v8 8
bzr_startup 8
2to3 8
slowspitfire 8
telco 8
pickle_list 8
fannkuch 8
etree_iterparse 8
nqueens 8
mako_v2 8
etree_generate 8
call_method_slots 7
html5lib_warmup 7
html5lib 7
nbody 7
spectral_norm 7
spambayes 7
fastunpickle 6
meteor_contest 6
chameleon 6
rietveld 6
tornado_http 5
unpickle_list 5
pickle_dict 4
regex_effbot 3
normal_startup 3
startup_nosite 3
etree_parse 2
call_method_unknown 2
call_simple 1
json_load 1
call_method 1
Python3.6 results
Python source: hg clone https://hg.python.org/cpython cpython
hg id: 96d016f78726 tip
hg id -r 'ancestors(.) and tag()': 1a58b1227501 (3.5) v3.5.0rc1
hg --debug id -i: 96d016f78726afbf66d396f084b291ea43792af1
Benchmark Speedup(%)
fastunpickle 22.94
fastpickle 21.67
json_load 17.64
simple_logging 17.49
meteor_contest 16.67
formatted_logging 15.33
etree_process 14.61
raytrace 13.57
etree_generate 13.56
chaos 12.09
hexiom2 12
nbody 11.88
json_dump_v2 11.24
richards 11.02
nqueens 10.96
fannkuch 10.79
go 10.77
float 10.26
regex_compile 9.8
silent_logging 9.63
pidigits 9.58
etree_iterparse 9.48
2to3 8.44
regex_v8 8.09
regex_effbot 7.88
call_simple 7.63
tornado_http 7.38
etree_parse 4.92
spectral_norm 4.72
normal_startup 4.39
telco 3.88
startup_nosite 3.7
call_method 3.63
unpack_sequence 3.6
call_method_slots 2.91
call_method_unknown 2.59
iterative_count 0.45
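To compare result tables like the ones above, for example between a
regrtest-trained build and one trained on a custom workload, a small parser
can help. This is a sketch only: it assumes one "name value" pair per line as
in the listing above, and the sample deliberately contains just three rows
copied from the Python3.6 table, so its summary does not describe the full
results.

```python
import statistics

# Three rows copied from the Python3.6 table above, plus its header line.
SAMPLE = """\
Benchmark Speedup(%)
fastunpickle 22.94
iterative_count 0.45
threaded_count -2.79
"""


def parse_speedups(text):
    """Map benchmark name to speedup percentage, skipping non-data lines."""
    results = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            results[name] = float(value)
        except ValueError:
            continue  # header or other non-numeric line
    return results


def summarize(results):
    """Return simple summary statistics for a parsed table."""
    values = list(results.values())
    return {
        "count": len(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "regressions": sorted(n for n, v in results.items() if v < 0),
    }


if __name__ == "__main__":
    print(summarize(parse_speedups(SAMPLE)))
```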
threaded_count -2.79
Thank you,
Alecsandru
_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/guido%40python.org
--
--Guido van Rossum (python.org/~guido)