<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@SimSun";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">I believe we evaluated clang vs. gcc before (a long time ago), and gcc won at that time.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">PGO might have overshadowed the impact of computed gotos, so the latter may no longer be needed.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">When the performance difference is as large as 50%, there are various options for nailing down the root cause, including bytecode analysis. However, once it
comes down to 3.6 sec vs. 3.4 sec, a delta of ~5%, the cause can be hard to find. Internally we use sampling-based performance tools for micro-architecture-level analysis. Alternatively, the generic open-source Linux tool “perf” works very well. You could also do
a disassembly analysis/comparison of the object files, such as the main loop in ceval.o, looking at the efficiency of the generated code (which gives generic information about Python 2 and 3, but may not tell you the run-time behavior with respect to your specific
app, pentomino.py).<o:p></o:p></span></p>
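<p class="MsoNormal">As a rough illustration of the bytecode-analysis option mentioned above, the sketch below counts how often each opcode appears in a piece of compiled source. It assumes Python 3.4 or later (<code>dis.get_instructions</code> does not exist on 2.7, where one would compare <code>dis.dis</code> output by hand). The <code>opcode_histogram</code> helper and the <code>SAMPLE</code> source are hypothetical stand-ins, not part of pentomino.py.</p>

```python
import collections
import dis
import types


def opcode_histogram(source, filename="<generated>"):
    """Count opcode occurrences in compiled source, including nested code objects."""
    counts = collections.Counter()
    stack = [compile(source, filename, "exec")]
    while stack:
        code = stack.pop()
        for instr in dis.get_instructions(code):
            counts[instr.opname] += 1
        # Function bodies, lambdas, and comprehensions are stored in co_consts.
        stack.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return counts


# Hypothetical stand-in for generated solver code: nested "if" statements.
SAMPLE = """
def solve(board):
    if board[0] == 0:
        if board[1] == 0:
            return 1
    return 0
"""

if __name__ == "__main__":
    for opname, count in opcode_histogram(SAMPLE).most_common(5):
        print(opname, count)
```

<p class="MsoNormal">Running the same script under two Python 3 versions and comparing the histograms shows whether the compiler emits different instruction mixes for the same source. It says nothing about how fast each instruction executes.</p>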
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Hope that helps.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Peter<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><a name="_MailEndCompose"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></a></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Ben Hoyt [mailto:benhoyt@gmail.com]
<br>
<b>Sent:</b> Monday, July 24, 2017 12:35 PM<br>
<b>To:</b> Wang, Peter Xihong <peter.xihong.wang@intel.com><br>
<b>Cc:</b> Nathaniel Smith <njs@pobox.com>; Python-Dev <python-dev@python.org><br>
<b>Subject:</b> Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference?<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">Thanks for testing.<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Oddly, I just tested it on Linux (Ubuntu) and got the same results as you -- Python 2.7.13 outperforms 3 (3.5.3 in my case) by a few percent. And even under a VirtualBox VM it takes 3.4 and 3.6 seconds, compared to ~5s on the host macOS
operating system. Very odd. I guess that means VirtualBox is very good, and that clang/LLVM is not as good at optimizing the Python VM as gcc is.<o:p></o:p></p>
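<p class="MsoNormal">For differences this small, a repeatable timing harness helps. The sketch below is only an illustration: it times a command several times, keeps the fastest run, and computes the relative delta. The helper names (<code>best_time</code>, <code>relative_delta</code>) and the command-line layout are assumptions, and it requires Python 3.5+ to run the harness itself.</p>

```python
import subprocess
import sys
import time


def best_time(cmd, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` runs of cmd."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


def relative_delta(slower, faster):
    """Fractional slowdown of `slower` relative to `faster`."""
    return (slower - faster) / faster


if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("usage: compare.py SCRIPT PYTHON_A PYTHON_B")
    else:
        script, python_a, python_b = sys.argv[1:]
        time_a = best_time([python_a, script])
        time_b = best_time([python_b, script])
        slower, faster = max(time_a, time_b), min(time_a, time_b)
        print("A: %.2fs  B: %.2fs  delta: %.1f%%"
              % (time_a, time_b, 100 * relative_delta(slower, faster)))
```

<p class="MsoNormal">With the numbers above, <code>relative_delta(3.6, 3.4)</code> is about 0.059, i.e. a gap of roughly 6%. Taking the minimum of several runs reduces noise from other processes but does not remove it.</p>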
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">I can't find anything majorly different about my macOS Python 2 and 3 builds. Both look like they have PGO turned on (from sysconfig.get_config_vars()). Both have HAVE_COMPUTED_GOTOS=1 but USE_COMPUTED_GOTOS=0 for some reason. My Python
2 version is the macOS system version (/usr/local/bin/python2), whereas my Python 3 version is from "brew install", so that's probably the difference, though that still doesn't explain exactly why.<o:p></o:p></p>
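<p class="MsoNormal">A minimal sketch of how those build settings could be dumped for side-by-side comparison, run once under each interpreter. The set of keys is an assumption: any of them may be absent (reported as <code>None</code>) on a given platform. The PGO/LTO checks are heuristics that only look for configure flags, so a build optimized some other way (for example via <code>make profile-opt</code>) will not be detected.</p>

```python
import sys
import sysconfig

# Build variables to compare; keys missing on this platform come back as None.
KEYS = ("CC", "CONFIG_ARGS", "PY_CFLAGS",
        "HAVE_COMPUTED_GOTOS", "USE_COMPUTED_GOTOS")


def build_summary():
    """Collect selected build variables plus heuristic PGO/LTO indicators."""
    summary = {key: sysconfig.get_config_var(key) for key in KEYS}
    config_args = summary.get("CONFIG_ARGS") or ""
    # Heuristic only: these flags show what was passed to ./configure.
    summary["pgo_requested"] = "--enable-optimizations" in config_args
    summary["lto_requested"] = "--with-lto" in config_args
    return summary


if __name__ == "__main__":
    print(sys.version)
    for key, value in sorted(build_summary().items()):
        print("%s = %r" % (key, value))
```

<p class="MsoNormal">Diffing the output from the two interpreters makes compiler and flag differences easier to spot than reading the full get_config_vars() dictionary.</p>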
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">-Ben<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">On Mon, Jul 24, 2017 at 1:49 PM, Wang, Peter Xihong <<a href="mailto:peter.xihong.wang@intel.com" target="_blank">peter.xihong.wang@intel.com</a>> wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal">Hi Ben,<br>
<br>
Out of curiosity, as a quick experiment I ran your pentomino.py with a 2.7.12 PGO+LTO build (the Ubuntu 16.04.2 LTS default at /usr/bin/python) and compared it with 3.7.0 alpha1 PGO+LTO (which I built a while ago) on my Skylake-based desktop;
2.7 outperforms 3.7 by 3.5%.<br>
On your 2.5 GHz i7 system, I'd recommend making sure the two Python binaries you are comparing are on an equal footing (compiled with the same optimizations, PGO+LTO).<br>
<br>
Thanks,<br>
<br>
Peter<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal"><br>
<br>
<br>
-----Original Message-----<br>
From: Python-Dev [mailto:<a href="mailto:python-dev-bounces%2Bpeter.xihong.wang">python-dev-bounces+peter.xihong.wang</a>=<a href="mailto:intel.com@python.org">intel.com@python.org</a>] On Behalf Of Nathaniel Smith<br>
Sent: Tuesday, July 18, 2017 7:00 PM<br>
To: Ben Hoyt <<a href="mailto:benhoyt@gmail.com">benhoyt@gmail.com</a>><br>
Cc: Python-Dev <<a href="mailto:python-dev@python.org">python-dev@python.org</a>><br>
Subject: Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference?<br>
<br>
I'd probably start with a regular C-level profiler, like perf or callgrind. They're not very useful for comparing two versions of code written in Python, but here the Python code is the same (modulo changes in the stdlib), and it's changes in the interpreter's
C code that probably make the difference.<br>
<br>
On Tue, Jul 18, 2017 at 9:03 AM, Ben Hoyt <<a href="mailto:benhoyt@gmail.com">benhoyt@gmail.com</a>> wrote:<br>
> Hi folks,<br>
><br>
> (Not entirely sure this is the right place for this question, but<br>
> hopefully it's of interest to several folks.)<br>
><br>
> A few days ago I posted a note in response to Victor Stinner's<br>
> articles on his CPython contributions, noting that I wrote a program<br>
> that ran in 11.7 seconds on Python 2.7, but only takes 5.1 seconds on<br>
> Python 3.5 (on my 2.5 GHz macOS i7), more than 2x as fast. Obviously<br>
> this is a Good Thing, but I'm curious as to why there's so much difference.<br>
><br>
> The program is a pentomino puzzle solver, and it works via code<br>
> generation, generating a ton of nested "if" statements, so I believe<br>
> it's exercising the Python bytecode interpreter heavily. Obviously<br>
> there have been some big optimizations to make this happen, but I'm<br>
> curious what the main improvements are that are causing this much difference.<br>
><br>
> There's a writeup about my program here, with benchmarks at the bottom:<br>
> <a href="http://benhoyt.com/writings/python-pentomino/" target="_blank">http://benhoyt.com/writings/python-pentomino/</a><br>
><br>
> This is the generated Python code that's being exercised:<br>
> <a href="https://github.com/benhoyt/python-pentomino/blob/master/generated_solve.py" target="_blank">https://github.com/benhoyt/python-pentomino/blob/master/generated_solve.py</a><br>
><br>
> For reference, on Python 3.6 it runs in 4.6 seconds (same on Python<br>
> 3.7 alpha). This smallish increase from Python 3.5 to Python 3.6 was<br>
> more expected to me due to the bytecode changing to wordcode in 3.6.<br>
><br>
> I tried using cProfile on both Python versions, but that didn't say<br>
> much, because the functions being called aren't taking the majority of the time.<br>
> How does one benchmark at a lower level, or otherwise explain what's<br>
> going on here?<br>
><br>
> Thanks,<br>
> Ben<br>
><br>
> _______________________________________________<br>
> Python-Dev mailing list<br>
> <a href="mailto:Python-Dev@python.org">Python-Dev@python.org</a><br>
> <a href="https://mail.python.org/mailman/listinfo/python-dev" target="_blank">https://mail.python.org/mailman/listinfo/python-dev</a><br>
> Unsubscribe:<br>
> <a href="https://mail.python.org/mailman/options/python-dev/njs%40pobox.com" target="_blank">
https://mail.python.org/mailman/options/python-dev/njs%40pobox.com</a><br>
><br>
<br>
<br>
<br>
--<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal">Nathaniel J. Smith -- <a href="https://vorpus.org" target="_blank">https://vorpus.org</a><o:p></o:p></p>
</blockquote>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</body>
</html>