[Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference?

Wang, Peter Xihong peter.xihong.wang at intel.com
Mon Jul 24 13:49:44 EDT 2017


Hi Ben,

Out of curiosity with a quick experiment, I ran your pentomino.py with 2.7.12 PGO+LTO build (Ubuntu OS 16.04.2 LTS default at /usr/bin/python), and compared with 3.7.0 alpha1 PGO+LTO (which I built a while ago), on my SkyLake processor based desktop, and 2.7 outperforms 3.7 by 3.5%.
On your 2.5 GHz i7 system, I'd recommend making sure the 2 Python binaries you are comparing are in equal footings (compiled with same optimization PGO+LTO).

Thanks,

Peter



-----Original Message-----
From: Python-Dev [mailto:python-dev-bounces+peter.xihong.wang=intel.com at python.org] On Behalf Of Nathaniel Smith
Sent: Tuesday, July 18, 2017 7:00 PM
To: Ben Hoyt <benhoyt at gmail.com>
Cc: Python-Dev <python-dev at python.org>
Subject: Re: [Python-Dev] Program runs in 12s on Python 2.7, but 5s on Python 3.5 -- why so much difference?

I'd probably start with a regular C-level profiler, like perf or callgrind. They're not very useful for comparing two versions of code written in Python, but here the Python code is the same (modulo changes in the stdlib), and it's changes in the interpreter's C code that probably make the difference.

On Tue, Jul 18, 2017 at 9:03 AM, Ben Hoyt <benhoyt at gmail.com> wrote:
> Hi folks,
>
> (Not entirely sure this is the right place for this question, but 
> hopefully it's of interest to several folks.)
>
> A few days ago I posted a note in response to Victor Stinner's 
> articles on his CPython contributions, noting that I wrote a program 
> that ran in 11.7 seconds on Python 2.7, but only takes 5.1 seconds on 
> Python 3.5 (on my 2.5 GHz macOS i7), more than 2x as fast. Obviously 
> this is a Good Thing, but I'm curious as to why there's so much difference.
>
> The program is a pentomino puzzle solver, and it works via code 
> generation, generating a ton of nested "if" statements, so I believe 
> it's exercising the Python bytecode interpreter heavily. Obviously 
> there have been some big optimizations to make this happen, but I'm 
> curious what the main improvements are that are causing this much difference.
>
> There's a writeup about my program here, with benchmarks at the bottom:
> http://benhoyt.com/writings/python-pentomino/
>
> This is the generated Python code that's being exercised:
> https://github.com/benhoyt/python-pentomino/blob/master/generated_solv
> e.py
>
> For reference, on Python 3.6 it runs in 4.6 seconds (same on Python 
> 3.7 alpha). This smallish increase from Python 3.5 to Python 3.6 was 
> more expected to me due to the bytecode changing to wordcode in 3.6.
>
> I tried using cProfile on both Python versions, but that didn't say 
> much, because the functions being called aren't taking the majority of the time.
> How does one benchmark at a lower level, or otherwise explain what's 
> going on here?
>
> Thanks,
> Ben
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/njs%40pobox.com
>



--
Nathaniel J. Smith -- https://vorpus.org _______________________________________________
Python-Dev mailing list
Python-Dev at python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/peter.xihong.wang%40intel.com


More information about the Python-Dev mailing list